Front Panel Components

No.  Description                         Links

1    Control panel                       “Detecting and Managing Faults” on page 25
                                         “Preparing for Service” on page 43
                                         “Returning the Server to Operation” on page 191

2    Processor modules (slots 0 and 1)   “Processor Module Components” on page 18
     or processor filler module          “Servicing Processor Modules” on page 55
     (slot 1 only)

3    Main module                         “Main Module Components” on page 20
                                         “Servicing the Main Module” on page 97

4    Power supplies (4)                  “Servicing Power Supplies” on page 135

SPARC T8-4 Server Service Manual • January 2022

Summary of Contents for SPARC T8-4

Page 1: ...SPARC T8-4 Server Service Manual Part No: E80512-05 January 2022 ...

Page 3: ...ay create a risk of personal injury If you use this software or hardware in dangerous applications then you shall be responsible to take all appropriate fail safe backup redundancy and other measures to ensure its safe use Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications Oracle and Java are registered ...

Page 4: ... the U S Government Ce logiciel ou matériel a été développé pour un usage général dans le cadre d applications de gestion des informations Ce logiciel ou matériel n est pas conçu ni n est destiné à être utilisé dans des applications à risque notamment dans des applications pouvant causer un risque de dommages corporels Si vous utilisez ce logiciel ou matériel dans le cadre d applications dangereus...

Page 5: ...d Storage and Backup Devices 21 Component Service Task Reference 22 Detecting and Managing Faults 25 Understanding Diagnostics 25 PSH Overview 25 Diagnostics Process 26 Checking for Faults 27 Interpreting LEDs 27 Log In to Oracle ILOM Service 32 Check for Faults 33 Interpreting Log Files and System Messages 35 Check the Message Buffer 35 View Log Files Oracle Solaris 36 View Log Files Oracle ILOM ...

Page 6: ...Off the Server 51 Power Off the Server Oracle ILOM 51 Power Off the Server Power Button Graceful Shutdown 52 Power Off the Server Power Button Emergency Shutdown 52 Disconnect Power Cords 53 Attachment of Devices During Service 54 Servicing Processor Modules 55 Server Upgrade Process 56 Processor Module Configuration 57 Processor Module LEDs 58 Determine Which Processor Module Is Faulty 59 Remove ...

Page 7: ...ve Configuration 87 Hard Drive LEDs 89 Determine Which Hard Drive Is Faulty 90 Remove a Hard Drive 90 Install a Hard Drive 93 Verify a Hard Drive 94 Servicing the Main Module 97 Main Module LEDs 98 Determine if the Main Module Is Faulty 99 Remove the Main Module 99 Install the Main Module 102 Verify the Main Module 105 Servicing NVMe Switch Cards 107 Disconnect the NVMe Cables 108 Remove a NVMe Sw...

Page 8: ...er Supply Is Faulty 138 Remove a Power Supply 139 Install a Power Supply 142 Verify a Power Supply 144 Servicing Fan Modules 145 Fan Module Configuration 146 Fan Module LED 147 Determine Which Fan Module Is Faulty 147 Remove a Fan Module 148 Install a Fan Module 151 Verify a Fan Module 152 Servicing PCIe Cards 155 Understanding PCIe Root Complex Connections 155 PCIe Card Configuration 158 PCIe Car...

Page 9: ...ear I O Module 176 Install the Rear I O Module 178 Verify the Rear I O Module 181 Servicing the Rear Chassis Subassembly 183 Rear Chassis Subassembly Components 183 Remove the Rear Chassis Subassembly 184 Install the Rear Chassis Subassembly 187 Verify the Rear Chassis Subassembly 188 Returning the Server to Operation 191 Connect Power Cords 191 Power On the Server Oracle ILOM 192 Glossary 193 Ind...

Page 11: ...technicians and authorized service personnel who have been instructed on the hazards within the equipment and are qualified to remove and replace hardware Product Documentation Library Documentation and resources for this product and related products are available at http://www.oracle.com/goto/t8-4/docs Feedback Provide feedback about this documentation at http://www.oracle.com/goto/docfeedback Using ...

Page 13: ... Rear Panel Components on page 15 Chassis Subassembly Components on page 17 Processor Module Components on page 18 Main Module Components on page 20 Supported Storage and Backup Devices on page 21 Component Service Task Reference on page 22 System Schematic on page 40 Related Information Detecting and Managing Faults Preparing for Service Returning the Server to Operation Identifying Components 13...

Page 14: ...Server to Operation on page 191 2 Processor modules slots 0 and 1 or processor filler module slot 1 only Processor Module Components on page 18 Servicing Processor Modules on page 55 3 Main module Main Module Components on page 20 Servicing the Main Module on page 97 4 Power supplies 4 Servicing Power Supplies on page 135 14 SPARC T8 4 Server Service Manual January 2022 ...

Page 15: ... 17 Processor Module Components on page 18 Main Module Components on page 20 Supported Storage and Backup Devices on page 21 Component Service Task Reference on page 22 System Schematic on page 40 Rear Panel Components No Description Links 1 Fan modules 5 Servicing Fan Modules on page 145 Identifying Components 15 ...

Page 16: ...ponents from the rear of the server No Description Links 1 Chassis 2 Midplane assembly Servicing the Rear Chassis Subassembly on page 183 3 Rear chassis subassembly Servicing the Rear Chassis Subassembly on page 183 Related Information Front Panel Components on page 14 Chassis Subassembly Components on page 17 Processor Module Components on page 18 Main Module Components on page 20 Supported Stora...

Page 17: ...n page 29 4 Processor modules 2 Servicing Processor Modules on page 55 5 Chassis 6 Rear chassis subassembly RCSA Servicing the Rear Chassis Subassembly on page 183 7 Fan modules 5 Servicing Fan Modules on page 145 8 PCIe carriers 16 Servicing PCIe Cards on page 155 9 Rear I O module Servicing the Rear I O Module on page 173 10 Power supplies 4 Servicing Power Supplies on page 135 11 Hard drives 8 ...

Page 18: ...20 Supported Storage and Backup Devices on page 21 Component Service Task Reference on page 22 System Schematic on page 40 Processor Module Components These components are accessible within the processor module when you remove the processor module from the front of the server Note The processor modules are located beneath the heat sinks 18 SPARC T8 4 Server Service Manual January 2022 ...

Page 19: ...ription Link 1 DIMMs Servicing DIMMs on page 69 Related Information Front Panel Components on page 14 Rear Panel Components on page 15 Chassis Subassembly Components on page 17 Main Module Components on page 20 Identifying Components 19 ...

Page 20: ...e Task Reference on page 22 System Schematic on page 40 Main Module Components These components are accessible after you remove the main module from the front of the server No Description Links 1 Hard drives Servicing Hard Drives on page 87 20 SPARC T8 4 Server Service Manual January 2022 ...

Page 21: ...ont Panel Components on page 14 Rear Panel Components on page 15 Chassis Subassembly Components on page 17 Processor Module Components on page 18 Supported Storage and Backup Devices on page 21 Component Service Task Reference on page 22 System Schematic on page 40 Supported Storage and Backup Devices The server supports the following storage devices Fibre channel arrays SATA FC flash and SAS 3 SA...

Page 22: ... BOBxx CHx DIMM System Memory DIMMs DIMM_x Servicing DIMMs on page 69 Main module 1 SYS MB None Servicing the Main Module on page 97 Disk backplane 1 SYS DBP SAS_BACKPLANE Servicing the Main Module on page 97 Hard drive 8 SYS DBP HDDx System Storage Disks Disks_x Servicing Hard Drives on page 87 NVMe switch card optional 2 SYS MB PCIEx PCIESW NVMECARD Servicing NVMe Switch Cards on page 107 NVMe d...

Page 23: ...r IO module 1 SYS RIO System Networking Ethernet_NICs Servicing the Rear I O Module on page 173 Rear chassis subassembly RCSA 1 SYS RCSA None Servicing the Rear Chassis Subassembly on page 183 Related Information Front Panel Components on page 14 Rear Panel Components on page 15 Chassis Subassembly Components on page 17 Processor Module Components on page 18 Main Module Components on page 20 Suppo...

Page 25: ...age 37 Clear a Fault Manually on page 39 Related Information Identifying Components on page 13 Component Service Categories on page 46 Preparing for Service on page 43 Returning the Server to Operation on page 191 Understanding Diagnostics These topics explain the diagnostic process and tools PSH Overview on page 25 Diagnostics Process on page 26 PSH Overview The PSH feature provides problem diagn...

Page 26: ...rmation Diagnostics Process on page 26 Checking for Faults on page 27 Diagnostics Process This table describes the diagnostics process Step Diagnostic Action Possible Outcome Links 1 Check the server for detected faults using these tools System LEDs on the front and rear panels fmadm faulty command from the Oracle Solaris prompt or through the Oracle ILOM fault management shell Determine the faulty...

Page 27: ...hods to check for faults Interpreting LEDs on page 27 Log In to Oracle ILOM Service on page 32 Check for Faults on page 33 Interpreting LEDs Use these steps to determine if an LED indicates that a component has failed in the server Steps Description Links 1 Check the LEDs on the front and rear of the server Front Panel Controls and LEDs on page 29 Rear Panel Controls and LEDs on page 30 2 Check th...

Page 28: ...Ms on page 74 Determine Which Hard Drive Is Faulty on page 90 Determine Which Power Supply Is Faulty on page 138 Determine Which Fan Module Is Faulty on page 147 Determine Which PCIe Card Is Faulty on page 160 Determine if the Rear I O Module Is Faulty on page 176 Related Information Front Panel Controls and LEDs on page 29 Rear Panel Controls and LEDs on page 30 28 SPARC T8 4 Server Service Manua...

Page 29: ...madm faulty command provides details about any faults that cause this indicator to light See Check for Faults on page 33 Under some fault conditions individual component fault LEDs are lit in addition to the Server Service Required LED 3 Power OK LED green Indicates these conditions Off Server is not running in its normal state Server power might be off The SPM might be running Steady on Server is...

Page 30: ...at a temperature failure event has been acknowledged and a service action is required 6 Fan Module Fault LED amber Rear FM Indicates these conditions Off Indicates a steady state no service action is required Steady on Indicates that a fan module failure event has been acknowledged and a service action is required on at least one of the fan modules 7 PCIe Card Fault LED amber Rear PCIe Indicates t...

Page 31: ... Off No power is applied to the server Green Power is applied to the server 7 Locator LED and button white Turn on the Locator LED by pressing the Locator button or see Locate the Server on page 49 When lit the LED blinks rapidly 8 System Fault Service Action Required LED amber The fmadm faulty command provides details about any faults that cause this indicator to light See Check for Faults on pag...

Page 32: ...enable first time login and access to Oracle ILOM a default Administrator account and its password are provided with the system To build a secure environment you must change the default password changeme for the default Administrator account root after your initial login to Oracle ILOM If this default Administrator account has since been changed contact your system administrator for an Oracle ILOM...

Page 33: ...e ILOM fault management shell start SP faultmgmt shell Are you sure you want to start SP faultmgmt shell y n y faultmgmtsp fmadm faulty Time UUID msgid Severity 2014 08 27 19 46 26 4ec16c8d 5cdb c6ca c949 e24d3637ef27 PCIEX 8000 8R Major Problem Status solved Diag Engine unknown System Manufacturer Oracle Corporation Name SPARC T8 4 Part_Number 12345678 11 1 Serial_Number 1238BDC0DF Suspect 1 of 1...

Page 34: ... diagnosis faultmgmtsp In this example a fault is displayed that includes these details Date and time of the fault 2014 08 27 19 46 26 UUID 4ec16c8d 5cdb c6ca c949 e24d3637ef27 which is unique to each fault Message identifier PCIEX 8000 8R which can be used to obtain additional fault information from Knowledge Base articles 3 Consider your next step If you are checking for faults while adding a sec...

Page 35: ...ing If PSH does not indicate the source of a fault check the message buffer and log files for notifications for faults Drive faults are usually captured by the Oracle Solaris message files These topics explain how to view the log files and system messages Check the Message Buffer on page 35 View Log Files Oracle Solaris on page 36 View Log Files Oracle ILOM on page 36 Check the Message Buffer The ...

Page 36: ...reated The original contents of the messages file are rotated to a file named messages 1 Over a period of time the messages are further rotated to messages 2 and messages 3 and then deleted 1 Log in as superuser 2 Type more var adm messages 3 To view all logged messages type more var adm messages Related Information Check the Message Buffer on page 35 View Log Files Oracle ILOM on page 36 View Log...
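The rotation scheme in the page 36 excerpt (a live messages file plus numbered older copies) can be sketched in Python. This is only an illustration, not part of the manual: the function names are hypothetical, and the dotted naming (messages.1, messages.2, messages.3) is an assumption, because the excerpt above shows the names with punctuation stripped.

```python
from pathlib import Path


def rotated_log_paths(base: Path, generations: int = 3) -> list[Path]:
    """Return the live log followed by any rotated copies that exist.

    Assumes rotated copies are named <base>.1 .. <base>.<generations>,
    which is an illustrative guess at the naming in the excerpt.
    """
    candidates = [base] + [
        base.with_name(f"{base.name}.{n}") for n in range(1, generations + 1)
    ]
    return [path for path in candidates if path.exists()]


def read_all_messages(base: Path, generations: int = 3) -> list[str]:
    """Collect log lines oldest-first: highest-numbered copy first, live file last."""
    lines: list[str] = []
    for path in reversed(rotated_log_paths(base, generations)):
        lines.extend(path.read_text(errors="replace").splitlines())
    return lines
```

On a server this might be called as read_all_messages(Path("/var/adm/messages")), assuming the log lives at that path and the caller has permission to read it.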

Page 37: ...POST displays These properties are described in Configure POST on page 37 If POST detects a faulty component the component is disabled automatically If the server is able to run without the disabled component the server boots when POST completes its tests For example if POST detects a faulty processor core the core is disabled POST completes its test sequence and the server boots using the remaini...

Page 38: ...equired for set r error_verbosity Diag verbosity when running after an error reset error_verbosity Possible values none min normal max error_verbosity User role required for set r hw_change_level Diag level when running after a hw change hw_change_level Possible values off min max hw_change_level User role required for set r hw_change_verbosity Diag verbosity when running after a hw change hw_chan...

Page 39: ...hange_verbosity normal level min mode normal power_on_level max power_on_verbosity normal trigger hw_change error reset verbosity normal Commands cd set show Related Information POST Overview on page 37 Clear a Fault Manually When PSH detects faults the faults are logged and displayed on the console In most cases after the fault is repaired the corrected state is detected by the server and the fau...

Page 40: ...mmand faultmgmtsp fmadm acquit UUID 4 Verify the fault is cleared Run the show disabled command to see if any components are still listed as faulty If there are disabled components repair the faults manually and continue to the next step to reset the server faultmgmtsp show disabled 5 If required reset the server faultmgmtsp exit reset System Are you sure you want to reset System y Resetting Syste...

Page 41: ...Rear Panel Components on page 15 Chassis Subassembly Components on page 17 Processor Module Components on page 18 Main Module Components on page 20 Supported Storage and Backup Devices on page 21 Component Service Task Reference on page 22 Detecting and Managing Faults 41 ...

Page 43: ...r on page 50 8 Gain access to service components Chassis Subassembly Components on page 17 Safety Information For your protection observe the following safety precautions when setting up your equipment Follow all cautions and instructions marked on the equipment and described in the documentation shipped with your server Follow all cautions and instructions marked on the equipment and described in...

Page 44: ...rds and hard drives contain electronic components that are extremely sensitive to static electricity Ordinary amounts of static electricity from clothing or the work environment can destroy the components located on these boards Do not touch the components along their connector edges Caution You must disconnect all power supplies before servicing any of the components that are inside the chassis A...

Page 45: ...n page 49 Removing Power From the Server on page 50 Tools Needed for Service You will need the following tools for most service operations Antistatic wrist strap Antistatic mat No 1 Phillips screwdriver No 2 Phillips screwdriver No 1 flat blade screwdriver battery removal Related Information Safety Information on page 43 Tools Needed for Service on page 45 Component Fillers on page 46 Component Se...

Page 46: ... Safety Information on page 43 Tools Needed for Service on page 45 Component Service Categories on page 46 Find the Server Serial Number on page 47 Locate the Server on page 49 Prevent ESD Damage on page 49 Removing Power From the Server on page 50 Component Service Categories Replaceable components fall into these categories Hot serviceable by the customer Hot serviceable components can be remove...

Page 47: ...e 97 Power supply Off or On Servicing Power Supplies on page 135 Fan module Off or On Servicing Fan Modules on page 145 PCIe card Off or On Servicing PCIe Cards on page 155 Rear I O module Off X Servicing the Rear I O Module on page 173 Rear chassis subassembly Off X Servicing the Rear Chassis Subassembly on page 183 You must disconnect the power cords before accessing this component Related Inform...

Page 48: ...yswitch_state Normal product_name T5 4 product_part_number 602 1234 01 product_serial_number 0723BBC006 fault_state OK clear_fault_action none power_state On Commands cd reset set show start stop Related Information Safety Information on page 43 Tools Needed for Service on page 45 Component Fillers on page 46 Component Service Categories on page 46 Locate the Server on page 49 Prevent ESD Damage o...

Page 49: ...ype set SYS LOCATE value Off Related Information Safety Information on page 43 Tools Needed for Service on page 45 Component Fillers on page 46 Component Service Categories on page 46 Find the Server Serial Number on page 47 Prevent ESD Damage on page 49 Removing Power From the Server on page 50 Prevent ESD Damage Many components contained in the processor modules and main module can be damaged by...

Page 50: ...r Modules on page 55 Servicing DIMMs on page 69 Servicing the Main Module on page 97 Servicing the SPM on page 117 Servicing the SCC PROM on page 125 Servicing the Battery on page 129 Servicing PCIe Cards on page 155 Servicing the Rear I O Module on page 173 Servicing the Rear Chassis Subassembly on page 183 Removing Power From the Server These topics describe different methods for removing power ...

Page 51: ...er See Power Off the Server Oracle ILOM on page 51 Power Off the Server Power Button Graceful Shutdown on page 52 Power Off the Server Power Button Emergency Shutdown on page 52 Related Information Prepare to Power Off the Server on page 51 Disconnect Power Cords on page 53 Power Off the Server Oracle ILOM You can use the SPM to perform a graceful shutdown of the server This type of shutdown ensur...

Page 52: ...is procedure places the server in the power standby mode 1 Press and release the recessed Power button The Power OK LED blinks rapidly 2 If you are powering off the server in order to add a second processor module return to Server Upgrade Process on page 56 Related Information Power Off the Server Oracle ILOM on page 51 Power Off the Server Power Button Emergency Shutdown on page 52 Power Off the ...

Page 53: ...wer Off the Server Oracle ILOM on page 51 Power Off the Server Power Button Graceful Shutdown on page 52 Power Off the Server Power Button Emergency Shutdown on page 52 2 Disconnect all power cords from the server Caution Because standby power is always present in the system you must unplug the power cords before accessing certain components Related Information Safety Information on page 43 Tools ...

Page 54: ...e If you plan to connect to the Oracle ILOM software over the network connect an Ethernet cable to the Ethernet port labeled NET MGT Note The SP uses the NET MGT out of band port by default You can configure the SP to share one of the server's four Ethernet ports instead The SP uses only the configured Ethernet port If you plan to access the Oracle ILOM CLI through the management port connect a ser...

Page 55: ...Processor Module on page 67 Learn the process for upgrading the server from a single processor module configuration to a two processor module configuration Server Upgrade Process on page 56 Remove the processor module as part of another component s service operation Remove a Processor Module or Processor Filler Module on page 60 Install the processor module as part of another component s service o...

Page 56: ...ust match the size and capacity of the DIMMs already installed in the cpu node Understanding DIMM Configurations on page 69 4 Install the DIMMs Install a DIMM on page 81 5 Check the server for faults If any fault is present you must correct the fault and clear it from the server before you can continue with the upgrade Check for Faults on page 33 6 Shut down the server Removing Power From the Serv...

Page 57: ...ule or Processor Filler Module on page 60 Install a Processor Module or Processor Filler Module on page 63 Verify a Processor Module on page 67 Understanding PCIe Root Complex Connections on page 155 PCIe Card Configuration on page 158 Returning the Server to Operation on page 191 Processor Module Configuration Processor modules are accessed from the front of the server In Oracle ILOM the processo...

Page 58: ...les should have the same DIMMs configurations either all fully populated or all half populated See Understanding DIMM Configurations on page 69 Processor Module LEDs No LED Icon Description 1 Service Required amber Indicates that the processor module has experienced a fault condition 58 SPARC T8 4 Server Service Manual January 2022 ...

Page 59: ...age 67 Determine Which Processor Module Is Faulty The following LEDs are lit when a processor module fault is detected Front and rear System Fault Service Required LEDs Service Required LED on the faulty processor module 1 Determine if the Service Required LEDs are illuminated on the front panel or the rear I O module See Interpreting LEDs on page 27 2 From the front of the server check the proces...

Page 60: ...ect the power cords before servicing this component See Disconnect Power Cords on page 53 Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Prepare the server for service See Preparing for Service on page 43 2 Ensure that the server is powered off See Removing Power From the Server on pag...

Page 61: ...tion levers in toward the server and pull the extraction levers out to disengage the processor module or processor filler module from the server 6 Pull the processor module or processor filler module halfway out of the server and close the levers Servicing Processor Modules 61 ...

Page 62: ...r module or processor filler module and place the module on an antistatic mat Caution Do not touch the connectors at the rear of the module 8 Determine your next step If you are replacing or installing DIMMs within the processor module see Servicing DIMMs on page 69 If you are replacing a faulty processor module populate and install the replacement processor module 62 SPARC T8 4 Server Service Man...

Page 63: ...ion return to Preparing for Installation in SPARC T8 4 Server Installation Guide Related Information Processor Module Components on page 18 Processor Module LEDs on page 58 Server Upgrade Process on page 56 Determine Which Processor Module Is Faulty on page 59 Servicing DIMMs on page 69 Install a Processor Module or Processor Filler Module on page 63 Verify a Processor Module on page 67 Install a ...

Page 64: ...ine your next step If you are installing a processor module after replacing or installing DIMMs go to Step 3 If you are installing a new processor module to replace a faulty one install all of the DIMMs that you removed from the faulty processor module into the replacement module See Install a DIMM on page 81 3 Open the latches on the processor module or processor filler module and insert the modu...

Page 65: ...dule Note A processor filler module can only be installed in slot 1 4 Bring the levers together toward the center of the module and press the levers firmly against the module to fully seat the module back into the server Servicing Processor Modules 65 ...

Page 66: ... Operation on page 191 6 Verify the processor module functionality See Verify a Processor Module on page 67 7 If you are adding a second processor module to the server return to Server Upgrade Process on page 56 Related Information Processor Module Components on page 18 Server Upgrade Process on page 56 Processor Module LEDs on page 58 66 SPARC T8 4 Server Service Manual January 2022 ...

Page 67: ...rom the fmadm faulty command shows the replacement processor as disabled go to Detecting and Managing Faults on page 25 to clear the PSH detected fault from the server 2 Verify that the OK LED is lit on the processor module and that the Fault LED is not lit See Processor Module LEDs on page 58 3 Verify that the front and rear Service Required LEDs are not lit See Front Panel Controls and LEDs on p...

Page 68: ...n page 56 Related Information Processor Module Components on page 18 Processor Module LEDs on page 58 Determine Which Processor Module Is Faulty on page 59 Remove a Processor Module or Processor Filler Module on page 60 Install a Processor Module or Processor Filler Module on page 63 68 SPARC T8 4 Server Service Manual January 2022 ...

Page 69: ...rver components Description Links Understand how to replace DIMMs Understanding DIMM Configurations on page 69 Identifying DIMMs on page 71 Locate a faulty DIMM Determine Which DIMM Is Faulty PSH on page 74 Determine Which DIMM Is Faulty DIMM Fault LEDs on page 76 DIMM Configuration Errors on page 72 Replace a DIMM Remove a DIMM on page 78 Install a DIMM on page 81 Verify a DIMM on page 84 Underst...

Page 70: ...e The DIMM sparing feature is available only in fully populated servers All DIMMs associated with each CMx must be identical same size same rank classification Mixed configurations are supported DIMMs associated with CM0 with one size and DIMMs associated with CM1 with a different size as long as all DIMMs in the server have a supported rank classification For example 32 Gbyte 4Rx4 DIMMs associate...

Page 71: ...led in the server to verify that any replacement DIMMs are compatible or to confirm that upgrade DIMMs may be installed in a supported configuration As of System Firmware version 9 10 3 the following DIMM configurations are supported DIMM Capacity DRAM Density Rank Classification Label 16 Gbyte 4 Gbit Dual rank x4 2Rx4 32 Gbyte 4 Gbit Quad rank x4 4Rx4 32 Gbyte 8 Gbit Dual rank x4 2Rx4 64 Gbyte 8 ...

Page 72: ...d a message such as the following is displayed WARNING Running with a nonstandard DIMM configuration Refer to service document for details In other cases the configuration error is fatal and the following message is displayed Fatal configuration error forcing power down In addition to these general memory configuration errors one or more rule specific messages is displayed indicating the type of c...

Page 73: ... DIMM NAC names are based both on the location of the DIMM slot on the processor module and in which slot the processor module is installed For example the full NAC name for the DIMM installed in the front left corner on a processor module installed at PM0 is SYS PM0 CM1 CMP BOB00 CH0 DIMM Related Information Servicing Processor Modules on page 55 Servicing DIMMs 73 ...

Page 74: ...he server has a memory problem run the Oracle ILOM show faulty command This command lists memory faults and identifies the DIMM modules associated with the fault Related Information Determine Which DIMM Is Faulty PSH on page 74 Determine Which DIMM Is Faulty DIMM Fault LEDs on page 76 Determine Which DIMM Is Faulty PSH The Oracle Fault Management tool fmadm faulty displays current server faults in...

Page 75: ...07042208 M393B1K70DH0 YK0 Revision 04 Serial_Number 00CE0212153367DD4B Chassis Manufacturer Oracle Corporation Name SPARC T8 4 Part_Number 7021179 Serial_Number 1201CTHC01 Description Uncorrectable errors have occurred while accessing memory Response An attempt will be made to remove the affected memory from service Host HW may restart Impact Total system memory capacity has been reduced and some ...

Page 76: ...tic discharge This discharge can cause failure of server components 1 Consider your first steps Familiarize yourself with DIMM configuration rules See Understanding DIMM Configurations on page 69 Prepare the system for service See Preparing for Service on page 43 Remove the processor module containing the faulty DIMM Place the processor module on an ESD protect work surface Remove the processor mo...

Page 77: ...s illuminated An illuminated Memory Riser Power LED indicates that there is power available to illuminate any Memory DIMM Fault LEDs once you have pressed the DIMM Fault Remind button 4 Press the DIMM Fault Remind button on the processor module This will cause the Memory DIMM Fault LEDs associated with any faulty DIMMs to illuminate for a few minutes Servicing DIMMs 77 ...

Page 78: ...nents on page 18 Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Consider your first steps Familiarize yourself with DIMM population rules See Understanding DIMM Configurations on page 69 Prepare the system for service See Preparing for Service on page 43 Remove the processor module Pla...

Page 79: ...cover and slide the cover back and up off the main module 3 Locate the DIMMs that need to be replaced See Determine Which DIMM Is Faulty DIMM Fault LEDs on page 76 4 Push down on the ejector tabs on each side of the DIMM until the DIMM is released Servicing DIMMs 79 ...

Page 80: ...r next step If you are installing replacement DIMMs at this time go to Install a DIMM on page 81 If you are not installing replacement DIMMs at this time go to Step 9 9 Return the server to operation See Install the processor module See Install a Processor Module or Processor Filler Module on page 63 Power on the server See Power On the Server Oracle ILOM on page 192 Verify DIMM functionality See ...

Page 81: ...t are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Consider your first steps Familiarize yourself with DIMM population rules See Understanding DIMM Configurations on page 69 Prepare the system for service See Preparing for Service on page 43 Remove the processor module Place the processor module on an ESD protect work surface See Remove a Processor M...

Page 82: ...at the ejector tabs on the connector that will receive the DIMM are in the open position 5 Align the DIMM notch with the key in the connector Caution Ensure that the orientation is correct The DIMM might be damaged if the orientation is reversed 6 Push the DIMM into the connector until the ejector tabs lock the DIMM in place If the DIMM does not easily seat into the connector check the DIMM s orie...

Page 83: ...If you are replacing a processor module after installing replacement DIMMs proceed to Step 10 10 Finish the installation procedure See Install the processor module See Install a Processor Module or Processor Filler Module on page 63 Return the server to operation See Returning the Server to Operation on page 191 Verify DIMM functionality See Verify a DIMM on page 84 Related Information Understandi...

Page 84: ... those cases the fault is automatically cleared from the server If show faulty still displays the fault the set command clears it set SYS MB CM0 CMP MR0 BOB1 CH0 DIMM clear_fault_action true Are you sure you want to clear SYS MB CM0 CMP MR0 BOB1 CH0 DIMM y n y Set clear_fault_action to true 3 For a host detected fault perform the following steps to verify the new DIMM a Set the virtual keyswitch t...

Page 85: ...t the OpenBoot prompt ok go to the next step f If the server remains at the OpenBoot prompt type boot g Return the virtual keyswitch to Normal mode set /HOST keyswitch_state=Normal Set 'keyswitch_state' to 'Normal' h Switch to the system console and type fmadm faulty If any faults are reported refer to the diagnostics instructions described in Detecting and Managing Faults on page 25 4 Switch to the Or...

Page 86: ...h the UUID Use the same UUID that was displayed from the output of the Oracle ILOM show faulty command For example fmadm repair 3aa7c854-9667-e176-efe5-e487e520 Related Information Understanding DIMM Configurations on page 69 DIMM Configuration Errors on page 72 Determine Which DIMM Is Faulty DIMM Fault LEDs on page 76 Determine Which DIMM Is Faulty PSH...

Page 87: ...ver components These topics describe service procedures for the hard drives in the server Hard Drive Configuration on page 87 Hard Drive LEDs on page 89 Determine Which Hard Drive Is Faulty on page 90 Remove a Hard Drive on page 90 Install a Hard Drive on page 93 Verify a Hard Drive on page 94 Hard Drive Configuration You can install a mix of hard drives and sol...

Page 88: ...e offline prevents any applications from accessing it and removes logical software links to it You cannot hot service a drive in the following situations If the drive contains the operating system and the operating system is not mirrored on another drive If the drive cannot be logically isolated from the online operations of the server If either of these conditions apply to the drive being service...

Page 89: ...e operation 2 Service Required amber Indicates that the drive has experienced a fault condition 3 OK Activity green Indicates the drive s availability for use On Read or write activity is in progress Off Drive is idle and available for use Related Information Hard Drive Configuration on page 87 Determine Which Hard Drive Is Faulty on page 90 Remove a Hard Drive on page 90 Install a Hard Drive on p...

Page 90: ...mber Service Required LED is lit on the drive that needs to be replaced 3 Remove the faulty drive See Remove a Hard Drive on page 90 Related Information Hard Drive Configuration on page 87 Hard Drive LEDs on page 89 Remove a Hard Drive on page 90 Install a Hard Drive on page 93 Verify a Hard Drive on page 94 Remove a Hard Drive Hard drives are hot service components that can be replaced by custome...

Page 91: ... drives that are not configured cfgadm -al This command lists dynamically reconfigurable hardware resources and shows their operational status In this case look for the status of the drive you plan to remove This information is listed in the Occupant column Example Ap_id Type Receptacle Occupant Condition c2 scsi-sas connected configured unknown c2::w5000cca00a76d1f5,0 disk-path connected configured...

Page 92: ... LED on the drive is lit 4 Press the drive release button to unlock the drive 5 Pull on the latch to remove the drive from the server Caution The latch is not an ejector Do not force the latch too far to the right Doing so can damage the latch 6 Install the replacement drive or a filler tray ...

Page 93: ...re requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Align the replacement drive to the drive slot and slide the drive in until it is seated Drives are physically addressed according to the slot in which they are installed If you are replacing a drive install the replacement drive in the same slot as the drive ...

Page 94: ... was not the boot device boot the OS Depending on the nature of the replaced drive you might need to perform administrative tasks to reinstall software before the server can boot Refer to the Oracle Solaris OS administration documentation for more information 3 At the Oracle Solaris prompt type the cfgadm -al command to list all drives in the device tree including any drives that are not configured...

Page 95: ...w5000cca00a76d1f5,0 disk-path connected configured unknown c3 scsi-sas connected configured unknown c3::w5000cca00a772bd1,0 disk-path connected configured unknown c4 scsi-sas connected configured unknown c4::w5000cca00a59b0a9,0 disk-path connected configured unknown 7 Perform one of the following tasks based on your verification results If the previous steps did not verify the drive see Diagnostics ...
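The cfgadm check described above can be scripted. The following is a minimal Python sketch, not part of the manual: it assumes the five-column layout shown in the example (Ap_id, Type, Receptacle, Occupant, Condition), and the punctuation in the sample text (hyphens, `::`, commas) is an assumption about the usual cfgadm output. The helper names `parse_cfgadm` and `drive_is_configured` are hypothetical.

```python
# Hypothetical helpers for reading captured `cfgadm -al` output.
# Assumption: every data row has exactly five whitespace-separated fields.
FIELDS = ("ap_id", "type", "receptacle", "occupant", "condition")


def parse_cfgadm(output):
    """Return one dict per attachment point, skipping the header row."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        # Ignore blank lines, the header, and anything with an unexpected shape.
        if len(parts) != len(FIELDS) or parts[0].lower() == "ap_id":
            continue
        rows.append(dict(zip(FIELDS, parts)))
    return rows


def drive_is_configured(rows, ap_id):
    """True if the attachment point is listed as connected and configured."""
    for row in rows:
        if row["ap_id"] == ap_id:
            return (row["receptacle"] == "connected"
                    and row["occupant"] == "configured")
    return False  # not listed at all


# Sample text modeled on the example in this section (punctuation assumed).
SAMPLE = """\
Ap_id Type Receptacle Occupant Condition
c2 scsi-sas connected configured unknown
c2::w5000cca00a76d1f5,0 disk-path connected configured unknown
c3 scsi-sas connected configured unknown
c3::w5000cca00a772bd1,0 disk-path connected configured unknown
"""

if __name__ == "__main__":
    rows = parse_cfgadm(SAMPLE)
    print(drive_is_configured(rows, "c3::w5000cca00a772bd1,0"))
```

A drive that is missing from the listing is reported as not configured, which matches the manual's advice to fall back to the diagnostics process when verification fails.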

Page 96: ...Verify a Hard Drive Remove a Hard Drive on page 90 Install a Hard Drive on page 93 ...

Page 97: ...ermine if the main module is faulty Main Module LEDs on page 98 2 Prepare the server for service Preparing for Service on page 43 3 Remove the main module Remove the Main Module on page 99 4 Service main module components Servicing the Main Module on page 97 Servicing NVMe Switch Cards on page 107 Servicing the SPM on page 117 Servicing the SCC PROM on page 125 Servicing the Battery on page 129 5 ...

Page 98: ... Power OK LED green Indicates these conditions Off System is not running in its normal state System power might be off The SPM might be running Steady on System is powered on and is running in its normal operating state No service actions are required Fast blink System is running in standby mode and can be quickly returned to full function Slow blink A normal but transitory activity is taking plac...

Page 99: ... Service Required LED is lit when the server detects a main module fault Related Information Main Module LEDs on page 98 Remove the Main Module on page 99 Install the Main Module on page 102 Remove the Main Module 1 Optional If you are replacing a faulty main module you must back up ILOM configuration settings a Configure the SER MGT port to enable the configuration parameters to be uploaded Refer...

Page 100: ...Press the levers back toward the center of the main module This will keep the levers from being damaged when the main module is outside the server Caution Due to the weight of the main module the following step requires two people to perform Do not attempt to lift the main module alone 8 Remove the main module completely from the server 9 Consider your next steps If you have removed the main modul...

Page 101: ...de the cover back and up off the main module 10 Determine your next step If you are replacing a main module due to a faulty motherboard remove all of the internal components and transfer them to the new main module If you are replacing a component inside the main module use one of the following links Servicing the SPM on page 117 Servicing the Battery on page 129 ...

Page 102: ...s on page 20 Main Module LEDs on page 98 Install the Main Module on page 102 Install the Main Module 1 Place the cover back onto the main module and slide the cover forward until the latch clicks into place 2 Open the levers so that they are fully open ...

Page 103: ... Do not attempt to lift the main module alone 3 Insert the main module into its slot in the server until the levers begin to engage 4 Press the levers back together toward the center of the module then press the levers firmly against the module to fully seat the module back into the server ...

Page 104: ...dule with a new one connect a terminal or a terminal emulator PC or workstation to the SER MGT port The following message is delivered over the serial management port Unrecognized Chassis This module is installed in an unknown or unsupported chassis You must upgrade the firmware to a newer version that supports this chassis 7 Download the system firmware ...

Page 105: ...ntation http://www.oracle.com/goto/ilom/docs 8 Power on the server See Returning the Server to Operation on page 191 Related Information Main Module Components on page 20 Main Module LEDs on page 98 Remove the Main Module on page 99 Verify the Main Module 1 Verify that the main module Service Required LED is not lit See Main Module LEDs on page 98 2 Verify that the front and rear System Service Requ...

Page 106: ...ule Related Information Main Module LEDs on page 98 Determine if the Main Module Is Faulty on page 99 Remove the Main Module on page 99 Install the Main Module on page 102 ...

Page 107: ...nstalled in the main module If you are replacing a faulty main module you must remove the NVMe switch cards to transfer them to the new main module Part Description 1 NVMe Switch 2 SYS MB PCIE2 PCIESW 2 NVMe Switch 1 SYS MB PCIE1 PCIESW ...

Page 108: ...es on page 113 Verify a NVMe Switch Card on page 114 Disconnect the NVMe Cables 1 Remove the main module See Remove the Main Module on page 99 2 Determine your next step If you are replacing a faulty NVMe switch card unplug the NVMe cables from the card If you are moving the NVMe switch cards to a new main module unplug the cables from the backplane If you are replacing the NVMe cables unplug the ...

Page 109: ...he cable connectors so you can install them correctly 3 Remove the NVMe switch card See Remove a NVMe Switch Card on page 109 Remove a NVMe Switch Card 1 Identify which NVMe switch card you want to remove 2 Unlock the card ...

Page 110: ...om the card bracket 3 Push the card away from its connector on the motherboard and lift the card out of the main module Install a NVMe Switch Card 1 Align the NVMe switch card with its connector on the motherboard ...

Page 111: ...Install a NVMe Switch Card Note Insert the rear edge of the NVMe switch card into the corresponding tab on the motherboard 2 Insert the card into its connector ...

Page 112: ...stall a NVMe Switch Card The card is inserted laterally into the motherboard connector 3 Lock the card Rotate the retention lever toward the card bracket ...

Page 113: ...Connect the NVMe Cables 1 Plug the two NVMe data cables into the connectors on the NVMe switch card 2 Install the NVMe cable clamp ...

Page 114: ...n tuck the NVMe cable into the cable clamp 3 Install the main module See Install the Main Module on page 102 Verify a NVMe Switch Card 1 Use the Oracle ILOM fault management shell to determine if the replacement NVMe switch card is shown as enabled or disabled ...

Page 115: ...Verify that the front and rear Service Required LEDs are not lit See Front Panel Controls and LEDs on page 29 and Rear Panel Controls and LEDs on page 30 3 Consider these options If the previous steps did not clear the fault see Diagnostics Process on page 26 If Step 1 and Step 2 indicate that no faults have been detected then the NVMe switch card has been replaced successfully No further action i...

Page 117: ...all the Main Module on page 102 5 Verify the replacement SPM Verify the SPM on page 124 Determine if the SPM Is Faulty The following LEDs are illuminated when a SPM fault is detected System Service Required LEDs on the front panel and rear I O module Server SP LED on the main module or rear I O module 1 Determine if the Server Service Required LEDs are illuminated on the front panel or the rear I ...

Page 118: ...at are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Back up the SPM configuration information before removing the SPM At the Oracle ILOM prompt type cd SP config dump destination uri target where The acceptable values for uri are tftp ftp sftp scp http https target is the remote location where you want to store the configuration information For examp...

Page 119: ...essary ESD precautions See Prevent ESD Damage on page 49 3 Remove the main module from the server See Remove the Main Module on page 99 4 Locate the SPM on the main module See Main Module Components on page 20 ...

Page 120: ...disengage the SPM from the connectors on the motherboard 6 Lift the SPM up and away from the motherboard Related Information Determine if the SPM Is Faulty on page 117 Install the SPM on page 121 Verify the SPM on page 124 ...

Page 121: ... are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Take the necessary ESD precautions See Prevent ESD Damage on page 49 2 Lower the side of the SPM with the Align Tab sticker down on the service processor tab on the motherboard ...

Page 122: ...s this chassis If you see this message go on to Step 6 If you do not see this message go to Step 7 6 Download the system firmware a Configure the SER MGT port to enable the firmware image to be downloaded Refer to the Oracle ILOM documentation for network configuration instructions b Download the system firmware Follow the firmware download instructions in the Oracle ILOM documentation Note Y...

Page 123: ...zing TPM using the Oracle ILOM interface to enable failover see Securing Systems and Attached Devices in Oracle Solaris 11 3 b Restore the TPM data and keys that were backed up to the new SP you install For information about migrating or restoring TPM data and keys see Securing Systems and Attached Devices in Oracle Solaris 11 3 9 Verify the installation of the SPM See Verify the SPM on page 124 1...

Page 124: ...he main module or rear I O module is lit green See Main Module LEDs on page 98 or Rear I O Module LEDs on page 173 2 Verify that the front and rear Service Required LEDs are not lit See Interpreting LEDs on page 27 3 Consider these options If the previous steps did not clear the fault see Diagnostics Process on page 26 If the previous steps indicate that no faults have been detected then the SPM h...

Page 125: ...dule on page 99 2 Replace the SCC PROM Remove the SCC PROM on page 125 Install the SCC PROM on page 126 3 Install the main module Install the Main Module on page 102 4 Verify the SCC PROM Verify the SCC PROM on page 128 Remove the SCC PROM The SCC PROM is a cold service component that can be replaced only by authorized service personnel To identify and locate the SCC PROM see Processor Module Comp...

Page 126: ...e Main Module Components on page 20 4 Grasp the SCC PROM and lift it up to remove it from the main module Related Information Install the SCC PROM on page 126 Install the SCC PROM Before beginning this procedure ensure that you are familiar with the cautions and safety instructions described in Safety Information on page 43 ...

Page 127: ...CC PROM properly onto the main module 2 Press down on the SCC PROM until it is completely seated on the main module 3 Insert the main module back into the server See Install the Main Module on page 102 4 Return the server to operation See Returning the Server to Operation on page 191 Related Information Remove the SCC PROM on page 125 Verify the SCC PROM on page 128 ...

Page 128: ...itional verification run specific commands to display data stored in the SCC PROM Use the Oracle ILOM show command to display the MAC address show HOST macaddress HOST Properties macaddress Use Oracle Solaris OS commands to display the hostid and Ethernet address hostid 8534299c ifconfig a lo0 flags 2001000849 UP LOOPBACK RUNNING MULTICAST IPv4 VIRTUAL mtu 8232 index 1 inet 127 0 0 1 netmask ff000...

Page 129: ...y 1 Prepare the host for battery replacement To correctly reset the date and time before replacing a battery you must prevent the host from automatically powering on and disable any NTP connections a Check the HOST_AUTO_POWER_ON property show /SP/policy HOST_AUTO_POWER_ON Properties HOST_AUTO_POWER_ON enabled b If enabled set the HOST_AUTO_POWER_ON property to disabled set /SP/policy HOST_AUTO_POWER_...

Page 130: ...re a Prepare the server for service b Remove the main module from the server See Remove the Main Module on page 99 c Locate the battery in the main module See Main Module Components on page 20 d Remove the old battery Gently push the battery toward the side of the server to release it from the retention clip e Unpack and install the new battery ...

Page 131: ... Module on page 102 g Return the Server to Operation 3 Reset the system clock a Use the Oracle ILOM clock command to reset the system clock The following example sets the date to August 22 2016 and the timezone to EDT set /SP/clock datetime=081221302016 timezone=EDT Set datetime to 081221302016 set timezone to EDT show -d properties /SP/clock Properties datetime Mon Aug 22 13:20:16 2016 Servicing the...
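The value passed to the datetime property is a packed string, which is easy to mistype. The sketch below is an illustration only and assumes the property takes the form MMDDhhmmYYYY; confirm the format against the Oracle ILOM documentation before relying on it. The function names `ilom_datetime` and `decode_ilom_datetime` are hypothetical.

```python
from datetime import datetime

# Assumed layout of the packed value: month, day, hour, minute, year.
ILOM_FORMAT = "%m%d%H%M%Y"


def ilom_datetime(moment):
    """Format a datetime as an MMDDhhmmYYYY string (assumed ILOM layout)."""
    return moment.strftime(ILOM_FORMAT)


def decode_ilom_datetime(value):
    """Decode an MMDDhhmmYYYY string back into a datetime for checking."""
    return datetime.strptime(value, ILOM_FORMAT)


if __name__ == "__main__":
    # August 22 2016, 13:20, packed for use with the datetime property.
    print(ilom_datetime(datetime(2016, 8, 22, 13, 20)))  # prints 082213202016
```

Decoding a candidate string before applying it is a cheap way to confirm that the month, day, and time fields land where intended.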

Page 132: ...aults on page 25 Verify the Battery 1 Run show SYS MB V_BAT to check the status of the system battery In the output the SYS MB BAT status should be OK as in the following example show SYS MB BAT Targets Properties type Battery ipmi_name MB BAT class Threshold Sensor value 3 140 Volts upper_nonrecov_threshold N A upper_critical_threshold N A upper_noncritical_threshold N A lower_noncritical_thresho...

Page 133: ...Verify the Battery Related Information Replace the Battery on page 129 ...

Page 135: ...andle components that are sensitive to electrostatic discharge This discharge can cause failure of server components These topics describe service procedures for the power supplies in the server Power Supply Configuration on page 135 Power Supply and AC Power Connector LEDs on page 137 Determine Which Power Supply Is Faulty on page 138 Remove a Power Supply on page 139 Install a Power Supply on pa...

Page 136: ...Supply Configuration 1 Power supply 0 PS0 2 Power supply 1 PS1 3 Power supply 2 PS2 4 Power supply 3 PS3 Power cords are accessed from the rear of the server ...

Page 137: ... Information Power Supply and AC Power Connector LEDs on page 137 Determine Which Power Supply Is Faulty on page 138 Remove a Power Supply on page 139 Install a Power Supply on page 142 Verify a Power Supply on page 144 Power Supply and AC Power Connector LEDs Each power supply is provided with a set of three LEDs which are located at the front of the server ...

Page 138: ... power supply Each AC power connector has a single LED that is located on the rear I O module See Interpreting LEDs on page 27 Related Information Power Supply Configuration on page 135 Determine Which Power Supply Is Faulty on page 138 Remove a Power Supply on page 139 Install a Power Supply on page 142 Verify a Power Supply on page 144 Determine Which Power Supply Is Faulty The following LEDs ar...

Page 139: ...pply on page 139 Install a Power Supply on page 142 Verify a Power Supply on page 144 Remove a Power Supply The power supply is a hot service component that can be replaced by a customer Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Locate the power supply that you want to remove See ...

Page 140: ...e front of the server and on the power supply to be removed squeeze the release latches together then pull the extraction lever toward you to disengage the power supply from the server ...

Page 141: ...ly See Install a Power Supply on page 142 Related Information Power Supply Configuration on page 135 Power Supply and AC Power Connector LEDs on page 137 Determine Which Power Supply Is Faulty on page 138 Install a Power Supply on page 142 Verify a Power Supply on page 144 ...

Page 142: ...e components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Open the latch on the replacement power supply and align the power supply with the empty bay Verify that the power supply is oriented as shown in the following figure 2 Slide the power supply into the chassis ...

Page 143: ...st the power supply to fully seat the power supply in the server 4 Insert the power cord into the AC connector for the power supply that you just installed 5 Verify the power supply See Verify a Power Supply on page 144 ...

Page 144: ... 2 Verify that the front and rear Service Required LEDs are not lit See Interpreting LEDs on page 27 3 Consider these options If the previous steps did not clear the fault see Diagnostics Process on page 26 If Step 1 and Step 2 indicate that no faults have been detected then the power supply has been replaced successfully No further action is required Related Information Power Supply Configuration...

Page 145: ... only when four or five fan modules are operational These topics describe service procedures for the fan modules in the server Fan Module Configuration on page 146 Fan Module LED on page 147 Determine Which Fan Module Is Faulty on page 147 Remove a Fan Module on page 148 Install a Fan Module on page 151 Verify a Fan Module on page 152 ...

Page 146: ...e 0 2 Fan module 1 3 Fan module 2 4 Fan module 3 5 Fan module 4 Related Information Determine Which Fan Module Is Faulty on page 147 Remove a Fan Module on page 148 Install a Fan Module on page 151 Verify a Fan Module on page 152 ...

Page 147: ...termine if the System Service Required LEDs are illuminated on the front panel or the rear I O module See Interpreting LEDs on page 27 2 Determine if the Server Fan Fail LED on the front panel is illuminated See Front Panel Controls and LEDs on page 29 3 From the rear of the server check the fan module LEDs to identify which fan module needs to be replaced The fan module Service Required LED is il...

Page 148: ...anel Components on page 15 for the locations of the fan modules in the server See Determine Which Fan Module Is Faulty on page 147 to locate a faulty fan module 2 Determine if you can remove the fan module with the server running or not See Fan Module Configuration on page 146 to determine if you can remove a fan module with the server running or if you must shut down the server before removing a ...

Page 149: ...Remove a Fan Module 3 Press the green button to disengage the fan module from the chassis ...

Page 150: ...out of the server Related Information Fan Module Configuration on page 146 Determine Which Fan Module Is Faulty on page 147 Install a Fan Module on page 151 DIMM Configuration Errors on page 72 ...

Page 151: ...ution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Insert the fan module into the empty fan module slot The fan snaps into position with an audible click 2 Power on the server if necessary ...

Page 152: ...r to the server See Connect Power Cords on page 191 Powered on the server See Power On the Server Oracle ILOM on page 192 2 Check the front or rear panel LEDs for the following Green System OK LED lit Amber System Fault LED not lit Amber System Fan Fault LED not lit See Front Panel Controls and LEDs on page 29 and Rear Panel Controls and LEDs on page 30 If these conditions are met continue to Step...

Page 153: ...and to check for faults If faults are reported see Diagnostics Process on page 26 If no faults are reported then the fan module has been replaced successfully Related Information Determine Which Fan Module Is Faulty on page 147 Remove a Fan Module on page 148 Install a Fan Module on page 151 ...

Page 155: ...age 160 Remove a PCIe Card Carrier on page 161 Remove a PCIe Card on page 164 Install a PCIe Card on page 167 Install a PCIe Card Carrier on page 170 Verify a PCIe Card on page 171 Understanding PCIe Root Complex Connections All 16 PCIe slots support PCIe cards with the following characteristics Hot plug low profile adapters x8 Gen1 x8 Gen2 and x8 Gen3 cards In addition the following PCIe slots su...

Page 156: ... the root complex Understanding the relationship of the PCIe root complexes to the PCIe I O fabrics will help you properly assign devices when configuring Oracle VM Server for SPARC logical domains This diagram illustrates the root complex connections between the four CPUs and the 16 PCIe I O slots Each CPU supports all I O root complex fabrics In single PM configurations all 156 SPARC T8 4 Server...

Page 157: ...pci 1 MR1 3 0 x16 Slot 8 pci 30a pci 1 MR1 2 0 x16 Slot 9 pci 30b pci 2 MR2 0 1 x8 Slot 10 pci 30b pci 1 MR2 0 0 x8 Slot 11 pci 30c pci 1 MR2 3 0 x16 Slot 12 pci 30d pci 1 MR3 1 0 x16 Slot 13 pci 30e pci 2 MR3 0 1 x8 Slot 14 pci 30e pci 1 MR3 0 0 x8 Slot 15 pci 30f pci 1 MR3 3 0 x16 Slot 16 pci 310 pci 1 MR3 2 0 x16 If you are reviewing root complex changes after adding a second processor module r...

Page 158: ...fficient For example you may distribute PCIe cards evenly across available root complexes If you are reviewing PCIe installation order after adding a second processor module return to Server Upgrade Process on page 56 Related Information System Schematic on page 40 Understanding PCIe Root Complex Connections on page 155 PCIe Carrier Handle and LEDs on page 159 Determine Which PCIe Card Is Faulty o...

Page 159: ...ss this button again to bring the PCIe card online 3 OK green Indicates the following conditions Off The server is powered off or the PCIe card is not operating You can remove the PCIe card or install a new card On The PCIe card is connected and online Do not insert or remove the card Blinking The PCIe card is powering up or powering down Do not insert or remove the card ...

Page 160: ...n page 164 Install a PCIe Card on page 167 Install a PCIe Card Carrier on page 170 Verify a PCIe Card on page 171 Determine Which PCIe Card Is Faulty The following LEDs are illuminated when a fault is detected System Service Required LEDs on the front panel and rear I O module System PCIe Fault LED on the front panel Service Required LED on card carrier containing the faulty PCIe card 1 Determine ...

Page 161: ...talling a PCIe card that requires a double wide carrier you must remove two adjacent PCIe card carriers Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components Note Removing PCIe card carriers while the server is at the OpenBoot prompt is not supported The server must be either powered off or boo...

Page 162: ...ces and shows their operational status In this case look for the status of the PCIe card you plan to remove This information is listed in the State column For example hotplug list -cv Connection State Description ______________________________________________________________________________ PCIE1 EMPTY PCIe Native PCIE7 ENABLED PCIe Native Device Usage ______________________________________________...

Page 163: ...bles connected to the PCIe card Tip Label the cables to ensure proper connection to the replacement PCIe card 6 Pull the PCIe card carrier handle down to disengage the carrier from the card cage 7 Remove the PCIe card carrier from the server Related Information Understanding PCIe Root Complex Connections on page 155 PCIe Card Configuration on page 158 PCIe Carrier Handle and LEDs on page 159 Deter...

Page 164: ... Card on page 171 Remove a PCIe Card Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Ensure that you have already taken antistatic measures See Prevent ESD Damage on page 49 ...

Page 165: ...Remove a PCIe Card 2 Unlatch and open the PCIe card carrier top cover ...

Page 166: ...n Understanding PCIe Root Complex Connections on page 155 PCIe Card Configuration on page 158 PCIe Carrier Handle and LEDs on page 159 Determine Which PCIe Card Is Faulty on page 160 Remove a PCIe Card Carrier on page 161 Install a PCIe Card on page 167 Install a PCIe Card Carrier on page 170 Verify a PCIe Card on page 171 ...

Page 167: ...ge can cause failure of server components 1 Determine your first step If you are installing a new PCIe card and need an empty PCIe card carrier see Remove a PCIe Card Carrier on page 161 If you are replacing a faulty PCIe card and have already removed its carrier from the server go to Step 2 2 Remove the PCIe card from its packaging ...

Page 168: ...r s connector Caution Do not twist or turn the PCIe card as you insert it into the PCIe card carrier Ensure that the PCIe card s connector is fully seated in the PCIe card carrier s slot and that the notch in the PCIe card s rear bulkhead is seated around the PCIe carrier s alignment tab ...

Page 169: ...mplex Connections on page 155 PCIe Card Configuration on page 158 PCIe Carrier Handle and LEDs on page 159 Determine Which PCIe Card Is Faulty on page 160 Remove a PCIe Card Carrier on page 161 Remove a PCIe Card on page 164 Install a PCIe Card Carrier on page 170 Verify a PCIe Card on page 171 ...

Page 170: ... be powered off or booted into the Oracle Solaris OS 1 Insert the PCIe card carrier into the card cage until it stops Caution Do not press on the PCIe back panel or force the PCIe card carrier into the card cage 2 Close the PCIe carrier handle Rotate the handle up until it latches into place 3 Reconnect the cables to the PCIe card 4 Determine your next step If you replaced or installed a PCIe card...

Page 171: ... Root Complex Connections on page 155 PCIe Card Configuration on page 158 PCIe Carrier Handle and LEDs on page 159 Determine Which PCIe Card Is Faulty on page 160 Remove a PCIe Card Carrier on page 161 Remove a PCIe Card on page 164 Install a PCIe Card on page 167 Verify a PCIe Card on page 171 Verify a PCIe Card 1 Verify that the Fault LED is not illuminated on the PCIe card 2 Verify that the Sys...

Page 172: ...E1 EMPTY PCIe Native PCIE7 ENABLED PCIe Native Device Usage ___________________________________________________________________________ SUNW qlc 0 fp disk fp 0 0 SUNW qlc 0 1 fp disk fp 0 0 PCIE13 EMPTY PCIe Native PCIE15 EMPTY PCIe Native Related Information Understanding PCIe Root Complex Connections on page 155 PCIe Card Configuration on page 158 PCIe Carrier Handle and LEDs on page 159 Determi...
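The State column of the hotplug listing can also be read programmatically. The following Python sketch is illustrative only: it assumes each connection row starts with the connection name followed by its state, and that the states are limited to the set named in the code, which is an assumption rather than something this section states. The function name `connection_states` is hypothetical, and the sample text is modeled on the example above with assumed punctuation.

```python
# Assumed set of connection states; only EMPTY and ENABLED appear in the
# example in this section, the others are an assumption.
STATES = {"EMPTY", "PRESENT", "POWERED", "ENABLED"}


def connection_states(output):
    """Map each connection name to its state from captured hotplug output."""
    states = {}
    for line in output.splitlines():
        parts = line.split()
        # Connection rows look like "<name> <STATE> <description>".
        # Header, separator, and device-usage rows fail this test and are skipped.
        if len(parts) >= 2 and parts[1] in STATES:
            states[parts[0]] = parts[1]
    return states


SAMPLE = """\
Connection State Description
PCIE1 EMPTY PCIe-Native
PCIE7 ENABLED PCIe-Native
Device Usage
SUNW,qlc@0 - fp - disk - fp@0,0 -
PCIE13 EMPTY PCIe-Native
PCIE15 EMPTY PCIe-Native
"""

if __name__ == "__main__":
    for name, state in connection_states(SAMPLE).items():
        print(name, state)
```

After installing a card, a slot that still reports EMPTY in this mapping is a prompt to reseat the carrier or follow the diagnostics process.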

Page 173: ...sconnect Power Cords on page 53 These topics describe service procedures for the rear I O module in the server Rear I O Module LEDs on page 173 Determine if the Rear I O Module Is Faulty on page 176 Remove the Rear I O Module on page 176 Install the Rear I O Module on page 178 Verify the Rear I O Module on page 181 Rear I O Module LEDs The LEDs on the rear I O module give server status information...

Page 174: ...ing conditions Blinking A link is established Off No link is established 4 NET link and activity green Indicates the following conditions On A link is established Blinking Transfer activity is present on the link Off No link is established 5 NET speed amber green Indicates the following conditions Green on The link is operating as a 100 Mbps connection Off There is no link 6 AC1 connector LED ambe...

Page 175: ...red on and is running in its normal operating state. No service actions are required. Fast blink: Server is running in standby mode and can be quickly returned to full function. Slow blink: A normal but transitory activity is taking place. Slow blinking might indicate that server diagnostics are running or the server is booting. 10. Service Processor LED (SP). Indicates the following conditions: Off: The AC po...

Page 176: ...le LEDs on page 173; “Remove the Rear I/O Module” on page 176; “Install the Rear I/O Module” on page 178; “Verify the Rear I/O Module” on page 181. Remove the Rear I/O Module: The rear I/O module is a cold-service component that can be replaced by a customer. 1. Take the necessary ESD precautions. See “Prevent ESD Damage” on page 49. 2. Locate the failed rear I/O module. See “Rear Panel Components” on page 15 for the ...

Page 177: ...ed to the ports on the rear I/O module, and then disconnect the cables from the ports. You will reconnect the cables to the same ports on the replacement rear I/O module. 6. Press the green buttons on the rear I/O module ejection levers and spread the levers open to eject the rear I/O module. ...

Page 178: ... eUSB devices on both boards. Related Information: “Preparing for Service” on page 43; “Rear I/O Module LEDs” on page 173; “Determine if the Rear I/O Module Is Faulty” on page 176; “Install the Rear I/O Module” on page 178; “Verify the Rear I/O Module” on page 181. Install the Rear I/O Module: 1. Take the necessary ESD precautions. ...

Page 179: ...stall the Rear I/O Module. See “Prevent ESD Damage” on page 49. 2. With the levers in the extended position, insert the rear I/O module into the slot at the rear of the server. ...

Page 180: ... click into place to fully seat the rear I/O module into the server. 4. Connect the cables to the appropriate ports on the rear I/O module. 5. Connect the power cords. See “Connect Power Cords” on page 191. 6. Power on the server. ...

Page 181: ... Connect Power Cords on page 191. Started the system. See “Power On the Server (Oracle ILOM)” on page 192. 2. Verify that the System Service Required LED on the rear I/O module is not lit. See “Rear I/O Module LEDs” on page 173. 3. Log in to Oracle ILOM. See “Log In to Oracle ILOM (Service)” on page 32. 4. Start the faultmgmt shell: -> start /SP/faultmgmt/shell Are you sure you want to start the faultmgmt shell (y/n)? y fau...

Page 182: ...cting and Managing Faults on page 25; “Rear I/O Module LEDs” on page 173; “Determine if the Rear I/O Module Is Faulty” on page 176; “Remove the Rear I/O Module” on page 176; “Install the Rear I/O Module” on page 178. ...

Page 183: ...nent. See “Disconnect Power Cords” on page 53. “Rear Chassis Subassembly Components” on page 183; “Remove the Rear Chassis Subassembly” on page 184; “Install the Rear Chassis Subassembly” on page 187; “Verify the Rear Chassis Subassembly” on page 188. Related Information: “Identifying Components” on page 13; “Detecting and Managing Faults” on page 25; “Preparing for Service” on page 43; Returning the Server to Operation on...

Page 184: ...bly 1. Verify that the rear chassis subassembly needs to be replaced. Use the server software to determine if the rear chassis subassembly needs to be replaced. See “Detecting and Managing Faults” on page 25 for more information. 2. Power off the server. See “Removing Power From the Server” on page 50. 3. Disconnect the power cords. See “Disconnect Power Cords” on page 53. 4. Go to the rear of the server and remov...

Page 185: ...omponents into the replacement rear chassis subassembly once you have replaced the faulty subassembly. 5. Go to the front of the server and remove the following components: both processor modules or processor filler modules (see “Remove a Processor Module or Processor Filler Module” on page 60); main module (see “Remove the Main Module” on page 99); all four power supplies (see “Remove a Power Supply” on page 139)...

Page 186: ...e Rear Chassis Subassembly 7. Slide the rear chassis subassembly out and away from the server. Related Information: “Install the Rear Chassis Subassembly” on page 187. ...

Page 187: ...n module (see “Install the Main Module” on page 102); both processor modules or processor filler modules (see “Install a Processor Module or Processor Filler Module” on page 63). 5. Go to the rear of the server and install the following components: rear I/O module (see “Install the Rear I/O Module” on page 178); all PCIe carriers or fillers (see “Install a PCIe Card Carrier” on page 170). Verify that you are installing...

Page 188: ...he following: Applied power to the server. See “Connect Power Cords” on page 191. Started the system. See “Power On the Server (Oracle ILOM)” on page 192. 2. Log in to Oracle ILOM. See “Log In to Oracle ILOM (Service)” on page 32. 3. Start the faultmgmt shell: -> start /SP/faultmgmt/shell Are you sure you want to start the faultmgmt shell (y/n)? y faultmgmtsp> 4. Use the fmadm faulty command to determine if the server is ope...
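The verification steps in the excerpt above (start the fault management shell, confirm, then run fmadm faulty) can be sketched as a small prompt-driven helper. This is an illustrative Python sketch, not an Oracle utility: the function `next_input` is hypothetical, and it assumes the Oracle ILOM prompt is `->`, the fault management shell prompt is `faultmgmtsp>`, and the confirmation question ends in `(y/n)?`.

```python
from typing import Optional

# Prompt strings are assumptions based on the session text in the excerpt.
ILOM_PROMPT = "->"
FAULT_SHELL_PROMPT = "faultmgmtsp>"
CONFIRM_SUFFIX = "(y/n)?"


def next_input(last_line: str) -> Optional[str]:
    """Pick what to type next, given the last line printed by the session."""
    line = last_line.strip()
    if line.endswith(CONFIRM_SUFFIX):
        return "y"  # confirm starting the faultmgmt shell
    if line.endswith(FAULT_SHELL_PROMPT):
        return "fmadm faulty"  # list any faults still recorded
    if line.endswith(ILOM_PROMPT):
        return "start /SP/faultmgmt/shell"
    return None  # unrecognized output: do nothing
```

A caller would feed each line of session output to `next_input` and stop after `fmadm faulty` has been sent once, because the fault management shell prompt reappears after the command completes.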

Page 189: ...y Related Information: “Detecting and Managing Faults” on page 25; “Rear I/O Module LEDs” on page 173; “Remove the Rear Chassis Subassembly” on page 184; “Install the Rear Chassis Subassembly” on page 187. ...

Page 191: ...nent Service Task Reference on page 22. Related Information: “Identifying Components” on page 13; “Detecting and Managing Faults” on page 25; “Preparing for Service” on page 43. Connect Power Cords: Note: Standby power is applied as soon as the power cords are connected. Depending on how the firmware is configured, the server might boot automatically. 1. Locate the AC connectors on the rear of the server. See Rear ...

Page 192: ... 1. Check the server power state. Type: -> show /System power_state /System Properties: power_state = Off 2. If the server is powered off, power on the server. Type: -> start /System Starting /System 3. (Optional) To view server boot output, start a host console stream. Type: -> start /HOST/console 4. If you are adding a second processor module, return to “Server Upgrade Process” on page 56. Related Information: Connect Power Cords ...
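The power-on procedure in the excerpt above amounts to a small decision: read power_state, and start the system only if it is Off. The following Python sketch illustrates that logic under stated assumptions. The function names are hypothetical, and the command strings assume the Oracle ILOM targets are written /System and /HOST/console (the excerpt shows them without slashes).

```python
import re
from typing import List, Optional


def parse_power_state(show_output: str) -> Optional[str]:
    """Extract the power_state value from the output of the show command."""
    # Take the last match so that an echoed command line containing
    # "power_state" does not shadow the actual property value.
    matches = re.findall(r"power_state\s*=?\s*(\w+)", show_output)
    return matches[-1] if matches else None


def power_on_commands(show_output: str, view_console: bool = False) -> List[str]:
    """Return the ILOM commands to type next, following the steps above."""
    commands: List[str] = []
    state = parse_power_state(show_output)
    if state is not None and state.lower() == "off":
        commands.append("start /System")  # power on only when the state is Off
    if view_console:
        commands.append("start /HOST/console")  # optional boot-output stream
    return commands
```

For example, passing the text `power_state = Off` yields the single command `start /System`, while a state of On yields no power-on command.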

Page 193: ...stem recovery. AWG: American wire gauge. B. BMC: Baseboard management controller. BOB: Memory buffer on board. C. chassis: Server enclosure. CMA: Cable management arm (SPARC T7-1 and SPARC T7-2); cable management assembly (SPARC T7-4). CMP: Chip multiprocessor. CRU: Customer-replaceable unit. D. DHCP: Dynamic Host Configuration Protocol. ...

Page 194: ...uns the Oracle Solaris OS and other applications. The term host is used to distinguish the primary computer from the SP. See SPM. hot-pluggable: Describes a component that can be replaced with power applied, but the component must be prepared for removal. hot-swappable: Describes a component that can be replaced with power applied, and no preparation is required. I. ID PROM: Chip that contains system informa...

Page 195: ... used for remote access configuration management. See Oracle ILOM and SDM. name name space: Top-level Oracle ILOM target. NEBS: Network Equipment-Building System (Netra products only). NET MGT: Network management port. An Ethernet port on the server SP, the server module SP, and the CMM. NIC: Network interface card or controller. NMI: Nonmaskable interrupt. NVMe: Non-volatile memory express controller. The optional N...

Page 196: ...VM Server for SPARC: Virtualization server for SPARC platforms. P. PCI: Peripheral component interconnect. PCIe: PCI Express, an industry-standard bus architecture that supports high-bandwidth peripherals and I/O devices. POST: Power-on self-test. PROM: Programmable read-only memory. PSH: Predictive self healing. S. SAS: Serial attached SCSI. SCC: System configuration chip. SCC PROM: System configuration chip on prog...

Page 197: ...The SPM processes Oracle ILOM commands, providing lights-out management control of the host. See host. SSD: Solid-state drive. SSH: Secure shell. T. TIA: Telecommunications Industry Association (Netra products only). Tma: Maximum ambient temperature. U. U.S. NEC: United States National Electrical Code. UCP: Universal connector port. UI: User interface. UL: Underwriters Laboratory Inc. UTC: Coordinated Universal Time. UUID ...

Page 199: ...onfiguration reference AC power connectors 135 DIMMs 70 hard drives 87 power supplies 135 configuring how POST runs 37 customer replaceable components CRUs 46 D DIMMs addresses 73 configuration errors 72 configuration reference 70 identifying 71 installing 81 locating 18 locating faulty using DIMM Fault Remind button 76 using PSH 74 NAC names 73 rank classification 71 removing 78 verifying 84 dmes...

Page 200: ...s subassembly 187 rear I/O module 178 SCC PROM 126 SPM 121 K Knowledge Base 25 Knowledge Base articles 33 L LEDs AC power connectors 137 front panel 29 hard drives 89 NET Link and Activity 30 Net Management Link and Activity 30 Net Management Speed 30 NET Speed 30 PCIe carriers 159 power supplies 137 processor modules 58 rear I/O module 30 173 SP 30 System Locator 29 30 System Overtemp 29 30 Syste...

Page 201: ...le Solaris PSH checking for faults 33 clearing faults 39 overview 25 Oracle VTS 26 P PCIe cards installation order 158 installing 167 locating faulty 160 removing 164 verifying 171 PCIe carrier extension needed for system cooling by some PCIe cards 170 PCIe carriers installing 170 LEDs 159 locating 15 removing 161 PCIe root complex connections 155 PCIe slots disabled after PM1 failure 158 PM1 fail...

Page 202: ...subassembly 184 rear I/O module 176 SCC PROM 125 SPM 118 S safety information and symbols 43 SCC PROM installing 126 removing 125 verifying 128 schematic 40 server connecting power cords 191 locating 49 powering off emergency shutdown 52 gracefully with power button 52 using service processor command 51 powering on using start /SYS command 192 service categories 46 53 servicing NVMe switch cards 107...

Page 203: ...S 26 UUID 33 V /var/adm/messages file 36 verifying battery 132 DIMMs 84 fan modules 152 hard drives 94 main module 105 NVMe switch cards 114 PCIe cards 171 power supplies 144 processor modules 67 rear I/O module 181 188 SCC PROM 128 SPM 124 viewing system message log files 36 ...
