
View Log Files (Oracle Solaris)

more /var/adm/messages*
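The `more /var/adm/messages*` command pages through every message file in turn. As a rough alternative, the Python sketch below scans the same files for lines that mention a few keywords. The function name `scan_message_files`, the `log_dir` parameter, and the keyword list are illustrative assumptions, not part of Oracle Solaris or of this manual.

```python
import glob
import os


def scan_message_files(log_dir="/var/adm", keywords=("warning", "error", "fault")):
    """Return (file name, line) pairs for lines that mention any keyword.

    Hypothetical helper: the keyword list is an assumption, not an
    Oracle-defined filter. Files are read oldest first, by modification time.
    """
    pattern = os.path.join(log_dir, "messages*")
    paths = sorted(glob.glob(pattern), key=os.path.getmtime)
    lowered = [k.lower() for k in keywords]
    hits = []
    for path in paths:
        # errors="replace" keeps the scan going if a file holds stray bytes
        with open(path, "r", errors="replace") as handle:
            for line in handle:
                text = line.rstrip("\n")
                if any(k in text.lower() for k in lowered):
                    hits.append((os.path.basename(path), text))
    return hits
```

Calling `scan_message_files()` with no arguments reads the default `/var/adm` location. Passing a different `log_dir` might be useful when the message files have been copied off the server for review.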

Related Information

“Check the Message Buffer” on page 36

“View Log Files (Oracle ILOM)” on page 37

View Log Files (Oracle ILOM)

1. View the event log.

-> show /SP/logs/event/list

2. View the audit log.

-> show /SP/logs/audit/list

Related Information

“Check the Message Buffer” on page 36

“View Log Files (Oracle Solaris)” on page 36

Configuring POST

These topics explain how to configure POST as a diagnostic tool.

“POST Overview” on page 37

“Configure POST” on page 38

POST Overview

POST is a group of PROM-based tests that run when the server is powered on or when it is reset. POST checks the basic integrity of the critical hardware components in the server.

You can also set other Oracle ILOM properties to control various other aspects of POST operations. For example, you can specify the events that cause POST to run, the level of testing


Summary of Contents for SPARC T7-4

Page 1: ...SPARC T7 4 Server Service Manual Part No E54994 07 May 2017 ...


Page 3: ...rmation management applications It is not developed or intended for use in any inherently dangerous applications including applications that may create a risk of personal injury If you use this software or hardware in dangerous applications then you shall be responsible to take all appropriate fail safe backup redundancy and other measures to ensure its safe use Oracle Corporation and its affiliat...

Page 4: ...n des informations Ce logiciel ou matériel n est pas conçu ni n est destiné à être utilisé dans des applications à risque notamment dans des applications pouvant causer un risque de dommages corporels Si vous utilisez ce logiciel ou ce matériel dans le cadre d applications dangereuses il est de votre responsabilité de prendre toutes les mesures de secours de sauvegarde de redondance et autres mesu...

Page 5: ...Supported Storage and Backup Devices 22 Component Service Task Reference 22 Detecting and Managing Faults 25 Understanding Diagnostics 25 PSH Overview 25 Diagnostics Process 26 Checking for Faults 27 Interpreting LEDs 27 Log In to Oracle ILOM Service 32 Check for Faults 33 Interpreting Log Files and System Messages 35 Check the Message Buffer 36 View Log Files Oracle Solaris 36 View Log Files Orac...

Page 6: ...r Off the Server 51 Power Off the Server Oracle ILOM 51 Power Off the Server Power Button Graceful Shutdown 52 Power Off the Server Power Button Emergency Shutdown 52 Disconnect Power Cords 53 Attachment of Devices During Service 54 Servicing Processor Modules 55 Server Upgrade Process 56 Processor Module Configuration 57 Processor Module LEDs 59 Determine Which Processor Module Is Faulty 60 Remov...

Page 7: ...rd Drives 87 Hard Drive Configuration 87 Hard Drive LEDs 89 Determine Which Hard Drive Is Faulty 90 Remove a Hard Drive 90 Install a Hard Drive 94 Verify a Hard Drive 95 Servicing the Main Module 99 Main Module LEDs 100 Determine if the Main Module Is Faulty 101 Remove the Main Module 101 Install the Main Module 105 Verify the Main Module 108 Servicing NVMe Switch Cards 111 Disconnect the NVMe Cab...

Page 8: ...Verify the Battery 140 Servicing the Front I O Assembly 143 Remove the Front I O Assembly 143 Install the Front I O Assembly 145 Servicing Power Supplies 147 Power Supply Configuration 147 Power Supply and AC Power Connector LEDs 150 Determine Which Power Supply Is Faulty 151 Remove a Power Supply 151 Install a Power Supply 153 Verify a Power Supply 156 Servicing Fan Modules 157 Fan Module Configu...

Page 9: ...180 Verify a PCIe Card 181 Servicing the Rear I O Module 183 Rear I O Module LEDs 183 Determine if the Rear I O Module Is Faulty 186 Remove the Rear I O Module 186 Install the Rear I O Module 188 Verify the Rear I O Module 190 Servicing the Rear Chassis Subassembly 193 Rear Chassis Subassembly Components 193 Remove the Rear Chassis Subassembly 194 Install the Rear Chassis Subassembly 197 Verify th...


Page 11: ... providers Required knowledge Advanced experience troubleshooting and replacing hardware Product Documentation Library Documentation and resources for this product and related products are available at http www oracle com goto t7 4 docs Feedback Provide feedback about this documentation at http www oracle com goto docfeedback Using This Documentation 11 ...


Page 13: ... Rear Panel Components Service on page 16 Chassis Subassembly Components on page 18 Processor Module Components on page 19 Main Module Components on page 20 Supported Storage and Backup Devices on page 22 Component Service Task Reference on page 22 System Schematic on page 41 Related Information Detecting and Managing Faults Preparing for Service Returning the Server to Operation Identifying Compo...

Page 14: ...onents on page 19 Servicing Processor Modules on page 55 2 Control panel Detecting and Managing Faults on page 25 Preparing for Service on page 43 Returning the Server to Operation on page 201 3 Main module Main Module Components on page 20 Servicing the Main Module on page 99 4 Power supplies 4 Servicing Power Supplies on page 147 14 SPARC T7 4 Server Service Manual May 2017 ...

Page 15: ...rvice on page 16 Chassis Subassembly Components on page 18 Processor Module Components on page 19 Main Module Components on page 20 Supported Storage and Backup Devices on page 22 Component Service Task Reference on page 22 System Schematic on page 41 Identifying Components 15 ...

Page 16: ...s 4 Preparing for Service on page 43 3 Rear I O module Servicing the Rear I O Module on page 183 4 PCIe carriers 16 Servicing PCIe Cards on page 165 These components are accessible within the rear chassis subassembly which you can access after you have removed all the components from the rear of the server 16 SPARC T7 4 Server Service Manual May 2017 ...

Page 17: ...ing the Rear Chassis Subassembly on page 193 Related Information Front Panel Components Service on page 14 Chassis Subassembly Components on page 18 Processor Module Components on page 19 Main Module Components on page 20 Supported Storage and Backup Devices on page 22 Component Service Task Reference on page 22 System Schematic on page 41 Identifying Components 17 ...

Page 18: ... indicators Front Panel Controls and LEDs on page 29 5 Processor modules 2 Servicing Processor Modules on page 55 6 Chassis 7 Rear chassis subassembly RCSA Servicing the Rear Chassis Subassembly on page 193 8 Fan modules 5 Servicing Fan Modules on page 157 9 PCIe carriers 16 Servicing PCIe Cards on page 165 10 Rear I O module Servicing the Rear I O Module on page 183 11 Power supplies 4 Servicing ...

Page 19: ...ponents on page 19 Main Module Components on page 20 Supported Storage and Backup Devices on page 22 Component Service Task Reference on page 22 System Schematic on page 41 Processor Module Components These components are accessible within the processor module when you remove the processor module from the front of the server Identifying Components 19 ...

Page 20: ...e on page 16 Chassis Subassembly Components on page 18 Main Module Components on page 20 Supported Storage and Backup Devices on page 22 Component Service Task Reference on page 22 System Schematic on page 41 Main Module Components These components are accessible after you remove the main module from the front of the server 20 SPARC T7 4 Server Service Manual May 2017 ...

Page 21: ...ne Servicing the Drive Backplane on page 119 4 Main module motherboard 5 SPM Servicing the SPM on page 125 6 SCC PROM Servicing the SCC PROM on page 133 7 Battery Servicing the Battery on page 137 8 NVMe cards optional Servicing NVMe Switch Cards on page 111 Related Information Front Panel Components Service on page 14 Rear Panel Components Service on page 16 Identifying Components 21 ...

Page 22: ... SAS 2 The server also supports these types of tape backup and restore devices TCP IP Fibre channel SAS LVD SCSI Related Information Front Panel Components Service on page 14 Rear Panel Components Service on page 16 Chassis Subassembly Components on page 18 Processor Module Components on page 19 Main Module Components on page 20 Component Service Task Reference on page 22 System Schematic on page ...

Page 23: ...x None Servicing Hard Drives on page 87 SPM 1 SYS MB SPM SPM Servicing the SPM on page 125 SCC PROM 1 SYS MB SCC None Servicing the SCC PROM on page 133 Battery 1 SYS MB BAT None Servicing the Battery on page 137 Front I O assembly 1 SYS FIO None Servicing the Front I O Assembly on page 143 Power supply 4 SYS PSx System Power Power_Supplies Power_Supply_x Servicing Power Supplies on page 147 Fan m...

Page 24: ...Chassis Subassembly Components on page 18 Processor Module Components on page 19 Main Module Components on page 20 Supported Storage and Backup Devices on page 22 System Schematic on page 41 24 SPARC T7 4 Server Service Manual May 2017 ...

Page 25: ... a Fault Manually on page 40 Related Information Identifying Components on page 13 Component Service Categories on page 46 Preparing for Service on page 43 Returning the Server to Operation on page 201 Understanding Diagnostics These topics explain the diagnostic process and tools PSH Overview on page 25 Diagnostics Process on page 26 PSH Overview The PSH feature provides problem diagnosis on the ...

Page 26: ...formation Diagnostics Process on page 26 Checking for Faults on page 27 Diagnostics Process This table describes the diagnostics process Step Diagnostic Action Possible Outcome Links 1 Check the server for detected faults using these tools System LEDs on the front and rear panels fmadm faultycommand from the Oracle Solaris prompt or through the Oracle ILOM fault management shell Determine the faul...

Page 27: ...ods to check for faults Interpreting LEDs on page 27 Log In to Oracle ILOM Service on page 32 Check for Faults on page 33 Interpreting LEDs Use these steps to determine if an LED indicates that a component has failed in the server Steps Description Links 1 Check the LEDs on the front and rear of the server Front Panel Controls and LEDs on page 29 Rear Panel Controls and LEDs on page 31 2 Check the...

Page 28: ...IMMs on page 74 Determine Which Hard Drive Is Faulty on page 90 Determine Which Power Supply Is Faulty on page 151 Determine Which Fan Module Is Faulty on page 158 Determine Which PCIe Card Is Faulty on page 170 Determine if the Rear I O Module Is Faulty on page 186 Related Information Front Panel Controls and LEDs on page 29 Rear Panel Controls and LEDs on page 31 28 SPARC T7 4 Server Service Man...

Page 29: ...e the Server on page 49 2 Server Service Required LED amber The fmadm faulty command provides details about any faults that cause this indicator to light See Check for Faults on page 33 Under some fault conditions individual component fault LEDs are lit in addition to the Server Service Required LED 3 Power OK LED green Indicates these conditions Off Server is not running in its normal state Serve...

Page 30: ...D amber Indicates these conditions Off Indicates a steady state no service action is required Steady on Indicates that a temperature failure event has been acknowledged and a service action is required 6 Fan Module Fault LED amber Rear FM Indicates these conditions Off Indicates a steady state no service action is required Steady on Indicates that a fan module failure event has been acknowledged a...

Page 31: ... MGT port speed LED Indicates these conditions Off The link is operating as a 10 Mbps connection On or blinking The link is operating as a 100 Mbps connection 4 Network port link LED Indicates these conditions Off No link is established Blinking A link is established 5 Network port speed LED Indicates these conditions Off The link is operating as a 10 Mbps connection or there is no link Amber on T...

Page 32: ...l operating state No service actions are required Fast blink Server is running in standby mode and can be quickly returned to full function Slow blink A normal but transitory activity is taking place Slow blinking might indicate that system diagnostics are running or that the system is booting 10 SP LED SP Indicates these conditions Off AC power might have been connected to the power supplies Stea...

Page 33: ...m You can still use the 3 0 legacy names in commands at any time but to expose the legacy names in the output you must enable them This manual uses the legacy names in the command examples and shows the names in the output examples For more information about the new name spaces see the Oracle ILOM documentation Related Information Interpreting LEDs on page 27 Check for Faults on page 33 Check for ...

Page 34: ...ation Name TLA PN NRM M7 1 2 Part_Number 7061001 Revision 01 Serial_Number 465769T 12445102WR Chassis Manufacturer Oracle Corporation Name SPARC T7 4 Part_Number 12345678 13 2 Serial_Number 1248DC140 Description A fault has been diagnosed by the Host Operation System Response The service required LED on the chassis and on the affected FRU may be illuminated Impact No SPM impact Action Refer to the...

Page 35: ...tps support oracle com and search on the message ID in the Knowledge tab 5 Follow the suggested actions to repair the fault 6 If necessary clear the fault manually See Clear a Fault Manually on page 40 Related Information PSH Overview on page 25 Clear a Fault Manually on page 40 Interpreting Log Files and System Messages With the OS running on the server you have the full complement of Oracle Sola...

Page 36: ...y records various system warnings, errors, and faults in message files. These messages can alert you to system problems, such as a device that is about to fail. The /var/adm directory contains several message files. The most recent messages are in the /var/adm/messages file. After a period of time (usually every week), a new messages file is automatically created. The original contents of the messages file are...

Page 37: ...on page 36 Configuring POST These topics explain how to configure POST as a diagnostic tool POST Overview on page 37 Configure POST on page 38 POST Overview POST is a group of PROM based tests that run when the server is powered on or when it is reset POST checks the basic integrity of the critical hardware components in the server You can also set other Oracle ILOM properties to control various o...

Page 38: ...e ILOM Service on page 32 2 Set the virtual keyswitch to the value that corresponds to the POST configuration you want to run This example sets the virtual keyswitch default_level to min which configures POST to run according to other parameter values set HOST keyswitch_state min Set default_level to min For possible values for the keyswitch_state parameter type show HOST diag help HOST diag Manag...

Page 39: ...y Possible values none min normal max hw_change_verbosity User role required for set r Note Depending on the server configuration setting the HOST keyswitch_state diagnostics verbosity to none might result in no POST test status displaying on the console for an extended period of time 3 You can also set the virtual keyswitch to determine the diagnostic level after an error reset and after a hardwa...

Page 40: ...nually 1 After replacing a faulty FRU power on the server See Returning the Server to Operation on page 201 2 At the host prompt determine whether the replaced FRU still shows a faulty state See Check for Faults on page 33 If no fault is reported you do not need to do anything else Do not perform the subsequent steps If a fault is reported continue to Step 3 3 Clear the fault from all persistent f...

Page 41: ...he fault faultmgmtsp exit reset System Are you sure you want to reset System y Resetting System Related Information PSH Overview on page 25 Check for Faults on page 33 System Schematic This schematic shows the connections between and among specific components and device slots You can use this schematic to determine optimum locations for any optional cards or other peripherals based on system confi...

Page 42: ...anel Components Service on page 16 Chassis Subassembly Components on page 18 Processor Module Components on page 19 Main Module Components on page 20 Supported Storage and Backup Devices on page 22 Component Service Task Reference on page 22 42 SPARC T7 4 Server Service Manual May 2017 ...

Page 43: ...r on page 50 8 Gain access to service components Chassis Subassembly Components on page 18 Safety Information For your protection observe the following safety precautions when setting up your equipment Follow all cautions and instructions marked on the equipment and described in the documentation shipped with your server Follow all cautions and instructions marked on the equipment and described in...

Page 44: ...oards and hard drives contain electronic components that are extremely sensitive to static electricity Ordinary amounts of static electricity from clothing or the work environment can destroy the components located on these boards Do not touch the components along their connector edges Caution You must disconnect all power supplies before servicing any of the components that are inside the chassis...

Page 45: ...n page 49 Removing Power From the Server on page 50 Tools Needed for Service You will need the following tools for most service operations Antistatic wrist strap Antistatic mat No 1 Phillips screwdriver No 2 Phillips screwdriver No 1 flat blade screwdriver battery removal Related Information Safety Information on page 43 Tools Needed for Service on page 45 Component Fillers on page 46 Component Se...

Page 46: ...on Safety Information on page 43 Tools Needed for Service on page 45 Component Service Categories on page 46 Find the Server Serial Number on page 47 Locate the Server on page 49 Prevent ESD Damage on page 49 Removing Power From the Server on page 50 Component Service Categories Replaceable components fall into these categories Hot serviceable by the customer Hot serviceable components can be remo...

Page 47: ... page 143 Power supply Off or On Servicing Power Supplies on page 147 Fan module Off or On Servicing Fan Modules on page 157 PCIe card Off or On Servicing PCIe Cards on page 165 Rear I O module Off X Servicing the Rear I O Module on page 183 Rear chassis subassembly Off X Servicing the Rear Chassis Subassembly on page 193 You must disconnect the power cords before accessing this component Related I...

Page 48: ...keyswitch_state Normal product_name T5 4 product_part_number 602 1234 01 product_serial_number 0723BBC006 fault_state OK clear_fault_action none power_state On Commands cd reset set show start stop Related Information Safety Information on page 43 Tools Needed for Service on page 45 Component Fillers on page 46 Component Service Categories on page 46 Locate the Server on page 49 Prevent ESD Damage...

Page 49: ...ype set SYS LOCATE value Off Related Information Safety Information on page 43 Tools Needed for Service on page 45 Component Fillers on page 46 Component Service Categories on page 46 Find the Server Serial Number on page 47 Prevent ESD Damage on page 49 Removing Power From the Server on page 50 Prevent ESD Damage Many components contained in the processor modules and main module can be damaged by...

Page 50: ...ge 69 Servicing the Main Module on page 99 Servicing the Drive Backplane on page 119 Servicing the SPM on page 125 Servicing the SCC PROM on page 133 Servicing the Battery on page 137 Servicing the Front I O Assembly on page 143 Servicing PCIe Cards on page 165 Servicing the Rear I O Module on page 183 Servicing the Rear Chassis Subassembly on page 193 Removing Power From the Server These topics d...

Page 51: ...er See Power Off the Server Oracle ILOM on page 51 Power Off the Server Power Button Graceful Shutdown on page 52 Power Off the Server Power Button Emergency Shutdown on page 52 Related Information Prepare to Power Off the Server on page 51 Disconnect Power Cords on page 53 Power Off the Server Oracle ILOM You can use the SPM to perform a graceful shutdown of the server This type of shutdown ensur...

Page 52: ...This procedure places the server in the power standby mode 1 Press and release the recessed Power button The Power OK LED blinks rapidly 2 If you are powering off the server in order to add a second processor module return to Server Upgrade Process on page 56 Related Information Power Off the Server Oracle ILOM on page 51 Power Off the Server Power Button Emergency Shutdown on page 52 Power Off th...

Page 53: ...wer Off the Server Oracle ILOM on page 51 Power Off the Server Power Button Graceful Shutdown on page 52 Power Off the Server Power Button Emergency Shutdown on page 52 2 Disconnect all power cords from the server Caution Because standby power is always present in the system you must unplug the power cords before accessing certain components Related Information Safety Information on page 43 Tools ...

Page 54: ...ou plan to connect to the Oracle ILOM software over the network connect an Ethernet cable to the Ethernet port labeled NET MGT Note The SP uses the NET MGT out of band port by default You can configure the SP to share one of the server's four Ethernet ports instead The SP uses only the configured Ethernet port If you plan to access the Oracle ILOM CLI through the management port connect a serial nu...

Page 55: ...le on page 64 Verify a Processor Module on page 67 Learn the process for upgrading the server from a single processor module configuration to a two processor module configuration Server Upgrade Process on page 56 Remove the processor module as part of another component s service operation Remove a Processor Module or Processor Filler Module on page 60 Install the processor module as part of anothe...

Page 56: ...rom the new processor module Remove a Processor Module or Processor Filler Module on page 60 3 Remove all of the DIMM fillers in the processor module The steps to remove the DIMM fillers are the same as the steps for removing DIMMs Remove a DIMM on page 78 4 Verify that you have the correct DIMMs for your server All of the DIMMs must be either 16 or 32 GB and they must match the size and capacity ...

Page 57: ...e 168 Related Information Processor Module Components on page 19 System Schematic on page 41 Detecting and Managing Faults on page 25 Removing Power From the Server on page 50 Servicing DIMMs on page 69 Processor Module Configuration on page 57 Remove a Processor Module or Processor Filler Module on page 60 Install a Processor Module or Processor Filler Module on page 64 Verify a Processor Module ...

Page 58: ... PM1 or processor filler module 2 Processor Module 0 PM0 Note In servers with two processor modules installed DIMMs configurations in both processor modules must be identical See Understanding DIMM Configurations on page 69 58 SPARC T7 4 Server Service Manual May 2017 ...

Page 59: ... On The server is running and the processor module is functioning correctly Off The server is powered down and the processor module is in standby mode Related Information Processor Module Components on page 19 Server Upgrade Process on page 56 Determine Which Processor Module Is Faulty on page 60 Remove a Processor Module or Processor Filler Module on page 60 Install a Processor Module or Processo...

Page 60: ...replaced 3 Remove the faulty processor module See Remove a Processor Module or Processor Filler Module on page 60 Related Information Processor Module Components on page 19 Processor Module LEDs on page 59 Remove a Processor Module or Processor Filler Module on page 60 Install a Processor Module or Processor Filler Module on page 64 Verify a Processor Module on page 67 Remove a Processor Module or...

Page 61: ...e 43 2 Ensure that the server is powered off See Removing Power From the Server on page 50 3 Disconnect the power cords See Disconnect Power Cords on page 53 4 Locate the processor module in the server that you want to remove If you are replacing a faulty processor module see Determine Which Processor Module Is Faulty on page 60 to locate a faulty processor module If you are adding a processor mod...

Page 62: ...vers in toward the server and pull the extraction levers out to disengage the processor module or processor filler module from the server 6 Pull the processor module or processor filler module halfway out of the server and close the levers 62 SPARC T7 4 Server Service Manual May 2017 ...

Page 63: ...tistatic mat Caution Do not touch the connectors at the rear of the module 8 Determine your next step If you are replacing or installing DIMMs within the processor module see Servicing DIMMs on page 69 If you are replacing a faulty processor module populate and install the replacement processor module a Remove all of the DIMMs from the faulty processor module and set them in a safe place See Remov...

Page 64: ...rocess on page 56 Determine Which Processor Module Is Faulty on page 60 Servicing DIMMs on page 69 Install a Processor Module or Processor Filler Module on page 64 Verify a Processor Module on page 67 Install a Processor Module or Processor Filler Module Processor modules are cold service components that can be replaced only by qualified service personnel For the location of the processor modules ...

Page 65: ...d from the faulty processor module into the replacement module See Install a DIMM on page 80 3 Open the latches on the processor module or processor filler module and insert the module into the empty processor module slot in the server Note A processor filler module can only be installed in slot 1 4 Bring the levers together toward the center of the module and press the levers firmly against the m...

Page 66: ...Verify a Processor Module on page 67 7 If you are adding a second processor module to the server return to Server Upgrade Process on page 56 Related Information Processor Module Components on page 19 Server Upgrade Process on page 56 Processor Module LEDs on page 59 Determine Which Processor Module Is Faulty on page 60 Remove a Processor Module or Processor Filler Module on page 60 Servicing DIMMs...

Page 67: ...he PSH detected fault from the server 2 Verify that the OK LED is lit on the processor module and that the Fault LED is not lit See Processor Module LEDs on page 59 3 Verify that the front and rear Service Required LEDs are not lit See Front Panel Controls and LEDs on page 29 and Rear Panel Controls and LEDs on page 31 4 Perform one of the following tasks based on your verification results If the ...

Page 68: ...s on page 19 Processor Module LEDs on page 59 Determine Which Processor Module Is Faulty on page 60 Remove a Processor Module or Processor Filler Module on page 60 Install a Processor Module or Processor Filler Module on page 64 68 SPARC T7 4 Server Service Manual May 2017 ...

Page 69: ... Description Links Understand how to replace DIMMs Understanding DIMM Configurations on page 69 Identifying DIMMs on page 71 Locate a faulty DIMM DIMM Fault Handling on page 74 Determine Which DIMM Is Faulty PSH on page 75 Determine Which DIMM Is Faulty DIMM Fault LEDs on page 76 DIMM Configuration Errors on page 72 Replace a DIMM Remove a DIMM on page 78 Install a DIMM on page 80 Verify a DIMM on...

Page 70: ...Note The DIMM sparing feature is available only in fully populated servers All DIMMs associated with each CMx must be identical same size same rank classification Mixed configurations are supported DIMMs associated with CM0 with one size and DIMMs associated with CM1 with a different size as long as all DIMMs in the server have the same rank classification For example 32 Gbyte 4Rx4 DIMMs associate...

Page 71: ...se these labels to identify the DIMMs installed in the server to verify that any replacement DIMMs are compatible or to confirm that upgrade DIMMs may be installed in a supported configuration The following DIMMs are supported DIMM Capacity DRAM Density Rank Classification Label 16 Gbyte 4 Gbit Dual rank x4 2Rx4 32 Gbyte 4 Gbit Quad rank x4 4Rx4 32 Gbyte 8 Gbit Dual rank x4 2Rx4 64 Gbyte 8 Gbit Qu...

Page 72: ...ch as the following is displayed WARNING Running with a nonstandard DIMM configuration Refer to service document for details In other cases the configuration error is fatal and the following message is displayed Fatal configuration error forcing power down In addition to these general memory configuration errors one or more rule specific messages is displayed indicating the type of configuration e...

Page 73: ...CM0 BOB20 CH1 CM0 BOB00 CH0 CM0 BOB00 CH1 CM0 BOB30 CH1 CM0 BOB30 CH0 CM0 BOB10 CH1 CM0 BOB10 CH0 CM0 BOB31 CH0 CM0 BOB31 CH1 CM0 BOB11 CH0 CM0 BOB11 CH1 CM0 DIMM NAC names are based both on the location of the DIMM slot on the processor module and in which slot the processor module is installed For example the full NAC name for the DIMM installed in the front left corner on a processor module ins...

Page 74: ...u must replace the faulty DIMMs based on the fault message and enable the disabled DIMMs with the Oracle ILOM command set devicecomponent_state enabled where device is the name of the DIMM being enabled PSH technology The Oracle PSH feature uses the Fault Manager daemon fmd to watch for various kinds of faults When a fault occurs the fault is assigned a UUID and logged PSH reports the fault and su...

Page 75: ...page 76 Determine Which DIMM Is Faulty PSH The Oracle Fault Management tool fmadm faulty displays current server faults including DIMM failures 1 Start the Fault Management Shell start SP faultmgmt shell Are you sure you want to start SP faultmgmt shell y n y 2 Type faultmgmtsp fmadm faulty Time UUID msgid Severity 2014 08 18 21 04 40 7040d859 5b03 4a58 8dfd e3a80875d62f SPSUN4V 8000 EJ Critical P...

Page 76: ...rt Impact Total system memory capacity has been reduced and some applications may have been terminated Action Use fmadm faulty to provide a more detailed view of this event Please refer to the associated reference document at http support oracle com msg SPSUN4V 8000 EJ for the latest service procedures and policies regarding this diagnosis Related Information Determine Which DIMM Is Faulty Oracle ...

Page 77: ...age 69 Prepare the system for service See Preparing for Service on page 43 Remove the processor module containing the faulty DIMM Place the processor module on an ESD protect work surface Remove the processor module cover See Remove a Processor Module or Processor Filler Module on page 60 2 Locate the DIMM Fault Remind button on the processor module Servicing DIMMs 77 ...

Page 78: ...y Oracle ILOM on page 75 Determine Which DIMM Is Faulty PSH on page 75 Remove a DIMM DIMMs are cold service components that can be replaced by customers For the location of the DIMMs see Processor Module Components on page 19 Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Consider your...

Page 79: ...y DIMM Fault LEDs on page 76 4 Push down on the ejector tabs on each side of the DIMM until the DIMM is released Caution DIMMs and heat sinks on the motherboard might be hot 5 Grasp the top corners of the faulty DIMM and lift it out of its slot 6 Place the DIMM on an antistatic mat 7 Repeat Step 4 through Step 6 for any other DIMMs you intend to remove 8 Determine your next step Servicing DIMMs 79...

Page 80: ...e 69 Understanding DIMM Configurations on page 69 Determine Which DIMM Is Faulty DIMM Fault LEDs on page 76 Determine Which DIMM Is Faulty PSH on page 75 Install a DIMM on page 80 Verify a DIMM on page 83 Install a DIMM DIMMs are cold service components that can be replaced by customers For the location of the DIMMs see Processor Module Components on page 19 Caution This procedure requires that yo...

Page 81: ...MMs on page 74 See Remove a DIMM on page 78 If you are adding DIMMs to a half populated processor module Ensure you have the correct DIMMs for your server See Identifying DIMMs on page 71 If you are populating a new processor module Ensure you have the correct DIMMs for your server See Understanding DIMM Configurations on page 69 3 Unpack the replacement DIMMs and place them on an antistatic mat 4...

Page 82: ...does not easily seat into the connector check the DIMM s orientation 7 Repeat Step 4 through Step 6 until all new DIMMs are installed 8 Place the cover onto the processor module and slide the cover forward until the latch clicks into place 9 Consider your next steps If you are adding a second processor module to the server return to Server Upgrade Process on page 56 82 SPARC T7 4 Server Service Ma...

Page 83: ...M on page 83 DIMM Fault Handling on page 74 DIMM Configuration Errors on page 72 Verify a DIMM 1 Access the Oracle ILOM prompt Refer to the SPARC T7 Series Server Administration Guide for instructions 2 Use the show faulty command to determine how to clear the fault If show faulty indicates a POST detected fault go to Step 3 If show faulty output displays a UUID which indicates a host detected fau...

Page 84: ... the host has been powered off The console will display status Powered Off Allow approximately one minute before running this command d Switch to the system console to view POST output Watch the POST output for possible fault messages The following output indicates that POST did not detect any faults start HOST console 0 0 0 INFO 0 0 0 POST Passed all devices 0 0 0 POST Return to VBSC 0 0 0 Master...

Page 85: ...43 59 faults 0 If the show faulty command reports a fault with a UUID go on to Step 7 If show faulty does not report a fault with a UUID you are done with the verification process 7 Switch to the system console and use the fmadm repair command with the UUID Use the same UUID that was displayed from the output of the Oracle ILOM show faulty command For example fmadm repair 3aa7c854 9667 e176 efe5 e...
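The UUID handling in Step 7 can be sketched in a few lines of Python. This is only an illustration, under the assumption that the show faulty output has been captured as text and that fault UUIDs use the usual 8-4-4-4-12 hexadecimal layout. The function name `repair_commands`, the sample capture, and the UUID in it are hypothetical and are not taken from the manual.

```python
import re

# Assumed layout of a fault UUID: 8-4-4-4-12 hexadecimal digits.
UUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def repair_commands(show_faulty_text):
    """Return one 'fmadm repair <UUID>' line per distinct UUID, in order."""
    seen = []
    for uuid in UUID_RE.findall(show_faulty_text):
        if uuid.lower() not in seen:
            seen.append(uuid.lower())
    return [f"fmadm repair {uuid}" for uuid in seen]


# Hypothetical capture; the UUID below is invented for illustration only.
sample = """
/SP/faultmgmt/0/faults/0 | uuid | 01234567-89ab-cdef-0123-456789abcdef
/SP/faultmgmt/0/faults/0 | timestamp | 2016-08-22/13:20:16
"""

for command in repair_commands(sample):
    print(command)
```

An empty result corresponds to the case in which show faulty does not report a fault with a UUID, so no fmadm repair command is needed.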

Page 87: ...ver components These topics describe service procedures for the hard drives in the server Hard Drive Configuration on page 87 Hard Drive Configuration on page 87 Hard Drive LEDs on page 89 Determine Which Hard Drive Is Faulty on page 90 Remove a Hard Drive on page 90 Install a Hard Drive on page 94 Verify a Hard Drive on page 95 Hard Drive Configuration You can install a mix of hard drives and sol...

Page 88: ...vents any applications from accessing it and removes logical software links to it You cannot hot service a drive in the following situations If the drive contains the operating system and the operating system is not mirrored on another drive If the drive cannot be logically isolated from the online operations of the server If either of these conditions apply to the drive being serviced you must ta...
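The two conditions that rule out hot service can be written as a small predicate. The following Python sketch is an illustration only. The function and parameter names are hypothetical, and the three yes/no answers must come from your own knowledge of the server configuration.

```python
def can_hot_service(holds_os, os_mirrored, can_isolate):
    """Return True if neither blocking condition from the text applies.

    holds_os:     the drive contains the operating system
    os_mirrored:  the operating system is mirrored on another drive
    can_isolate:  the drive can be logically isolated from online operations
    """
    if holds_os and not os_mirrored:
        return False  # unmirrored OS drive: the drive cannot be hot serviced
    if not can_isolate:
        return False  # drive cannot be logically isolated
    return True


# A mirrored boot drive that can be isolated is a hot service candidate.
print(can_hot_service(holds_os=True, os_mirrored=True, can_isolate=True))
```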

Page 89: ...at the drive has experienced a fault condition 3 OK Activity green Indicates the drive s availability for use On Read or write activity is in progress Off Drive is idle and available for use Related Information Hard Drive Configuration on page 87 Hard Drive Configuration on page 87 Determine Which Hard Drive Is Faulty on page 90 Remove a Hard Drive on page 90 Install a Hard Drive on page 94 Verify...

Page 90: ...emove the faulty drive See Remove a Hard Drive on page 90 Related Information Hard Drive Configuration on page 87 Hard Drive Configuration on page 87 Hard Drive LEDs on page 89 Remove a Hard Drive on page 90 Install a Hard Drive on page 94 Verify a Hard Drive on page 95 Remove a Hard Drive Hard drives are hot service components that can be replaced by customers For the location of the hard drives ...

Page 91: ...t are not configured cfgadm al This command lists dynamically reconfigurable hardware resources and shows their operational status In this case look for the status of the drive you plan to remove This information is listed in the Occupant column Example Ap_id Type Receptacle Occupant Condition c2 scsi sas connected configured unknown c2 w5000cca00a76d1f5 0 disk path connected configured unknown c3...
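To read the Occupant column programmatically, the five-column listing can be split on whitespace. The sketch below assumes the cfgadm output was saved as text and that every data row has exactly five fields. The helper names and the punctuation in the sample rows (which the example above shows only in flattened form) are assumptions.

```python
def parse_cfgadm(text):
    """Turn a five-column cfgadm listing into a list of dictionaries."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything not five columns wide.
        if len(fields) != 5 or fields[0].lower() == "ap_id":
            continue
        ap_id, dev_type, receptacle, occupant, condition = fields
        rows.append({
            "ap_id": ap_id,
            "type": dev_type,
            "receptacle": receptacle,
            "occupant": occupant,
            "condition": condition,
        })
    return rows


def occupant_state(rows, ap_id):
    """Return the Occupant value for one attachment point, or None."""
    for row in rows:
        if row["ap_id"] == ap_id:
            return row["occupant"]
    return None


# Sample rows modeled on the example above; the punctuation is assumed.
listing = """
Ap_id                      Type        Receptacle  Occupant    Condition
c2                         scsi-sas    connected   configured  unknown
c2::w5000cca00a76d1f5,0    disk-path   connected   configured  unknown
"""

rows = parse_cfgadm(listing)
print(occupant_state(rows, "c2::w5000cca00a76d1f5,0"))
```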

Page 92: ...ple You can use this same command to check the state of the drive at other stages of the removal procedure b Disable the NVMe drive hotplug disable SYS DBP NVME0 Check that the drive s state has changed from ENABLED to POWERED hotplug list lc c Power down the NVMe drive hotplug poweroff SYS DBP NVME0 Check that the drive s state has changed from POWERED to PRESENT hotplug list lc In this state the...

Page 93: ...drive from the server Caution The latch is not an ejector Do not force the latch too far to the right Doing so can damage the latch 6 After you remove an NVMe drive check that the drive slot s state has changed to EMPTY hotplug list lc 7 Install the replacement drive or a filler tray Servicing Hard Drives 93 ...
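The state changes described for NVMe drive removal (ENABLED to POWERED, POWERED to PRESENT, then EMPTY after the drive is pulled) can be checked against a small lookup table. This Python sketch is illustrative. The step labels, including `remove` for the physical removal, are hypothetical names, and the states are assumed to have been read from the hotplug listing by hand or by another script.

```python
# Expected (state before, state after) for each removal step in the text.
EXPECTED = {
    "disable": ("ENABLED", "POWERED"),
    "poweroff": ("POWERED", "PRESENT"),
    "remove": ("PRESENT", "EMPTY"),  # physical removal, not a hotplug command
}


def transition_ok(step, before, after):
    """Return True if the observed states match the documented change."""
    if step not in EXPECTED:
        raise ValueError(f"unknown step: {step}")
    return EXPECTED[step] == (before.upper(), after.upper())


print(transition_ok("disable", "ENABLED", "POWERED"))
print(transition_ok("poweroff", "POWERED", "ENABLED"))
```

A False result means the slot did not reach the documented state, so repeat the state check before continuing with the removal.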

Page 94: ...s that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Align the replacement drive to the drive slot and slide the drive in until it is seated Drives are physically addressed according to the slot in which they are installed If you are replacing a drive install the replacement drive in the same slot as the drive that was r...

Page 95: ...ure the hard drive If you hot serviced an NVMe drive it should automatically power up and attach If not power up and attach the drive manually hotplug enable SYS DBP NVME0 Check that the drive s state has changed to ENABLED hotplug list lc 2 If the OS is shut down and the drive you replaced was not the boot device boot the OS Depending on the nature of the replaced drive you might need to perform ...

Page 96: ...ve that you installed See Hard Drive LEDs on page 89 6 At the Oracle Solaris prompt type the cfgadm al command to list all drives in the device tree including any drives that are not configured cfgadm al The replacement drive is now listed as configured For example Ap_id Type Receptacle Occupant Condition c2 scsi sas connected configured unknown c2 w5000cca00a76d1f5 0 disk path connected configure...

Page 97: ...tasks are covered in the Oracle Solaris OS administration documentation For additional drive verification you can run the Oracle VTS software Refer to the Oracle VTS documentation for details Related Information Determine Which Hard Drive Is Faulty on page 90 Remove a Hard Drive on page 90 Install a Hard Drive on page 94 Servicing Hard Drives 97 ...

Page 99: ...ule is faulty Main Module LEDs on page 100 2 Prepare the server for service Preparing for Service on page 43 3 Remove the main module Remove the Main Module on page 101 4 Service main module components Servicing NVMe Switch Cards on page 111 Servicing the Drive Backplane on page 119 Servicing the SPM on page 125 Servicing the SCC PROM on page 133 Servicing the Battery on page 137 Servicing the Fro...

Page 100: ...2 Power OK LED green Indicates these conditions Off System is not running in its normal state System power might be off The SPM might be running Steady on System is powered on and is running in its normal operating state No service actions are required Fast blink System is running in standby mode and can be quickly returned to full function Slow blink A normal but transitory activity is taking pla...

Page 101: ...on the main module The Service Required LED is lit when the server detects a main module fault Related Information Main Module LEDs on page 100 Remove the Main Module on page 101 Install the Main Module on page 105 Remove the Main Module 1 Optional If you are replacing a faulty main module you must back up ILOM configuration settings a Configure the SER MGT port to enable the configuration paramet...

Page 102: ...onnect Power Cords on page 53 4 Locate the main module in the server See Front Panel Components Service on page 14 5 Squeeze the release latches together on the two extraction levers and pull the extraction levers out to disengage the main module from the server 102 SPARC T7 4 Server Service Manual May 2017 ...

Page 103: ...the server Caution Due to the weight of the main module the following step requires two people to perform Do not attempt to lift the main module alone 8 Remove the main module completely from the server 9 Consider your next steps If you have removed the main module to prepare the server for installation return to Preparing for Installation in SPARC T7 4 Server Installation Guide Servicing the Main...

Page 104: ...over back and up off the main module 10 Determine your next step If you are replacing a main module due to a faulty motherboard remove all of the internal components and transfer them to the new main module If you are replacing a component inside the main module use one of the following links Servicing the SPM on page 125 Servicing the Battery on page 137 104 SPARC T7 4 Server Service Manual May 2...

Page 105: ...ng the Drive Backplane on page 119 Related Information Main Module Components on page 20 Main Module LEDs on page 100 Install the Main Module on page 105 Install the Main Module 1 Place the cover back onto the main module and slide the cover forward until the latch clicks into place Servicing the Main Module 105 ...

Page 106: ...ople to perform Do not attempt to lift the main module alone 3 Insert the main module into its slot in the server until the levers begin to engage 4 Press the levers back together toward the center of the module then press the levers firmly against the module to fully seat the module back into the server 106 SPARC T7 4 Server Service Manual May 2017 ...

Page 107: ...r workstation to the SER MGT port The following message is delivered over the serial management port Unrecognized Chassis This module is installed in an unknown or unsupported chassis You must upgrade the firmware to a newer version that supports this chassis 7 Download the system firmware a Configure the SER MGT port to enable the firmware image to be downloaded Refer to the Oracle ILOM documenta...

Page 108: ...ain Module Components on page 20 Main Module LEDs on page 100 Remove the Main Module on page 101 Verify the Main Module 1 Verify that the main module Service Required LED is not lit See Main Module LEDs on page 100 2 Verify that the front and rear System Service Required LEDs are not lit See Front Panel Controls and LEDs on page 29 and Rear Panel Controls and LEDs on page 31 3 Consider these optio...

Page 109: ...Verify the Main Module Remove the Main Module on page 101 Install the Main Module on page 105 Servicing the Main Module 109 ...

Page 111: ...nstalled in the main module If you are replacing a faulty main module you must remove the NVMe switch cards to transfer them to the new main module Part Description 1 NVMe Switch 2 SYS MB PCIE2 PCIESW 2 NVMe Switch 1 SYS MB PCIE1 PCIESW Servicing NVMe Switch Cards 111 ...

Page 112: ...les on page 117 Verify a NVMe Switch Card on page 117 Disconnect the NVMe Cables 1 Remove the main module See Remove the Main Module on page 101 2 Determine your next step If you are replacing a faulty NVMe switch card unplug the NVMe cables from the card If you are moving the NVMe switch cards to a new main module unplug the cables from the backplane If you are replacing the NVMe cables unplug th...

Page 113: ...he cable connectors so that you can install them correctly 3 Remove the NVMe switch card See Remove a NVMe Switch Card on page 113 Remove a NVMe Switch Card 1 Identify which NVMe switch card you want to remove 2 Unlock the card Servicing NVMe Switch Cards 113 ...

Page 114: ...from the card bracket 3 Push the card away from its connector on the motherboard and lift the card out of the main module Install a NVMe Switch Card 1 Align the NVMe switch card with its connector on the motherboard 114 SPARC T7 4 Server Service Manual May 2017 ...

Page 115: ...Install a NVMe Switch Card Note Insert the rear edge of the NVMe switch card into the corresponding tab on the motherboard 2 Insert the card into its connector Servicing NVMe Switch Cards 115 ...

Page 116: ...Install a NVMe Switch Card The card is inserted laterally into the motherboard connector 3 Lock the card Rotate the retention lever toward the card bracket 116 SPARC T7 4 Server Service Manual May 2017 ...

Page 117: ...ain module See Install the Main Module on page 105 Verify a NVMe Switch Card 1 Use the Oracle ILOM fault management shell to determine if the replacement NVMe switch card is shown as enabled or disabled start SP faultmgmt shell Are you sure you want to start SP faultmgmt shell y n y faultmgmtsp fmadm faulty Servicing NVMe Switch Cards 117 ...

Page 118: ...d LEDs are not lit See Front Panel Controls and LEDs on page 29 and Rear Panel Controls and LEDs on page 31 3 Consider these options If the previous steps did not clear the fault see Diagnostics Process on page 26 If Step 1 and Step 2 indicate that no faults have been detected then the NVMe switch card has been replaced successfully No further action is required Related Information Main Module Com...

Page 119: ... authorized service personnel Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Power off the server See Removing Power From the Server on page 50 2 Remove all the hard drives from the front of the server Note the locations of the hard drives before removing them so that you can install t...

Page 120: ...ct the four drive backplane cables from the drive backplane 1 Data cables 2 2 Power cables 2 a Unplug the data cables from the drive backplane b Unplug the power cables from the drive backplane 120 SPARC T7 4 Server Service Manual May 2017 ...

Page 121: ...n module then lift the drive backplane up and remove it from the main module Related Information Install the Drive Backplane on page 121 Install the Drive Backplane Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Take the necessary ESD precautions See Prevent ESD Damage on page 49 2 Pos...

Page 122: ...ath the metal mounting studs on the hard drive assembly 5 Press on the press point on the retaining panel to secure it to the top of the hard drive assembly 6 Connect the drive backplane cables to the drive backplane and the motherboard a Connect the data cable to the drive backplane and the motherboard 122 SPARC T7 4 Server Service Manual May 2017 ...

Page 123: ...nto the server See Install the Main Module on page 105 8 Install the hard drives into the main module Refer to the notes that you took when removing the hard drives to install them back into their original slots See Install a Hard Drive on page 94 9 Power on the server See Returning the Server to Operation on page 201 Related Information Remove the Drive Backplane on page 119 Servicing the Drive B...

Page 125: ...all the Main Module on page 105 5 Verify the replacement SPM Verify the SPM on page 131 Determine if the SPM Is Faulty The following LEDs are illuminated when a SPM fault is detected System Service Required LEDs on the front panel and rear I O module Server SP LED on the main module or rear I O module 1 Determine if the Server Service Required LEDs are illuminated on the front panel or the rear I ...

Page 126: ...that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Back up the SPM configuration information before removing the SPM At the Oracle ILOM prompt type cd SP config dump destination uri target where The acceptable values for uri are tftp ftp sftp scp http https target is the remote location where you want to store the configuration information For exa...
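Before typing the backup command, it can help to confirm that the destination uses one of the listed transfer methods. The Python sketch below is an illustration only. It assumes the destination is written as a method followed by `://` and the remote target. The function name and the sample address are hypothetical.

```python
# Transfer methods listed in the text as acceptable values for uri.
ALLOWED_METHODS = ("tftp", "ftp", "sftp", "scp", "http", "https")


def config_uri(method, target):
    """Build a destination or source string, rejecting unlisted methods."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported transfer method: {method}")
    if not target:
        raise ValueError("target must name a remote location")
    return f"{method}://{target}"  # the '://' separator is an assumption


# Hypothetical remote location, for illustration only.
print(config_uri("tftp", "198.51.100.7/spm-backup"))
```

The same check applies when the saved configuration is loaded back, because the load step accepts the same list of methods.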

Page 127: ...3 Remove the main module from the server See Remove the Main Module on page 101 4 Locate the SPM on the main module See Main Module Components on page 20 5 Grasp the SPM by the two grasp points and lift up to disengage the SPM from the connectors on the motherboard Servicing the SPM 127 ...

Page 128: ...he SPM on page 128 Verify the SPM on page 131 Install the SPM Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Take the necessary ESD precautions See Prevent ESD Damage on page 49 128 SPARC T7 4 Server Service Manual May 2017 ...

Page 129: ... to the SER MGT port If the replacement SPM detects that the SPM firmware is not compatible with the existing host firmware further action is suspended and the following message is delivered over the SER MGT port Unrecognized Chassis This module is installed in an unknown or unsupported chassis You must upgrade the firmware to a newer version that supports this chassis If you see this message go o...

Page 130: ... configuration information that you backed up earlier At the Oracle ILOM prompt type cd SP config load source uri target where The acceptable values for uri are tftp ftp sftp scp http https target is the remote location where you stored the configuration information For example load source tftp 129 99 99 99 pathname 8 If TPM was initialized on the replaced SPM complete these steps a Reinitialize T...

Page 131: ...ne GMT GMT uptime 0 days 00 01 18 usentpserver disabled a Set the datetime property if necessary set SP clock datetime MMDDhhmmYYYY b Set the timezone property if necessary set SP clock timezone 3 to 4 characters where the timezone value equals a three or four character timezone abbreviation such as EST Related Information Determine if the SPM Is Faulty on page 125 Remove the SPM on page 126 Verif...
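The MMDDhhmmYYYY value for the datetime property is easy to mistype, so it can be generated instead. A minimal Python sketch follows, assuming a 24-hour clock. The function name is hypothetical.

```python
from datetime import datetime


def ilom_datetime(moment):
    """Format a datetime as MMDDhhmmYYYY (month, day, hour, minute, year)."""
    return moment.strftime("%m%d%H%M%Y")


# 22 August 2016 at 13:20 becomes 082213202016.
print(ilom_datetime(datetime(2016, 8, 22, 13, 20)))
```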

Page 132: ... did not clear the fault see Diagnostics Process on page 26 If the previous steps indicate that no faults have been detected then the SPM has been replaced successfully No further action is required Related Information Determine if the SPM Is Faulty on page 125 Remove the SPM on page 126 Install the SPM on page 128 132 SPARC T7 4 Server Service Manual May 2017 ...

Page 133: ...dule on page 101 2 Replace the SCC PROM Remove the SCC PROM on page 133 Install the SCC PROM on page 134 3 Install the main module Install the Main Module on page 105 4 Verify the SCC PROM Verify the SCC PROM on page 136 Remove the SCC PROM The SCC PROM is a cold service component that can be replaced only by authorized service personnel To identify and locate the SCC PROM see Processor Module Com...

Page 134: ...See Main Module Components on page 20 4 Grasp the SCC PROM and lift it up to remove it from the main module Related Information Install the SCC PROM on page 134 Install the SCC PROM Before beginning this procedure ensure that you are familiar with the cautions and safety instructions described in Safety Information on page 43 134 SPARC T7 4 Server Service Manual May 2017 ...

Page 135: ...CC PROM properly onto the main module 2 Press down on the SCC PROM until it is completely seated on the main module 3 Insert the main module back into the server See Install the Main Module on page 105 4 Return the server to operation See Returning the Server to Operation on page 201 Related Information Remove the SCC PROM on page 133 Verify the SCC PROM on page 136 Servicing the SCC PROM 135 ...

Page 136: ...dditional verification run specific commands to display data stored in the SCC PROM Use the Oracle ILOM show command to display the MAC address show HOST macaddress HOST Properties macaddress Use Oracle Solaris OS commands to display the hostid and Ethernet address hostid 8534299c ifconfig a lo0 flags 2001000849 UP LOOPBACK RUNNING MULTICAST IPv4 VIRTUAL mtu 8232 index 1 inet 127 0 0 1 netmask ff0...

Page 137: ...y 1 Prepare the host for battery replacement To correctly reset the date and time before replacing a battery you must prevent the host from automatically powering on and disable any NTP connections a Check the HOST_AUTO_POWER_ON property show SP policy HOST_AUTO_POWER_ON Properties HOST_AUTO_POWER_ON enabled b If enabled set the HOST_AUTO_POWER_ON property to disabled set SP policy HOST_AUTO_POWER_...

Page 138: ...ocedure a Prepare the server for service b Remove the main module from the server See Remove the Main Module on page 101 c Locate the battery in the main module See Main Module Components on page 20 d Remove the old battery Pinch the battery between two fingers and slide it up and out of the battery holder e Unpack and install the new battery 138 SPARC T7 4 Server Service Manual May 2017 ...

Page 139: ... Module on page 105 g Return the Server to Operation 3 Reset the system clock a Use the Oracle ILOM clock command to reset the system clock The following example sets the date to August 22 2016 and the timezone to EDT set SP clock datetime 082213202016 timezone EDT Set datetime to 082213202016 set timezone to EDT show d properties SP clock Properties datetime Mon Aug 22 13 20 16 2016 Servicing the ...

Page 140: ...ts on page 25 Verify the Battery 1 Run show SYS MB V_BAT to check the status of the system battery In the output the SYS MB BAT status should be OK as in the following example show SYS MB BAT Targets Properties type Battery ipmi_name MB BAT class Threshold Sensor value 3 140 Volts upper_nonrecov_threshold N A upper_critical_threshold N A upper_noncritical_threshold N A lower_noncritical_threshold ...
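If the sensor output is captured as text, the voltage reading can be extracted for logging. The sketch below is illustrative. It assumes each property appears on its own line as `name = value`, which the flattened example above does not show. The helper names are hypothetical, and no pass or fail limit is applied because the threshold values are not given here.

```python
def parse_properties(text):
    """Collect 'name = value' lines into a dictionary."""
    props = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            props[name.strip()] = value.strip()
    return props


def battery_volts(props):
    """Return the 'value' property as a float, or None if it is not in Volts."""
    number, _, unit = props.get("value", "").partition(" ")
    if unit.strip().lower() != "volts":
        return None
    try:
        return float(number)
    except ValueError:
        return None


# Sample modeled on the example above; the 'name = value' layout is assumed.
capture = """
type = Battery
class = Threshold Sensor
value = 3.140 Volts
upper_critical_threshold = N/A
"""

print(battery_volts(parse_properties(capture)))
```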

Page 141: ...Verify the Battery Related Information Replace the Battery on page 137 Servicing the Battery 141 ...

Page 143: ...all the Front I O Assembly on page 145 Remove the Front I O Assembly The front I O assembly is a cold service component that can be replaced only by authorized service personnel For the location of this component see Main Module Components on page 20 Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause electronic componen...

Page 144: ...e connector aside to access the captive screw that secures the assembly to the motherboard panel 2 Loosen the screw to release the lower end of the assembly c Gently pull the front I O assembly toward the back of the main module until the ports at the front of the assembly clear the front of the main module and then remove the front I O assembly from the main module panel 3 144 SPARC T7 4 Server S...

Page 145: ...ssary ESD precautions See Prevent ESD Damage on page 49 2 Insert the front I O assembly into position in the main module a Gently slide the front I O assembly into position with the ports inserted into the port holes in the front of the main module panel 1 b Lower the rear of the front I O assembly so that the captive screw is aligned with the screw hole on the motherboard and tighten the screw pa...

Page 146: ...Install the Front I O Assembly Related Information Main Module Components on page 20 Remove the Front I O Assembly on page 143 146 SPARC T7 4 Server Service Manual May 2017 ...

Page 147: ...andle components that are sensitive to electrostatic discharge This discharge can cause failure of server components These topics describe service procedures for the power supplies in the server Power Supply Configuration on page 147 Power Supply and AC Power Connector LEDs on page 150 Determine Which Power Supply Is Faulty on page 151 Remove a Power Supply on page 151 Install a Power Supply on pa...

Page 148: ...r Supply Configuration 1 Power supply 0 PS0 2 Power supply 1 PS1 3 Power supply 2 PS2 4 Power supply 3 PS3 Power cords are accessed from the rear of the server 148 SPARC T7 4 Server Service Manual May 2017 ...

Page 149: ...or power supply 1 PS1 4 Connector for power supply 0 PS0 Related Information Power Supply and AC Power Connector LEDs on page 150 Determine Which Power Supply Is Faulty on page 151 Remove a Power Supply on page 151 Install a Power Supply on page 153 Verify a Power Supply on page 156 Servicing Power Supplies 149 ...

Page 150: ...erver detects a power supply fault 2 OK green Lights when the power supply DC voltage from the PSU to the server is within tolerance 3 AC Present green AC Lights when AC voltage is applied to the power supply Each AC power connector has a single LED that is located on the rear I O module See Interpreting LEDs on page 27 Related Information Power Supply Configuration on page 147 Determine Which Pow...

Page 151: ...and AC Power Connector LEDs on page 150 The amber Service Required LED is lit on the power supply that needs to be replaced 3 Remove the faulty power supply See Remove a Power Supply on page 151 Related Information Power Supply Configuration on page 147 Power Supply and AC Power Connector LEDs on page 150 Remove a Power Supply on page 151 Install a Power Supply on page 153 Verify a Power Supply on...

Page 152: ...the server and locate the AC power connector at the rear of the server that supplies power to the faulty power supply See Power Supply Configuration on page 147 3 Disconnect that power cord 4 Go to the front of the server and on the power supply to be removed squeeze the release latches together then pull the extraction lever toward you to disengage the power supply from the server 152 SPARC T7 4 ...

Page 153: ...pply Is Faulty on page 151 Install a Power Supply on page 153 Verify a Power Supply on page 156 Install a Power Supply The power supply is a hot service component that can be replaced by a customer Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Open the latch on the replacement power s...

Page 154: ...Install a Power Supply Verify that the power supply is oriented as shown in the following figure 2 Slide the power supply into the chassis 154 SPARC T7 4 Server Service Manual May 2017 ...

Page 155: ...he power supply that you just installed 5 Verify the power supply See Verify a Power Supply on page 156 Related Information Power Supply Configuration on page 147 Power Supply and AC Power Connector LEDs on page 150 Determine Which Power Supply Is Faulty on page 151 Remove a Power Supply on page 151 Verify a Power Supply on page 156 Servicing Power Supplies 155 ...

Page 156: ...ese options If the previous steps did not clear the fault see Diagnostics Process on page 26 If Step 1 and Step 2 indicate that no faults have been detected then the power supply has been replaced successfully No further action is required Related Information Power Supply Configuration on page 147 Power Supply and AC Power Connector LEDs on page 150 Determine Which Power Supply Is Faulty on page 1...

Page 157: ...our or five fan modules are operational These topics describe service procedures for the fan modules in the server Fan Module Configuration on page 157 Fan Module LED on page 158 Determine Which Fan Module Is Faulty on page 158 Remove a Fan Module on page 159 Install a Fan Module on page 161 Verify a Fan Module on page 162 Fan Module Configuration Servicing Fan Modules 157 ...

Page 158: ... page 159 Install a Fan Module on page 161 Verify a Fan Module on page 162 Fan Module LED Each fan module has a single Service Required LED Determine Which Fan Module Is Faulty The following LEDs are illuminated when a fan module fault is detected System Service Required LEDs on the front panel and rear I O module Server Fan Fail LED on the front panel 158 SPARC T7 4 Server Service Manual May 2017...

Page 159: ...e Remove a Fan Module on page 159 Related Information Remove a Fan Module on page 159 Install a Fan Module on page 161 Verify a Fan Module on page 162 Remove a Fan Module The fan module is a hot service component that can be replaced by a customer Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server comp...

Page 160: ...ver before removing a fan module If you can remove a fan module with the server running go to Step 3 If you cannot remove a fan module with the server running see Removing Power From the Server on page 50 to power down the server before continuing 3 Press the green button to disengage the fan module from the chassis 160 SPARC T7 4 Server Service Manual May 2017 ...

Page 161: ...e on page 161 DIMM Configuration Errors on page 72 Install a Fan Module The fan module is a hot service component that can be replaced by a customer Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Insert the fan module into the empty fan module slot Servicing Fan Modules 161 ...

Page 162: ...01 to power on the server again 3 Verify the fan module functionality See Verify a Fan Module on page 162 Related Information Determine Which Fan Module Is Faulty on page 158 Remove a Fan Module on page 159 Verify a Fan Module on page 162 Verify a Fan Module 1 Ensure that you have completed the following Applied power to the server See Connect Power Cords on page 201 Powered on the server 162 SPAR...

Page 163: ...he actions described in Diagnostics Process on page 26 3 Log in to Oracle ILOM See Log In to Oracle ILOM Service on page 32 4 Start the faultmgmt shell start SP faultmgmt shell Are you sure you want to start the faultmgmt shell y n y faultmgmtsp 5 Use the fmadm faulty command to check for faults If faults are reported see Diagnostics Process on page 26 If no faults are reported then the fan module...

Page 165: ...age 170 Remove a PCIe Card Carrier on page 171 Remove a PCIe Card on page 174 Install a PCIe Card on page 177 Install a PCIe Card Carrier on page 180 Verify a PCIe Card on page 181 Understanding PCIe Root Complex Connections All 16 PCIe slots support PCIe cards with the following characteristics Hot plug low profile adapters x8 Gen1 x8 Gen2 and x8 Gen3 cards In addition the following PCIe slots su...

Page 166: ...th the root complex Understanding the relationship of the PCIe root complexes to the PCIe I O fabrics will help you properly assign devices when configuring Oracle VM Server for SPARC logical domains This diagram illustrates the root complex connections between the four CPUs and the 16 PCIe I O slots Each CPU supports all I O root complex fabrics In single PM configurations all 166 SPARC T7 4 Serv...

Page 167: ...pci 1 MR1 3 0 x16 Slot 8 pci 30a pci 1 MR1 2 0 x16 Slot 9 pci 30b pci 2 MR2 0 1 x8 Slot 10 pci 30b pci 1 MR2 0 0 x8 Slot 11 pci 30c pci 1 MR2 3 0 x16 Slot 12 pci 30d pci 1 MR3 1 0 x16 Slot 13 pci 30e pci 2 MR3 0 1 x8 Slot 14 pci 30e pci 1 MR3 0 0 x8 Slot 15 pci 30f pci 1 MR3 3 0 x16 Slot 16 pci 310 pci 1 MR3 2 0 x16 If you are reviewing root complex changes after adding a second processor module r...

Page 168: ... efficient For example you may distribute PCIe cards evenly across available root complexes If you are reviewing PCIe installation order after adding a second processor module return to Server Upgrade Process on page 56 Related Information System Schematic on page 41 Understanding PCIe Root Complex Connections on page 165 PCIe Carrier Handle and LEDs on page 169 Determine Which PCIe Card Is Faulty...

Page 169: ...ss this button again to bring the PCIe card online 3 OK green Indicates the following conditions Off The server is powered off or the PCIe card is not operating You can remove the PCIe card or install a new card On The PCIe card is connected and online Do not insert or remove the card Blinking The PCIe card is powering up or powering down Do not insert or remove the card Servicing PCIe Cards 169 ...
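The three OK LED conditions map directly to whether a card may be inserted or removed. The following Python sketch records that mapping. It is an illustration only, and the function name and the state spellings are hypothetical.

```python
# OK LED state -> whether the text allows inserting or removing the card.
OK_LED_ALLOWS_SERVICE = {
    "off": True,        # powered off or card not operating
    "on": False,        # card is connected and online
    "blinking": False,  # card is powering up or powering down
}


def may_insert_or_remove(ok_led_state):
    """Return True only when the OK LED state permits handling the card."""
    key = ok_led_state.strip().lower()
    if key not in OK_LED_ALLOWS_SERVICE:
        raise ValueError(f"unknown OK LED state: {ok_led_state}")
    return OK_LED_ALLOWS_SERVICE[key]


print(may_insert_or_remove("Blinking"))
```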

Page 170: ... on page 170 Remove a PCIe Card Carrier on page 171 Remove a PCIe Card on page 174 Install a PCIe Card on page 177 Install a PCIe Card Carrier on page 180 Verify a PCIe Card on page 181 Determine Which PCIe Card Is Faulty The following LEDs are illuminated when a fault is detected System Service Required LEDs on the front panel and rear I O module System PCIe Fault LED on the front panel Service R...

Page 171: ...he SPARC T7 4 server supports single wide and double wide card carriers The removal steps are the same for both carrier types This topic includes illustrations only for the single wide carrier Note If you are installing a PCIe card that requires a double wide carrier you must remove two adjacent PCIe card carriers Caution This procedure requires that you handle components that are sensitive to ele...

Page 172: ...e the hotplug command to bring the card offline. a. List all devices in the device tree, including PCIe cards: hotplug list -cv. This command lists dynamically reconfigurable hardware resources and shows their operational status. In this case, look for the status of the PCIe card you plan to remove. This information is listed in the State column. For example: hotplug list -cv Connection State Description ____...

Page 173: ... al This command lists dynamically reconfigurable hardware resources and shows their operational status. In this case, look for the status of the PCIe card you plan to remove. This information is listed in the Occupant column. For example:
Ap_id      Type      Receptacle    Occupant      Condition
PCI EM0    sas hp    connected     configured    ok
PCI EM1    sas hp    connected     configured    ok
b. Take the PCIe card offline: cfgadm c disconne...
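
Checking the Occupant column before taking a card offline can be sketched in a few lines. This is a minimal illustration, not an Oracle-provided script: the sample text is modeled on the listing on this page, and the punctuation in the Ap_id and Type values ("PCI-EM0", "sas/hp") is an assumption, because the extracted page lost the original characters. The helper names are hypothetical.

```python
# ASSUMPTION: sample shaped like the cfgadm listing on this page. The
# "PCI-EM0" and "sas/hp" spellings are guesses at the lost punctuation.
SAMPLE = """\
Ap_id                Type         Receptacle   Occupant     Condition
PCI-EM0              sas/hp       connected    configured   ok
PCI-EM1              sas/hp       connected    configured   ok
"""


def parse_cfgadm(text):
    """Map each Ap_id to a dict of its columns, keyed by the header names."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip rows that do not fit the five-column layout
        row = dict(zip(header, fields))
        rows[row["Ap_id"]] = row
    return rows


def disconnect_command(ap_id):
    """Build (but do not run) a cfgadm disconnect command for one Ap_id."""
    return ["cfgadm", "-c", "disconnect", ap_id]
```

The sketch only builds the argument list; running it on a server is left to the operator, who should first confirm the Ap_id against real `cfgadm` output.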

Page 174: ... 169 Determine Which PCIe Card Is Faulty on page 170 Remove a PCIe Card on page 174 Install a PCIe Card on page 177 Install a PCIe Card Carrier on page 180 Verify a PCIe Card on page 181 Remove a PCIe Card Caution This procedure requires that you handle components that are sensitive to electrostatic discharge This discharge can cause failure of server components 1 Ensure that you have already take...

Page 175: ...Remove a PCIe Card. 2. Unlatch and open the PCIe card carrier top cover. ...

Page 176: ...ion Understanding PCIe Root Complex Connections on page 165 PCIe Card Configuration on page 168 PCIe Carrier Handle and LEDs on page 169 Determine Which PCIe Card Is Faulty on page 170 Remove a PCIe Card Carrier on page 171 Install a PCIe Card on page 177 Install a PCIe Card Carrier on page 180 Verify a PCIe Card on page 181 176 SPARC T7 4 Server Service Manual May 2017 ...

Page 177: ...ge can cause failure of server components. 1. Determine your first step. If you are installing a new PCIe card and need an empty PCIe card carrier, see Remove a PCIe Card Carrier on page 171. If you are replacing a faulty PCIe card and have already removed its carrier from the server, go to Step 2. 2. Remove the PCIe card from its packaging. ...

Page 178: ...ier's connector. Caution: Do not twist or turn the PCIe card as you insert it into the PCIe card carrier. Ensure that the PCIe card's connector is fully seated in the PCIe card carrier's slot and that the notch in the PCIe card's rear bulkhead is seated around the PCIe carrier's alignment tab. ...

Page 179: ...mplex Connections on page 165 PCIe Card Configuration on page 168 PCIe Carrier Handle and LEDs on page 169 Determine Which PCIe Card Is Faulty on page 170 Remove a PCIe Card Carrier on page 171 Remove a PCIe Card on page 174 Install a PCIe Card Carrier on page 180 Verify a PCIe Card on page 181 Servicing PCIe Cards 179 ...

Page 180: ...er be powered off or booted into the Oracle Solaris OS. 1. Insert the PCIe card carrier into the card cage until it stops. Caution: Do not press on the PCIe back panel or force the PCIe card carrier into the card cage. 2. Close the PCIe carrier handle. Rotate the handle up until it latches into place. 3. Reconnect the cables to the PCIe card. 4. Determine your next step. If you replaced or installed a PCIe ca...

Page 181: ... Oracle Solaris cfgadm command: cfgadm -c connect Ap_id, where Ap_id is the ID of the card that you want to connect. Related Information: Understanding PCIe Root Complex Connections on page 165; PCIe Card Configuration on page 168; PCIe Carrier Handle and LEDs on page 169; Determine Which PCIe Card Is Faulty on page 170; Remove a PCIe Card Carrier on page 171; Remove a PCIe Card on page 174; Install a PCIe C...

Page 182: ...Description ______________________________________________________________________________ PCIE1 EMPTY PCIe Native PCIE7 ENABLED PCIe Native Device Usage ___________________________________________________________________________ SUNW qlc 0 fp disk fp 0 0 SUNW qlc 0 1 fp disk fp 0 0 PCIE13 EMPTY PCIe Native PCIE15 EMPTY PCIe Native Related Information Understanding PCIe Root Complex Connections on...
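
In the listing on this page, the populated slot (PCIE7) shows ENABLED while the empty slots show EMPTY. A minimal sketch of reading that State column follows. The sample text is an assumption modeled on the listing (the original spacing and punctuation were lost in extraction, so "PCIe-Native" and the one-row-per-line layout are guesses), and the function name is hypothetical.

```python
import re

# ASSUMPTION: sample modeled on the listing on this page; the layout and
# the "PCIe-Native" spelling are guesses at the original formatting.
SAMPLE = """\
Connection State Description
PCIE1 EMPTY PCIe-Native
PCIE7 ENABLED PCIe-Native
PCIE13 EMPTY PCIe-Native
PCIE15 EMPTY PCIe-Native
"""

# One row: connection name, state word, free-text description.
ROW = re.compile(r"^(PCIE\d+)\s+(\S+)\s+(.+)$")


def connection_states(text):
    """Return {connection: state} for every line that looks like a slot row."""
    states = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            states[match.group(1)] = match.group(2)
    return states
```

A verification script could then check that the slot just serviced reports the expected state, for instance `connection_states(output).get("PCIE7") == "ENABLED"`.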

Page 183: ...sconnect Power Cords on page 53 These topics describe service procedures for the rear I O module in the server Rear I O Module LEDs on page 183 Determine if the Rear I O Module Is Faulty on page 186 Remove the Rear I O Module on page 186 Install the Rear I O Module on page 188 Verify the Rear I O Module on page 190 Rear I O Module LEDs The LEDs on the rear I O module give server status information...

Page 184: ...owing conditions. Blinking: A link is established. Off: No link is established. 4. NET link and activity (green): Indicates the following conditions. On: A link is established. Blinking: Transfer activity is present on the link. Off: No link is established. 5. NET speed (amber/green): Indicates the following conditions. Green on: The link is operating as a 100 Mbps connection. Off: There is no link. 6. AC1 connector LED am...

Page 185: ...red on and is running in its normal operating state. No service actions are required. Fast blink: Server is running in standby mode and can be quickly returned to full function. Slow blink: A normal but transitory activity is taking place. Slow blinking might indicate that server diagnostics are running or the server is booting. 10. Service Processor LED (SP): Indicates the following conditions. Off: The AC po...

Page 186: ...rmation Rear I O Module LEDs on page 183 Remove the Rear I O Module on page 186 Install the Rear I O Module on page 188 Verify the Rear I O Module on page 190 Remove the Rear I O Module The rear I O module is a cold service component that can be replaced by a customer 1 Take the necessary ESD precautions See Prevent ESD Damage on page 49 2 Locate the failed rear I O module See Rear Panel Component...

Page 187: ...abel the cables connected to the ports on the rear I/O module, and then disconnect the cables from the ports. You will reconnect the cables to the same ports on the replacement rear I/O module. 6. Press the green buttons on the rear I/O module ejection levers and spread the levers open to eject the rear I/O module. ...

Page 188: ... on page 43 Rear I O Module LEDs on page 183 Determine if the Rear I O Module Is Faulty on page 186 Install the Rear I O Module on page 188 Verify the Rear I O Module on page 190 Install the Rear I O Module 1 Take the necessary ESD precautions See Prevent ESD Damage on page 49 188 SPARC T7 4 Server Service Manual May 2017 ...

Page 189: ...Install the Rear I/O Module. 2. With the levers in the extended position, insert the rear I/O module into the slot at the rear of the server. ...

Page 190: ...rver See Returning the Server to Operation on page 201 7 Verify the rear I O installation See Verify the Rear I O Module on page 190 Related Information Rear I O Module LEDs on page 183 Determine if the Rear I O Module Is Faulty on page 186 Remove the Rear I O Module on page 186 Verify the Rear I O Module on page 190 Returning the Server to Operation on page 201 Verify the Rear I O Module 1 Ensure...

Page 191: ...mt shell. Are you sure you want to start the faultmgmt shell (y/n)? y faultmgmtsp> 5. Use the fmadm faulty command to determine if the server is operating normally. If a fault was detected, see Diagnostics Process on page 26. If no faults were detected, then the rear I/O module has been replaced successfully. No further action is required. Related Information: Detecting and Managing Faults on page 25; Rear I O ...

Page 193: ...omponent See Disconnect Power Cords on page 53 Rear Chassis Subassembly Components on page 193 Remove the Rear Chassis Subassembly on page 194 Install the Rear Chassis Subassembly on page 197 Verify the Rear Chassis Subassembly on page 198 Related Information Identifying Components on page 13 Detecting and Managing Faults on page 25 Preparing for Service on page 43 Returning the Server to Operatio...

Page 194: ...Rear chassis subassembly chassis Related Information Remove the Rear Chassis Subassembly on page 194 Install the Rear Chassis Subassembly on page 197 Remove the Rear Chassis Subassembly 1 Verify that the rear chassis subassembly needs to be replaced 194 SPARC T7 4 Server Service Manual May 2017 ...

Page 195: ...llers see Remove a PCIe Card Carrier on page 171 Make note of the slots for each carrier or filler panel so that you can install them into the same slots Rear I O module see Remove the Rear I O Module on page 186 You will install these components into the replacement rear chassis subassembly once you have replaced the faulty subassembly 5 Go to the front of the server and remove the following comp...

Page 196: ...assembly. 7. Using a Phillips screwdriver, loosen the five screws that secure the rear chassis subassembly to the system chassis. 8. Slide the rear chassis subassembly out and away from the server. Related Information: Install the Rear Chassis Subassembly on page 197. ...

Page 197: ...tighten the eight green screws that secure the rear chassis subassembly in the server. Tighten the screws in the following order: a. Lower right screw. b. Upper left screw. c. Upper right screw. d. Lower left screw. 3. Remove the connector covers from the replacement rear chassis subassembly. 4. Install the following components: ...

Page 198: ...ook when removing the cards from the slots earlier All five fan modules see Install a Fan Module on page 161 6 Connect the power cords See Connect Power Cords on page 201 7 Power on the server See Returning the Server to Operation on page 201 8 Verify the rear chassis subassembly See Verify the Rear Chassis Subassembly on page 198 Related Information Remove the Rear Chassis Subassembly on page 194...

Page 199: ...ing normally If a fault was detected see Diagnostics Process on page 26 If no faults were detected then the rear chassis subassembly has been replaced successfully No further action is required Related Information Detecting and Managing Faults on page 25 Rear I O Module LEDs on page 183 Remove the Rear Chassis Subassembly on page 194 Install the Rear Chassis Subassembly on page 197 Servicing the R...

Page 201: ... Service Task Reference on page 22. Related Information: Identifying Components on page 13; Detecting and Managing Faults on page 25; Preparing for Service on page 43. Connect Power Cords. Note: Standby power is applied as soon as the power cords are connected. Depending on how the firmware is configured, the server might boot automatically. 1. Locate the AC connectors on the rear of the server. See Rear Pane...

Page 202: ...nd
1. Check the server power state. Type:
-> show /System power_state
/System
Properties:
power_state = Off
2. If the server is powered off, power on the server. Type:
-> start /System
Starting /System
3. (Optional) To view server boot output, start a host console stream. Type:
-> start /HOST/console
4. If you are adding a second processor module, return to Server Upgrade Process on page 56.
Related Information
Connect Power Cord...
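
The check-then-start logic of steps 1 and 2 can be sketched as follows. This is an illustration under stated assumptions, not an Oracle utility: the sample reply is modeled on this page, its exact punctuation in real Oracle ILOM output may differ, and the function names are hypothetical. How the text is captured from the service processor is outside the sketch.

```python
import re

# ASSUMPTION: reply text modeled on the power_state output on this page;
# real Oracle ILOM output may be formatted differently.
SAMPLE = """\
/System
  Properties:
    power_state = Off
"""


def power_state(text):
    """Extract the power_state value, tolerating a missing '=' sign."""
    match = re.search(r"power_state\s*=?\s*(\S+)", text)
    return match.group(1) if match else None


def next_ilom_command(text):
    """Return the step 2 command when the server reports Off, else None."""
    return "start /System" if power_state(text) == "Off" else None
```

Returning `None` for any state other than `Off` keeps the sketch conservative: it never suggests a power action when the reply is missing or unrecognized.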

Page 203: ... reference AC power connectors 147 DIMMs 70 hard drives 87 power supplies 147 configuring how POST runs 38 customer replaceable components CRUs 46 D DIMMs addresses 73 configuration errors 72 configuration reference 70 fault handling 74 identifying 71 installing 80 locating 19 locating faulty using DIMM Fault Remind button 76 using Oracle ILOM 74 using PSH 75 NAC names 73 rank classification 71 re...

Page 204: ...fan modules 161 front I O assembly 145 hard drives 94 main module 105 NVMe switch cards 114 PCIe cards 177 PCIe carriers 180 power supplies 153 processor modules 64 rear chassis subassembly 197 rear I O module 188 SCC PROM 134 SPM 128 K Knowledge Base 25 Knowledge Base articles 33 L LEDs AC power connectors 150 front panel 29 hard drives 89 NET Link and Activity 31 Net Management Link and Activity...

Page 205: ...vity LED 31 Net Management Link and Activity LED 31 Net Management Speed LED 31 NET Speed LED 31 NVMe switch cards installing 114 removing 113 servicing 111 verifying 117 O Oracle Solaris OS files and commands 35 Oracle Solaris PSH checking for faults 33 clearing faults 40 memory faults 74 overview 25 Oracle VTS 26 P PCIe cards installation order 168 installing 177 locating faulty 170 removing 174...

Page 206: ...finding faulty 186 installing 188 LEDs 31 183 locating 16 removing 186 verifying 190 198 removing DIMMs 78 drive backplanes 119 fan modules 159 front I O assembly 143 hard drives 90 main module 101 NVMe switch cards 113 PCIe cards 174 PCIe carriers 171 power supplies 151 processor modules 60 rear chassis subassembly 194 rear I O module 186 SCC PROM 133 SPM 126 S safety information and symbols 43 S...

Page 207: ...m Power button 29 System Power OK LED 29 31 system schematic 41 System Service Required LED 29 31 T tools needed for service 45 U upgrading the server 56 using Oracle VTS 26 UUID 33 V var adm messages file 36 verifying battery 140 DIMMs 83 fan modules 162 hard drives 95 main module 108 NVMe switch cards 117 PCIe cards 181 power supplies 156 processor modules 67 rear I O module 190 198 SCC PROM 136...
