SPARC T3-2 Server

Service Manual

Part No.: E20793-05
December 2013

Summary of Contents for SPARC T3-2

Page 1: ...SPARC T3 2 Server Service Manual Part No E20793 05 December 2013 ...

Page 2: ...are licensed and subject to restrictions on use and disclosure. Except as provided in your license agreement or by law, you may not copy, reproduce, translate, broadcast, modify, patent, transmit, distribute, exhibit, perform, publish, or display the software, even in part, in any form or by any means. Furthermore, it is prohibited to ...

Page 3: ... Components 8 I O Components 10 Power Distribution and Fan Module Components 12 Detecting and Managing Faults 15 Diagnostics Overview 15 Diagnostics Process 16 Interpreting Diagnostic LEDs 19 Front and Rear Panel System Controls and LEDs 20 Ethernet and Network Management Port LEDs 21 Managing Faults With ILOM 22 ILOM Troubleshooting Overview 22 Access the Service Processor Oracle ILOM 24 Display ...

Page 4: ...eting Log Files and System Messages 37 Check the Message Buffer 37 View the System Message Log Files 37 Using the Oracle Solaris Predictive Self Healing Feature 38 Oracle Solaris Predictive Self Healing Technology Overview 39 PSH Detected Fault Example 40 Check for PSH Detected Faults 40 Clear PSH Detected Faults 42 Running POST 45 POST Overview 45 ILOM Properties that Affect POST Behavior 46 Conf...

Page 5: ... Serial Number 64 Locate the Server 65 Understanding Component Replacement Categories 66 Hot Service Replaceable by Customer 66 Cold Service Replaceable by Customer 67 Cold Service Replaceable by Authorized Service Personnel 67 Shut Down the Server 68 Removing Power From the System 69 Power Off the Server Service Processor Command 69 Power Off the Server Power Button Standby Mode 70 Power Off the ...

Page 6: ...ew 82 Locate a Faulty Hard Disk Drive 83 Remove a Hard Disk Drive Filler Panel 84 Remove a Hard Disk Drive 86 Install a Hard Disk Drive 88 Install a Hard Disk Drive Filler Panel 90 Verify Hard Disk Drive Functionality 91 Servicing Fan Modules 93 Fan Module Overview 94 Locate a Faulty Fan Module 95 Remove a Fan Module 96 Install a Fan Module 98 Verify Fan Module Functionality 99 Servicing Power Sup...

Page 7: ...IMM Fault Remind Button 117 Locate a Faulty DIMM show faulty Command 120 Remove a Memory Riser 120 Install a Memory Riser 121 Remove a DIMM 122 Install a DIMM 124 Increase Server Memory With Additional DIMMs 126 Remove a Memory Riser Filler Panel 128 Install a Memory Riser Filler Panel 129 Verify DIMM Functionality 130 DIMM Configuration Error Messages 132 DIMM Configuration Errors System Console ...

Page 8: ...d Filler Panel 148 Remove a PCIe Card 150 Install a PCIe Card 151 Cable an Internal SAS HBA PCIe Card 153 Install a PCIe Card Filler Panel 155 Servicing the Fan Board 157 Remove the Fan Board 157 Install the Fan Board 159 Verify Fan Board Functionality 161 Servicing the Motherboard 163 Motherboard Overview 164 Remove the Motherboard 164 Install the Motherboard 167 Reactivate RAID Volumes 170 Verif...

Page 9: ... Verify Hard Disk Drive Backplane Functionality 187 Servicing the Power Supply Backplane 189 Remove the Power Supply Backplane 189 Install the Power Supply Backplane 191 Verify Power Supply Backplane Functionality 193 Returning the Server to Operation 195 Install the Top Cover 196 Return the Server to the Normal Rack Position 197 Connect Power Cords to the Server 198 Power On the Server 198 Glossa...

Page 11: ...ing hardware Related Documentation on page xi Feedback on page xii Support and Accessibility on page xii Related Documentation Documentation Links All Oracle products http://www.oracle.com/documentation SPARC T3-2 server http://www.oracle.com/pls/topic/lookup?ctx=E19166-01 Oracle ILOM 3.0 http://www.oracle.com/pls/topic/lookup?ctx=ilom30 Oracle Solaris OS and other systems software http://www.oracle.com/t...

Page 12: ...e.com/goto/docfeedback Support and Accessibility Description Links Access electronic support through My Oracle Support http://support.oracle.com For hearing impaired http://www.oracle.com/accessibility/support.html Learn about Oracle's commitment to accessibility http://www.oracle.com/us/corporate/accessibility/index.html ...

Page 13: ...System Cables on page 5 Internal Components on page 5 Exploring Parts of the System on page 7 Related Information Detecting and Managing Faults on page 15 Preparing for Service on page 61 Front Components The following figure shows the layout of the server front panel including the power and system locator buttons and the various status and fault LEDs Note The front panel also provides access to i...

Page 14: ... SATA DVD drive optional 3 Power OK LED green 12 Hard disk drive 0 optional 4 Power button 13 Hard disk drive 1 optional 5 SP OK Fault LED green amber 14 Hard disk drive 2 optional 6 Service Action Required LEDs 3 for Fan Module FAN Processor CPU and Memory amber 15 Hard disk drive 3 optional 7 Power Supply PS Fault Service Action Required LED amber 16 Hard disk drive 4 optional 8 Over Temperature...

Page 15: ...r DC OK (green), AC OK (green or amber) 8 Network (NET) 10/100/1000 ports (NET0-NET3) 2 Power supply unit 0 AC inlet 9 USB 2.0 connectors (2) 3 Power supply unit 1 status indicator LEDs: Service Action Required (amber), DC OK (green), AC OK (green or amber) 10 PCIe card slots 5-9 4 Power supply unit 1 AC inlet 11 DB-15 video connector 5 System status LEDs: Power OK (green), Attention (amber), Locate (white) 12 Serial manageme...

Page 16: ... the motherboard via a bus bar and ribbon cable and supports a top cover safety interlock kill switch Note When a PDB is replaced the chassis serial number and part number must be programmed by trained service personnel Power supply backplane This board carries 12V power from the power supplies to the PDB over a pair of bus bars The power supplies connect directly to the PDB Fan power board This b...

Page 17: ...connection is broken which causes the server to power down Power supply backplane signal cable 1 ribbon cable This cable carries signals between the power supply backplane and the power distribution board Motherboard signal cable 1 ribbon cable This cable carries signals between the power distribution board and the motherboard Hard drive data cables 2 bundled These cables carry data and control si...

Page 18: ...cations Item Component Item Component 1 Motherboard 7 System lithium battery 2 Low profile PCIe cards 8 Memory risers 3 Power supplies 9 Fan board 4 Power supply backplane and cover 10 Fan modules 5 Hard disk drive backplane 11 DVD drive 6 Service processor 12 Hard disk drives HDDs ...

Page 19: ...e components that can be individually replaced These components are grouped into three functional categories Motherboard Components on page 8 I O Components on page 10 Power Distribution and Fan Module Components on page 12 Related Information Front Components on page 1 Rear Components on page 3 ...

Page 20: ...oard Components The following figure illustrates replaceable components related to the motherboard FIGURE Exploded View of Motherboard Components The following table identifies the components located on the motherboard and includes FRU names where applicable ...

Page 21: ...ove this to service DIMMs SYS MB CMP0 MR0 SYS MB CMP0 MR1 SYS MB CMP1 MR0 SYS MB CMP1 MR1 2 Service processor Back panel PCI cross beam must be removed to access risers SYS MB SP 3 DIMMs See configuration rules before upgrading DIMMs SYS MB CMPn MRn BOBn CHn Dn 4 Memory riser filler panel Must be installed in blank memory riser slots N A 5 Motherboard assembly Must be removed to access power distr...

Page 22: ...GURE Exploded View of I O Components The following table identifies the I O components in the server and FRU names where applicable TABLE I O Components Item FRU Notes FRU Name If Applicable 1 Hard disk drives Must be removed to service the hard drive backplane SYS SASBP HDD0 SYS SASBP HDD1 SYS SASBP HDD2 SYS SASBP HDD3 SYS SASBP HDD4 SYS SASBP HDD5 ...

Page 23: ...ng DVD Drives on page 139 Servicing the Hard Disk Drive Backplane on page 183 2 Front control panel light pipe assembly Metal light pipe bracket is not a FRU N A 3 DVD module USB module Must be removed to service the hard drive backplane SYS DVD SYS USBBD 4 Hard drive backplane SYS SASBP TABLE I O Components Continued Item FRU Notes FRU Name If Applicable ...

Page 24: ... The following figure illustrates replaceable components related to power distribution and the fan modules FIGURE Exploded View of Power Distribution Fan Module Components The following table identifies the power distribution and fan module components in the server and FRU names where applicable ...

Page 25: ...ply backplane and cover The power supplies must be partially removed from the chassis to remove this component N A 2 Air duct A plastic cover that rests on top of the CPUs N A 3 Power supplies Two power supplies provide N+1 redundancy SYS PS0 SYS PS1 4 Fan modules All six fan modules must be installed in the server SYS FANBD0 FAN0 SYS FANBD0 FAN1 SYS FANBD0 FAN2 SYS FANBD0 FAN3 SYS FANBD0 FAN4 SYS...

Page 27: ...Overview You can use a variety of diagnostic tools commands and indicators to monitor and troubleshoot a server LEDs Provide a quick visual notification of the status of the server and of some of the field replaceable units FRUs Oracle ILOM This firmware runs on the service processor In addition to providing the interface between the hardware and OS ILOM also tracks and reports the health of key s...

Page 28: ...n and discloses possible faulty components with recommendations for repair The LEDs ILOM PSH and many of the log files and console messages are integrated For example when the Solaris software detects a fault it displays the fault logs it and passes information to ILOM where it is logged Depending on the fault one or more LEDs might also be illuminated The diagnostic flow chart in Diagnostics Proc...

Page 29: ...aults 17 FIGURE Diagnostic Flowchart The following table provides brief descriptions of the troubleshooting actions shown in the flowchart It also provides links to topics with additional information on each diagnostic action ...

Page 30: ...er and log files record system events and provide information about faults If system messages indicate a faulty device replace the FRU For more diagnostic information review the SunVTS report Flowchart item 4 Interpreting Log Files and System Messages on page 37 Run SunVTS software Flowchart item 4 SunVTS is an application you can run to exercise and diagnose FRUs To run SunVTS the server must be ...

Page 31: ...d reports faulty FRUs When POST detects a faulty FRU it logs the fault and if possible takes the FRU offline POST detected FRUs display the following text in the fault message Forced fail reason In a POST fault message reason is the name of the power on routine that detected the failure Running POST on page 45 Clear POST Detected Faults on page 51 Contact technical support Flowchart item 9 The maj...

Page 32: ...LED Power OK LED green Indicates the following conditions Off System is not running in its normal state System power might be off The service processor might be running Steady on System is powered on and is running in its normal operating state No service actions are required Fast blink System is running in standby mode and can be quickly returned to full function Slow blink A normal but transitor...

Page 33: ...action is required Steady on Indicates that a power supply failure event has been acknowledged and a service action is required on at least one PSU Overtemp LED amber Provides the following operational temperature indications Off Indicates a steady state no service action is required Steady on Indicates that a temperature failure event has been acknowledged and a service action is required TABLE E...

Page 34: ... on page 28 Clear Faults With the clear_fault_action Property on page 30 Related Information POST Overview on page 45 ILOM Properties that Affect POST Behavior on page 46 ILOM Troubleshooting Overview The Integrated Lights Out Manager Oracle ILOM firmware enables you to remotely run diagnostics such as power on self test POST that would otherwise require physical proximity to the server s serial p...

Page 35: ...prove over time They may also be caused by a configuration error such as the wrong DIMM type being installed If the conditions responsible for the alert go away the fault manager will detect the change and will stop logging alerts for that condition Faults When the fault manager determines that a particular FRU has an error condition that is permanent that error is classified as a fault This cause...

Page 36: ...re are two approaches to interacting with the service processor Oracle ILOM shell default The ILOM shell provides access to ILOM s features and functions through a command line interface Oracle ILOM browser interface The ILOM browser interface supports the same set of features and functions as the shell but through windows on a browser interface Note Unless indicated otherwise all examples of inte...

Page 37: ...e The default login account is root with a password of changeme Oracle ILOM web interface Can be used when you access the service processor through the NET MGT port and have a browser Refer to the ILOM 3 0 documentation for details This interface is not referenced in this service manual 3 Log in to Oracle ILOM The default Oracle ILOM login account is root with a default password changeme Example o...

Page 38: ...SH detected faults See Check for Faults Using the show faulty Command on page 28 clear_fault_action property of the set command Manually clears PSH detected faults See Clear Faults With the clear_fault_action Property on page 30 Note You can use fmadm faulty in the faultmgmt shell as an alternative to show faulty ...

Page 39: ... Related Information Diagnostics Process on page 16 Clear Faults With the clear_fault_action Property on page 30 show SYS MB CMP0 MR0 BOB0 CH0 D0 SYS MB CMP0 MR0 BOB0 CH0 D0 Targets T_AMB SERVICE Properties Type DIMM ipmi_name P0 M0 B0 C0 D0 component_state Enabled fru_name 2048MB DDR3 SDRAM fru_description DDR3 DIMM 2048 Mbytes fru_manufacturer Samsung fru_version 0 fru_part_number fru_serial_num...

Page 40: ...rocess on page 16 Clear Faults With the clear_fault_action Property on page 30 Check for Faults With the fmadm faulty Command on page 29 show faulty Target Property Value SP faultmgmt 0 fru SYS PS0 SP faultmgmt 0 class fault chassis power volt fail faults 0 SP faultmgmt 0 sunw msg id SPT 8000 LC faults 0 SP faultmgmt 0 uuid faults 0 SP faultmgmt 0 timestamp 2010 08 11 14 54 23 faults 0 SP faultmgm...

Page 41: ...and Related Information Diagnostics Process on page 16 start SP faultmgmt shell Are you sure you want to start SP faultmgmt shell y n y faultmgmtsp fmadm faulty Time UUID msgid Severity 2010 08 11 14 54 23 SPT 8000 LC Critical Fault class fault chassis power volt fail Description A Power Supply voltage level has exceeded acceptible limits Response The service required LED on the chassis and on the...

Page 42: ...earing will typically not be required Note For PSH detected faults this procedure clears the fault from the service processor but not from the host If the fault persists in the host clear it manually as described in Clear PSH Detected Faults on page 42 At the prompt use the set command with the clear_fault_action True property This example begins with an excerpt from the fmadm faulty command showi...

Page 43: ... y n y show SYS PS0 Targets VINOK PWROK CUR_FAULT VOLT_FAULT FAN_FAULT TEMP_FAULT V_IN I_IN V_OUT I_OUT INPUT_POWER OUTPUT_POWER Properties type Power Supply ipmi_name PS0 fru_name SYS PS0 fru_description Powersupply fru_manufacturer Delta Electronics fru_version 03 fru_part_number fru_serial_number fault_state OK clear_fault_action none Commands cd set show ...

Page 44: ...f the message ID indicate that the fault was detected by Oracle ILOM show faulty Target Property Value show faulty Target Property Value SP faultmgmt 0 fru SYS PS0 SP faultmgmt 0 class fault chassis power volt fail faults 0 SP faultmgmt 0 sunw msg id SPT 8000 LC faults 0 SP faultmgmt 0 uuid faults 0 SP faultmgmt 0 timestamp 2010 08 11 14 54 23 faults 0 SP faultmgmt 0 fru_part_number faults 0 SP fa...
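The show faulty listing above is a three-column table of target, property, and value, and the excerpt ties the start of the message ID to the component that detected the fault. The following Python sketch shows one way such a listing could be grouped per fault so that the message ID prefix can be read off. The pipe-delimited layout, the slashes, and the hyphens in SAMPLE are assumptions, because this excerpt lost its punctuation; the function names are hypothetical and are not part of Oracle ILOM.

```python
from collections import defaultdict

# Sample text in the three-column layout assumed for `show faulty`.
# Values come from the excerpt above; slashes, dots, and hyphens are
# reconstructed assumptions because the excerpt lost its punctuation.
SAMPLE = """\
Target              | Property               | Value
--------------------+------------------------+------------------------------
/SP/faultmgmt/0     | fru                    | /SYS/PS0
/SP/faultmgmt/0/    | class                  | fault.chassis.power.volt-fail
 faults/0           |                        |
/SP/faultmgmt/0/    | sunw-msg-id            | SPT-8000-LC
 faults/0           |                        |
/SP/faultmgmt/0/    | timestamp              | 2010-08-11/14:54:23
 faults/0           |                        |
"""


def parse_show_faulty(text):
    """Return {fault_index: {property: value}} from pipe-delimited rows."""
    faults = defaultdict(dict)
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) != 3:
            continue  # separator rule or unrelated line
        target, prop, value = cells
        if not prop or prop == "Property":
            continue  # header row or wrapped-target continuation row
        # Assumed target shape: /SP/faultmgmt/<index>[/faults/<n>]
        parts = target.strip("/").split("/")
        if len(parts) >= 3 and parts[:2] == ["SP", "faultmgmt"]:
            faults[parts[2]][prop] = value
    return dict(faults)


def message_id_prefix(fault):
    """Return the part of sunw-msg-id before the first hyphen, or ''."""
    return fault.get("sunw-msg-id", "").split("-")[0]


if __name__ == "__main__":
    for index, fault in parse_show_faulty(SAMPLE).items():
        print(index, fault.get("fru"), message_id_prefix(fault))
```

The sketch only extracts the prefix; which prefix corresponds to which detector should be taken from the manual text, not from this code.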

Page 45: ...l Are you sure you want to start SP faultmgmt shell y n y faultmgmtsp fmadm faulty Time UUID msgid Severity 2010 08 11 14 54 23 SPT 8000 LC Critical Fault class fault chassis power volt fail Description A Power Supply voltage level has exceeded acceptible limits Response The service required LED on the chassis and on the affected Power Supply may be illuminated Impact Server will be powered down w...

Page 46: ... at the beginning of the message ID show faulty Target Property Value SP faultmgmt 0 fru SYS MB SP faultmgmt 0 class fault component disabled faults 0 SP faultmgmt 0 sunw msg id SPT 8000 HR faults 0 SP faultmgmt 0 uuid faults 0 a262 SP faultmgmt 0 timestamp 2010 09 03 11 21 17 faults 0 SP faultmgmt 0 detector SYS MB CMP0 NIU1 faults 0 SP faultmgmt 0 fru_part_number 541 3857 04 faults 0 SP faultmgm...

Page 47: ...vice Related ILOM Commands ILOM Command Description help command Displays a list of all available commands with syntax and descriptions Specifying a command name as an option displays help for that command set HOST send_break_action break Takes the host server from the OS to either kmdb or break menu depending on the mode Solaris software was booted set fru component clear_fault_action true Clears...

Page 48: ... Locator LED on the server on or off show faulty Displays current system faults See Check for Faults Using the show faulty Command on page 28 show SYS keyswitch_state Displays the status of the virtual keyswitch show SYS LOCATE Displays the current state of the Locator LED as either fast blink or off show SP logs event list Displays the history of all events logged in the service processor event l...

Page 49: ...he dmesg command to view the most recent system message To view the system messages log file view the contents of the /var/adm/messages file Check the Message Buffer on page 37 View the System Message Log Files on page 37 Check the Message Buffer The dmesg command checks the system buffer for recent diagnostic messages and displays them 1 Log in as superuser 2 Type dmesg Related Information View the Syst...

Page 50: ...ges are further rotated to messages 2 and messages 3 and then deleted 1 Log in as superuser 2 Type 3 If you want to view all logged messages type Related Information Check the Message Buffer on page 37 Using the Oracle Solaris Predictive Self Healing Feature The following topics describe the Solaris Predictive Self Healing feature Oracle Solaris Predictive Self Healing Technology Overview on page ...
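Because the messages file is rotated into numbered copies before deletion, a search for an older event has to cover the live file and its rotations. The following Python sketch illustrates that idea under stated assumptions: the directory /var/adm and the stem messages are taken from the excerpts above, while the exact names of the rotated files are assumed to share that stem. The function name is hypothetical.

```python
from pathlib import Path


def iter_logged_messages(log_dir="/var/adm", stem="messages"):
    """Yield (file name, line) pairs from the live log and its rotations.

    Every file in log_dir whose name starts with `stem` is read, in
    name order, so the live file comes before its numbered copies.
    """
    for path in sorted(Path(log_dir).glob(stem + "*")):
        with path.open(errors="replace") as handle:
            for line in handle:
                yield path.name, line.rstrip("\n")


if __name__ == "__main__":
    # Print every logged line that mentions a (hypothetical) keyword.
    for name, line in iter_logged_messages():
        if "error" in line.lower():
            print(f"{name}: {line}")
```

Reading the files usually requires the same superuser access that the procedure above calls for.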

Page 51: ...ms When possible the Fault Manager daemon initiates steps to self heal the failed component and take the component offline The daemon also logs the fault to the syslogd daemon and provides a fault notification with a message ID MSGID You can use the message ID to get additional information about the problem from the knowledge article database The PSH technology covers the following server componen...

Page 52: ... the Oracle ILOM fmadm shell As an alternative you could display fault information by running the Oracle ILOM command show SUNW MSG ID SUN4V 8000 DX TYPE Fault VER 1 SEVERITY Minor EVENT TIME Wed Jun 17 10 09 46 EDT 2009 PLATFORM SUNW system_name CSN HOSTNAME server48 37 SOURCE cpumem diagnosis REV 1 5 EVENT ID f92e9fbe 735e c218 cf87 9e1720a28004 DESC The number of errors associated with this mem...

Page 53: ... FRU SYS MB for motherboard in this example fmadm faulty TIME EVENT ID MSG ID SEVERITY Aug 13 11 48 33 SUN4V 8002 6E Major Platform sun4v Chassis_id Product_sn Fault class fault cpu generic sparc strand Affects cpu cpuid serial faulted and taken out of service FRU SYS MB hc product id product sn server id chassis id serial revision 05 chassis 0 motherboard 0 faulty Description The number of correc...

Page 54: ...cle Solaris PSH facility detects faults the faults are logged and displayed on the console In most cases after the fault is repaired the corrected state is detected by the system and the fault condition is repaired automatically However this repair should be verified In cases where the fault condition is not automatically cleared the fault must be cleared manually Correctable strand errors exceede...

Page 55: ..._id Product_sn Fault class fault cpu generic sparc strand Affects cpu cpuid serial faulted and taken out of service FRU SYS MB hc product id product sn server id chassis id serial revision 05 chassis 0 motherboard 0 faulty Description The number of correctable errors associated with this strand has exceeded acceptable levels Refer to http sun com msg SUN4V 8002 6E for more information Response The...

Page 56: ...omponent For example use this command if you have reinstalled a card after straightening a bent pin and reseating the card Use fmadm replaced when you install a new component to replace a faulted component and the new component is not automatically discovered by the system Note The system generally detects new components when a new serial number is introduced into the system but in cases where fau...

Page 57: ...he parameter keyswitch_state to diag You can also set other Oracle ILOM properties to control various other aspects of POST operations For example you can specify the events that cause POST to run the level of testing POST performs and the amount of diagnostic information POST displays These properties are listed and described in ILOM Properties that Affect POST Behavior on page 46 If POST detects...

Page 58: ...er on and run POST but no flash updates can be made HOST diag mode off POST does not run normal Runs POST according to diag_level value service Runs POST with preset values for diag_level and diag_verbosity HOST diag level max If diag_mode normal runs all the minimum tests plus extensive processor and memory tests min If diag_mode normal runs minimum set of tests HOST diag trigger none Does not ru...

Page 59: ... ILOM set command variables FIGURE Flowchart of ILOM Properties Used to Manage POST Operations max POST displays all test and informational messages debug MAX POST plus debugging messages none No POST output is displayed TABLE ILOM Properties Used to Manage POST Operations Parameter Values Description ...

Page 60: ...T Behavior on page 46 3 If the virtual keyswitch is set to normal and you want to define the mode level verbosity or trigger set the respective parameters Syntax set HOST diag property value See ILOM Properties that Affect POST Behavior on page 46 for a list of parameters and values Examples 4 To see the current values for settings use the show command Example set SYS keyswitch_state normal Set ke...
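The syntax line above (set HOST diag property value) can be turned into concrete command strings. The following Python sketch builds such strings and rejects values that the properties table does not list. It is an illustration only: the /HOST/diag and /SYS paths and the property=value form are assumed reconstructions of punctuation that this excerpt lost, only the mode and level values are checked because their lists appear in full above, and the function names are hypothetical.

```python
# Values listed for these properties in the ILOM properties table
# excerpted above. Other properties (trigger, verbosity) are passed
# through unchecked because their value lists are truncated here.
ALLOWED = {
    "mode": {"off", "normal", "service"},
    "level": {"min", "max"},
}


def diag_command(prop, value):
    """Build one `set` command line for a POST diag property.

    The /HOST/diag path and the property=value form are assumptions:
    the excerpt shows the syntax with its punctuation stripped.
    """
    allowed = ALLOWED.get(prop)
    if allowed is not None and value not in allowed:
        raise ValueError(f"{value!r} is not a listed value for {prop}")
    return f"set /HOST/diag {prop}={value}"


def keyswitch_command(state):
    """Build the virtual keyswitch command (path form is assumed)."""
    return f"set /SYS keyswitch_state={state}"


if __name__ == "__main__":
    # Step 3 above: keyswitch at normal, then define mode and level.
    print(keyswitch_command("normal"))
    print(diag_command("mode", "normal"))
    print(diag_command("level", "max"))
```

Generating the strings first and pasting them at the ILOM prompt keeps a typo in a property value from reaching the service processor.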

Page 61: ...of POST 1 Access the ILOM prompt See Access the Service Processor Oracle ILOM on page 24 2 Set the virtual keyswitch to diag so that POST will run in service mode 3 Reset the system so that POST runs There are several ways to initiate a reset The following example shows a reset by issuing commands that will power cycle the host trigger none verbosity min Commands cd set show set SYS keyswitch_stat...

Page 62: ...page 47 Configure How POST Runs on page 48 Interpret POST Fault Messages on page 50 Clear POST Detected Faults on page 51 Interpret POST Fault Messages 1 Run POST See Run POST With Maximum Testing on page 49 2 View the output and watch for messages that look similar to the following syntax descriptions and example POST error messages use the following syntax n c s ERROR TEST failing test n c s H W...
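POST messages begin with an n c s prefix, as the syntax above describes. The following Python sketch shows how that prefix could be split from the message text when scanning a captured console log. The colon separators, the ">" delimiter, and the timestamp in the sample line are assumptions, since this excerpt lost its punctuation; the fields are deliberately named n, c, and s after the syntax placeholders, and the function name is hypothetical.

```python
import re

# Assumed line shape: optional timestamp, then "n:c:s>" and the message.
PREFIX = re.compile(r"(?P<n>\d+):(?P<c>\d+):(?P<s>\d+)\s*>\s*(?P<msg>.*)$")


def parse_post_line(line):
    """Split one POST console line into its n, c, s prefix and message.

    Returns None when the line carries no n:c:s> prefix.
    """
    match = PREFIX.search(line)
    if match is None:
        return None
    message = match["msg"]
    return {
        "n": int(match["n"]),
        "c": int(match["c"]),
        "s": int(match["s"]),
        "msg": message,
        "is_error": message.startswith("ERROR"),
    }


if __name__ == "__main__":
    # Reconstructed sample; the punctuation is an assumption.
    sample = "2010-07-03 18:44:14:248 0:7:2>Decode of Mem Error Status Reg"
    print(parse_post_line(sample))
```

Filtering on is_error is one way to pull the ERROR blocks out of a long maximum-verbosity POST run.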

Page 63: ...ulty component POST logs the fault and automatically takes the failed component out of operation by placing the component in the ASR blacklist see Managing Components With Automatic System Recovery Commands on page 54 Usually when a faulty component is replaced the replacement is detected when the service processor is reset or power cycled and the fault is automatically cleared from the system 1 A...

Page 64: ...vice required LED is no longer on 4 Reset the server You must reboot the server for the component_state property to take effect 5 At the ILOM prompt use the show faulty command to verify that no faults are reported For example Related Information POST Overview on page 45 ILOM Properties that Affect POST Behavior on page 46 Configure How POST Runs on page 48 Run POST With Maximum Testing on page 49...

Page 65: ...MCU1HCCE MCU1 issued a Hardware Corrected and Cleared Error Request 2010 07 03 18 44 14 248 0 7 2 2010 07 03 18 44 14 296 0 7 2 Decode of Mem Error Status Reg Branch 1 bits 33044000 00000000 2010 07 03 18 44 14 427 0 7 2 1 MEU 61 R W1C Set to 1 on an UE if VEU 1 or VEF 1 or higher priority error in same cycle 2010 07 03 18 44 14 614 0 7 2 1 MEC 60 R W1C Set to 1 on a CE if VEC 1 or VEU 1 or VEF 1 ...

Page 66: ...ror Retry Reg for Branch 1 00000000 00000004 2010 07 03 18 44 16 086 0 7 2 DRAM Error RetrySyndrome 1 Reg for Branch 1 a8a5f81e f6411b5a 2010 07 03 18 44 16 218 0 7 2 DRAM Error Retry Syndrome 2 Reg for Branch 1 a8a5f81e f6411b5a 2010 07 03 18 44 16 351 0 7 2 DRAM Failover Location 0 for Branch 1 00000000 00000000 2010 07 03 18 44 16 475 0 7 2 DRAM Failover Location 1 for Branch 1 00000000 0000000...

Page 67: ...add or remove components asrkeys from the ASR blacklist You run these commands from the ILOM prompt Note The asrkeys vary from system to system depending on how many cores and memory are present Use the show components command to see the asrkeys on a given system After you enable or disable a component you must reset or power cycle the system for the component s change of state to take effect Rela...

Page 68: ...mponent_state property to Disabled This adds the component to the ASR blacklist 1 At the prompt set the component_state property to Disabled show components Target Property Value SYS MB RISER0 component_state Enabled PCIE0 SYS MB RISER0 component_state Disabled PCIE3 SYS MB RISER1 component_state Enabled PCIE1 SYS MB RISER1 component_state Enabled PCIE4 SYS MB RISER2 component_state Enabled PCIE2 ...
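The show components listing above is how disabled (blacklisted) components are identified before they are enabled again. The following Python sketch collects the targets whose component_state is Disabled, joining targets that wrap onto a second row, and builds the command that would enable each one. The pipe-delimited layout, the slashes in the targets, and the property=value command form are assumptions based on the punctuation-stripped excerpt; the function names are hypothetical.

```python
# Sample in the layout assumed for `show components`; targets are
# reconstructed from the excerpt above and wrap onto a second row.
SAMPLE = """\
Target              | Property               | Value
--------------------+------------------------+---------
/SYS/MB/RISER0/     | component_state        | Enabled
 PCIE0              |                        |
/SYS/MB/RISER0/     | component_state        | Disabled
 PCIE3              |                        |
/SYS/MB/RISER1/     | component_state        | Enabled
 PCIE1              |                        |
"""


def disabled_components(text):
    """Return the full targets whose component_state is Disabled."""
    rows = []       # [target, state] pairs
    current = None  # row that a continuation line would extend
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) != 3 or cells[1] == "Property":
            continue  # separator rule or header row
        target, prop, value = cells
        if prop == "component_state":
            current = [target, value]
            rows.append(current)
        elif prop:
            current = None  # some other property; do not extend it
        elif current is not None and target:
            current[0] += target  # wrapped target continues here
    return [target for target, state in rows if state == "Disabled"]


def enable_command(target):
    """Build the command that removes a target from the ASR blacklist."""
    return f"set {target} component_state=Enabled"


if __name__ == "__main__":
    for target in disabled_components(SAMPLE):
        print(enable_command(target))
```

As the manual notes, a change to component_state takes effect only after the system is reset or power cycled.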

Page 69: ...ent_state property to Enabled This removes the component from the ASR blacklist 1 At the prompt set the component_state property to Enabled 2 Reset the server so that the ASR command takes effect Note In the ILOM shell there is no notification when the system is actually powered off Powering off takes about a minute Use the show HOST command to determine if the host has powered off stop SYS Are yo...

Page 70: ...nVTS is a validation test suite that you can use to test this server SunVTS provides multiple diagnostic hardware tests that verify the connectivity and functionality of most hardware controllers and devices for this server SunVTS provides these kinds of test categories Audio Communication serial and parallel Graphic and video Memory Network Peripherals hard disk drives CD DVD devices and printers...

Page 71: ... the presence of SunVTS packages using the pkginfo command If information about the packages is displayed then SunVTS software is installed If you receive messages reporting ERROR information for package was not found then SunVTS is not installed You must take action to install the software before you can use it You can obtain the SunVTS software from the following places Solaris OS media kit DVDs...

Page 73: ...n page 64 Locate the Server on page 65 Understanding Component Replacement Categories on page 66 Shut Down the Server on page 68 Removing Power From the System on page 69 Positioning the System for Servicing on page 72 Accessing Internal Components on page 76 Filler Panels on page 78 Attaching Devices to the Server on page 78 Related Information Identifying Server Components on page 1 ...

Page 74: ... on the equipment and described in the SPARC T3 2 Safety and Compliance Guide Ensure that the voltage and frequency of your power source match the voltage and frequency inscribed on the equipment s electrical rating label Follow the electrostatic discharge safety practices as described in this section Safety Symbols Note the meanings of the following symbols that might appear in this document Caut...

Page 75: ... supplies before servicing any of the components that are inside the chassis Antistatic Wrist Strap Use Wear an antistatic wrist strap and use an antistatic mat when handling components such as hard drive assemblies circuit boards or PCI cards When servicing or removing server components attach an antistatic strap to your wrist and then to a metal area on the chassis Following this practice equali...

Page 76: ...ssis Serial Number If you require technical support for your system you will be asked to provide the server s chassis serial number You can find the chassis serial number on a sticker located on the front of the server and on another sticker on the side of the server If it is not convenient to read either sticker you can run the ILOM show SYS command to obtain the chassis serial number Type show S...

Page 77: ...many other servers 1 At the ILOM command line type The white Locator LEDs one on the front panel and one on the rear panel blink 2 After locating the server with the blinking Locator LED turn it off by pressing the Locator button Note Alternatively you can turn off the Locator LED by running the ILOM set SYS LOCATE value off command Related Information Front Components on page 1 product_part_numbe...

Page 78: ...components can be replaced by customers Although hot service procedures can be performed while the server is running you should usually bring it to standby mode as the first step in the replacement procedure Refer to Power Off the Server Power Button Standby Mode on page 70 for instructions Component Service information Notes Hard disk drive HDD Solid state drive SSD Servicing Hard Disk Drives on ...

Page 79: ...d See Cold Service Replaceable by Customer on page 67 for the steps involved in shutting down the server Component Service information Notes DIMMs Servicing Memory Risers and DIMMs on page 109 DVD drive filler Servicing DVD Drives on page 139 Remove any media prior to replacement Must be installed to preserve proper interior air flow System battery Servicing the System Lithium Battery on page 143 ...

Page 80: ...nning programs Refer to your application documentation for specific information on these processes 4 Shut down all logical domains Refer to the Oracle Solaris system administration documentation for additional information 5 Shut down the Oracle Solaris OS Refer to the Oracle Solaris system administration documentation for additional information 6 Switch from the system console to the prompt by typ...

Page 81: ...iew server status or log files You also might want to run diagnostics before you shut down the server 2 Notify affected users that the server will be shut down Refer to your Solaris system administration documentation for additional information 3 Save any open files and quit all running programs Refer to your application documentation for specific information on these processes Description Links P...

Page 82: ...the Server Power Button Standby Mode on page 70 This button is recessed to prevent accidental server power off Use the tip of a pen to operate this button Related Information Power Off the Server Power Button Standby Mode on page 70 Power Off the Server Emergency Shutdown on page 71 Front Components on page 1 Power Off the Server Power Button Standby Mode This procedure places the server in the po...

Page 83: ... Components on page 1 Remove Power From the Server Remove the power cords from the server only after powering off the server Unplug all power cords from the server Caution Because 3.3V standby power is always present in the system you must unplug the power cords before accessing any cold serviceable components Related Information Power Off the Server Service Processor Command on page 69 Power Off t...

Page 84: ...Position The following components can be serviced with the server in the maintenance position Hard disk drives Fan modules Power supplies DVD module Fan boards DIMMs PCIe XAUI cards System battery 1 Verify that no cables will be damaged or will interfere when the server is extended Although the cable management arm CMA that is supplied with the server is hinged to accommodate extending the server ...

Page 85: ...e the CMA For some service procedures if you are using a cable management arm CMA you might need to release the CMA to gain access to the back of the chassis Note For instructions on how to install the CMA for the first time refer to SPARC T3 2 Installation Guide 1 Press and hold the tab 2 Swing the CMA out of the way When you have finished with the service procedure swing the CMA closed and latch...

Page 86: ...Related Information Extend the Server to the Maintenance Position on page 72 Remove the Server From the Rack on page 75 ...

Page 87: ...cords from the server 4 Extend the server to the maintenance position See Extend the Server to the Maintenance Position on page 72 5 Release the CMA from the rail assembly The CMA is still attached to the cabinet but the server chassis is now disconnected from the CMA See Release the CMA on page 73 6 From the front of the server pull the release tabs forward and pull the server forward until it is...

Page 88: ... an antistatic mat The following items can be used as an antistatic mat Antistatic bag used to wrap a replacement part ESD mat A disposable ESD mat shipped with some replacement parts or optional system components 2 Attach an antistatic wrist strap When servicing or removing server components attach an antistatic strap to your wrist and then to a metal area on the chassis Remove the Server Top Cov...

Page 89: ...eously lift both latches in an upward motion as shown in panel 1 of the following figure 3 Lift the cover slightly and slide it toward the front of the server chassis about 0.5 inch (12 mm) 4 Lift up and remove the top cover as shown in panel 2 of the preceding figure Related Information Install the Top Cover on page 196 ...

Page 90: ...filler panel and continue to operate your system with an empty module slot the server might overheat due to improper airflow For instructions on removing or installing a filler panel for a server component refer to the section in this guide about servicing that component Attaching Devices to the Server During service procedures you might have to connect devices to the server The following sections...

Page 91: ...e server s USB connectors and or a monitor to the DB 15 video connector 3 If you plan to connect to the ILOM software over the network connect an Ethernet cable to the Ethernet port labeled NET MGT Note The service processor SP uses the NET MGT out of band port by default You can configure the SP to share one of the server's four 10/100/1000 Ethernet ports instead The SP uses only the configured Et...

Page 92: ...4 If you plan to access the ILOM command line interface CLI using the management port connect a serial null modem cable to the RJ 45 serial port labeled SER MGT ...

Page 93: ...ard Disk Drive Overview on page 82 Locate a Faulty Hard Disk Drive on page 83 Remove a Hard Disk Drive Filler Panel on page 84 Remove a Hard Disk Drive on page 86 Install a Hard Disk Drive on page 88 Install a Hard Disk Drive Filler Panel on page 90 Verify Hard Disk Drive Functionality on page 91 ...

Page 94: ...t first take it offline This prevents applications from accessing the drive and removes software links to it A hard disk drive should not be hot swapped when either of the following conditions exists The hard disk drive contains the sole image of the operating system that is the operating system is not mirrored on another drive The hard disk drive cannot be logically isolated from the server s onli...

Page 95: ...disk drive status LEDs LED Color Description 1 Ready to Remove Blue Indicates that a hard drive can be removed during a hot swap operation 2 Service Required Amber Indicates that the hard drive has experienced a fault condition 3 OK Activity HDDs Green Indicates the hard drive s availability for use On Read or write activity is in progress Off Drive is idle and available for use OK Activity SSDs G...

Page 96: ...k Drive on page 88 Remove a Hard Disk Drive Filler Panel on page 84 Install a Hard Disk Drive Filler Panel on page 90 Verify Hard Disk Drive Functionality on page 91 Remove a Hard Disk Drive Filler Panel This procedure can be performed by customers while the server is running See Hot Service Replaceable by Customer on page 66 for more information about hot service procedures 1 Attach an antistatic...

Page 97: ...p the latch and pull the filler panel out of the drive slot Caution When you remove a hard drive filler panel replace it with another filler panel or an HDD otherwise the server might overheat due to improper airflow Related Information Locate a Faulty Hard Disk Drive on page 83 Remove a Hard Disk Drive on page 86 Install a Hard Disk Drive on page 88 Install a Hard Disk Drive Filler Panel on page ...

Page 98: ...ne a At the Solaris prompt type the cfgadm al command to list all drives in the device tree including drives that are not configured This command lists dynamically reconfigurable hardware resources and shows their operational status In this case look for the status of the drive you plan to remove This information is listed in the Occupant column You must unconfigure any drive whose status is liste...
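The Occupant check described above can be sketched in Python. This is a minimal illustration, not part of the manual: the sample text is made-up output in the usual five-column `cfgadm -al` layout (Ap_Id, Type, Receptacle, Occupant, Condition), and the function names are hypothetical. Real attachment point names depend on the server's configuration.

```python
# Illustrative sample only; not captured from a real SPARC T3-2 server.
SAMPLE = """\
Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c1t0d0                 disk         connected    configured   unknown
c0::dsk/c1t1d0                 disk         connected    unconfigured unknown
"""


def parse_cfgadm(text):
    """Return one dict per attachment point, keyed by lowercased column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = [h.lower() for h in lines[0].split()]
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def configured_disks(rows):
    """Ap_Ids of disks whose Occupant column reads 'configured'."""
    return [r["ap_id"] for r in rows
            if r["type"] == "disk" and r["occupant"] == "configured"]


if __name__ == "__main__":
    # Disks listed here would still need to be unconfigured before removal.
    print(configured_disks(parse_cfgadm(SAMPLE)))
```

The parser splits on whitespace, which works only because none of the five columns contains spaces in this layout.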

Page 99: ... described in Removing Power From the System on page 69 To hot swap the drive take the drive offline using one of the procedures in Power Off the Server Power Button Standby Mode on page 70 This removes the logical software links to the drive and prevents any applications from accessing it 5 If you are hot swapping the drive locate the drive that displays the amber Fault LED and ensure that the bl...

Page 100: ...rst install the hard drive into the drive slot and then configure that drive to the server Note If you removed an existing hard drive from a slot in the server you must install the replacement drive in the same slot as the drive that was removed Hard drives are physically addressed according to the slot in which they are installed 1 Unpack the hard disk drive and place it on an antistatic mat 2 Ve...

Page 101: ...ure described in Power On the Server on page 198 If you hot swapped the drive configure it using the cfgadm c configure command The following example shows the drive at c0 dsk c1t1d0 being configured Related Information Locate a Faulty Hard Disk Drive on page 83 Remove a Hard Disk Drive on page 86 Remove a Hard Disk Drive Filler Panel on page 84 Install a Hard Disk Drive Filler Panel on page 90 Ve...
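As a small sketch of the configure step, the helper below assembles the `cfgadm -c configure` command line for one attachment point. It assumes the drive named in the example is written `c0::dsk/c1t1d0` once the punctuation lost in this copy is restored; the function name is hypothetical, and the helper only builds the argument list rather than running anything.

```python
def cfgadm_configure_argv(ap_id):
    """Build the argument list for configuring one attachment point."""
    if not ap_id or any(ch.isspace() for ch in ap_id):
        raise ValueError("attachment point ID must be a single non-empty token")
    return ["cfgadm", "-c", "configure", ap_id]


if __name__ == "__main__":
    # Assumed Ap_Id spelling; substitute the name reported on your server.
    print(" ".join(cfgadm_configure_argv("c0::dsk/c1t1d0")))
```

Returning a list rather than a string keeps the attachment point as a single argument if the command is later passed to a process launcher.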

Page 102: ... completing the following tasks a Slide the drive filler panel into the drive slot until it is fully seated b Close the latch to lock the filler panel in place Related Information Locate a Faulty Hard Disk Drive on page 83 Remove a Hard Disk Drive on page 86 Install a Hard Disk Drive on page 88 Remove a Hard Disk Drive Filler Panel on page 84 Verify Hard Disk Drive Functionality on page 91 ...

Page 103: ...evice tree including any drives that are not configured This command helps you identify the drive you installed 3 Configure the drive using the cfgadm c configure command Example Replace c0 sd1 with the drive name for your configuration 4 Verify that the blue Ready to Remove LED is no longer lit on the drive that you installed See Locate a Faulty Hard Disk Drive on page 83 cfgadm al Ap_id Type Rec...

Page 104: ...rive see Diagnostics Process on page 16 If the previous steps indicate that the drive is functioning properly perform the tasks required to configure the drive These tasks are covered in the Solaris OS administration documentation For additional drive verification you can run SunVTS Refer to the SunVTS documentation for details cfgadm al Ap_Id Type Receptacle Occupant Condition c0 scsi bus connect...

Page 105: ...hese topics explain how to service faulty fan modules Fan Module Overview on page 94 Locate a Faulty Fan Module on page 95 Remove a Fan Module on page 96 Install a Fan Module on page 98 Verify Fan Module Functionality on page 99 ...

Page 106: ...hot swappable CRU Caution While the fan modules provide some cooling redundancy if a fan module fails replace it as soon as possible to maintain server availability When you remove one of the fans in the back row you must replace it within 30 seconds to prevent overheating of the server Related Information Understanding Component Replacement Categories on page 66 Remove a Fan Module on page 96 Ins...

Page 107: ...er to Front Components on page 1 Fan Fault LED on or adjacent to the faulty fan module refer to the following illustration Each fan module contains an LED When the amber Service Required LED is lit a fault has occurred on that fan module FIGURE Fan Module LEDs The following table describes the status LEDs located on the fan modules LED Color Status When Lit Power OK Green The system is powered on ...

Page 108: ...you remove one of the fans in the back row fans 3 4 or 5 replace it within 30 seconds to prevent overheating of the server Caution The fan module contains hazardous moving parts Unless the power to the server is completely shut down replacing the fan modules is the only service permitted in the fan compartment This procedure can be performed by customers while the server is running See Hot Service...

Page 109: ... and lift it out of the server Caution When removing a fan module do not rock it back and forth Rocking fan modules can cause damage to the fan board connectors Caution When changing fan modules note that only the fan modules can be removed or replaced Do not service any other components in the fan compartment unless the system is shut down and the power cords are removed Related Information Exten...

Page 110: ...module in the same slot from which the faulty fan was removed 1 Unpack the replacement fan module and place it on an antistatic mat 2 Install the replacement fan module into the server by completing the following tasks a Align the fan module and slide it into the fan slot Note Fan modules are keyed to ensure they are installed in the correct orientation ...

Page 111: ...the LEDs might stay lit until power is restored to the server and the server can determine that the fan module is functioning properly 3 Run the ILOM show faulty command to verify that the fault has been cleared See Managing Faults With ILOM on page 22 for more information on using the show faulty command 4 Perform one of the following tasks based on your verification results If the previous steps...

Page 113: ...iency mode LLEM places PS1 in a warm standby condition while PS0 carries the entire load more efficiently by itself If PS0 loses AC power or is extracted for replacement PS1 takes over the load automatically Some rare internal failures of PS0 could cause the server to lose power faster than PS1 can take over Disabling the LLEM Policy causes the power supplies to share the load at all times at the ...

Page 114: ...ge 104 Install a Power Supply on page 106 Verify Power Supply Functionality on page 107 Locate a Faulty Power Supply This procedure describes how to identify a faulty power supply View the following LEDs which are lit when a power supply fault is detected Rear PS Fault LED on the front bezel of the server refer to Front Components on page 1 Service Action Required LED on the faulted power supply ...

Page 115: ... Components on page 1 Rear Components on page 3 Remove a Power Supply on page 104 Legend LED Symbol Color Lights When 1 Service Action Required Amber The power supply is faulty Service action is required 2 OK Green Both DC outputs 3.3V standby and 12V main are active and within regulation 3 AC Present Green This LED turns on when AC voltage is applied to the power supply ...

Page 116: ...wer supplies See Release the CMA on page 73 2 Disconnect the power cord from the power supply that displays an amber lit Service Action Required LED 3 Press down on the release latch to open the ejector arm 4 Slide the power supply out of the chassis Caution There is no catch mechanism on the power supply to prevent it from sliding completely out of the chassis Use care when removing the power sup...

Page 117: ...Related Information Locate a Faulty Power Supply on page 102 Install a Power Supply on page 106 ...

Page 118: ...d shut down 1 Align the power supply with the empty power supply chassis bay 2 Slide the power supply into the bay until it is fully seated 3 Move the release latch up to secure the power supply in place 4 Reconnect the power cord to the power supply 5 Verify that the AC OK LED is lit See Locate a Faulty Power Supply on page 102 6 Verify that the following LEDs are not lit Service Action Required ...

Page 119: ... Managing Faults With ILOM on page 22 for more information on using the show faulty command 4 Perform one of the following tasks based on your verification results If the previous steps did not clear the fault see Detecting and Managing Faults on page 15 for information about the tools and methods you can use to diagnose component faults If the previous steps indicate that no faults have been dete...

Page 121: ...ry Riser on page 120 Install a Memory Riser on page 121 Remove a DIMM on page 122 Install a DIMM on page 124 Increase Server Memory With Additional DIMMs on page 126 Remove a Memory Riser Filler Panel on page 128 Install a Memory Riser Filler Panel on page 129 Verify DIMM Functionality on page 130 DIMM Configuration Error Messages on page 132 About Memory Risers This section includes the following...

Page 122: ...mory Riser FRU Names on page 111 About DIMMs on page 112 Memory Riser Population Rules The memory riser population rules for the server are as follows Each memory riser slot in the server chassis must be filled with either a memory riser or memory riser filler panel arranged as illustrated in Memory Riser FRU Names on page 111 Each memory riser must be filled with DIMMs and or DIMM filler panels i...

Page 123: ...o memory risers are supported per CMP Memory risers associated with CMP0 SYS MB CMP0 MR0 SYS MB CMP0 MR1 Memory risers associated with CMP1 SYS MB CMP1 MR0 SYS MB CMP1 MR1 Labels on the memory riser cage indicate the corresponding memory riser FRU name Related Information Memory Risers Overview on page 110 Memory Riser Population Rules on page 110 About DIMMs on page 112 ...

Page 124: ...entical capacities all 4 Gbyte or all 8 Gbyte All DIMMs associated with each CMP must be identical identical capacity and identical rank classification For example install identical DIMMs in all slots of CMP0 MR0 and CMP0 MR1 and identical DIMMs in all slots of CMP1 MR0 and CMP1 MR1 To identify DIMM rank classification see DIMM Rank Classification Labels on page 115 The DIMM slots may be populated...
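The two rules quoted above (one capacity throughout, and identical rank classification for all DIMMs associated with a CMP) can be checked with a short Python sketch. The `Dimm` record, the function name, and the rank label strings are hypothetical, and the sketch covers only these two rules, not the full set of population requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimm:
    cmp: int          # CMP the DIMM is associated with (0 or 1)
    capacity_gb: int  # 4 or 8
    rank: str         # rank classification label text (placeholder values)


def check_population(dimms):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    capacities = {d.capacity_gb for d in dimms}
    if len(capacities) > 1:
        problems.append(f"mixed DIMM capacities: {sorted(capacities)} GB")
    for cmp_id in sorted({d.cmp for d in dimms}):
        ranks = {d.rank for d in dimms if d.cmp == cmp_id}
        if len(ranks) > 1:
            problems.append(f"CMP{cmp_id}: mixed rank classifications: {sorted(ranks)}")
    return problems


if __name__ == "__main__":
    installed = [Dimm(0, 4, "label-A"), Dimm(0, 8, "label-B"), Dimm(1, 4, "label-A")]
    for problem in check_population(installed):
        print(problem)
```

A check like this is only a planning aid before ordering or installing parts; the server's own configuration messages remain the authority.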

Page 125: ...ations on page 113 DIMM FRU Names on page 114 DIMM Rank Classification Labels on page 115 Increase Server Memory With Additional DIMMs on page 126 DIMM Configuration Error Messages on page 132 Supported Memory Configurations The following table describes supported memory configurations Note In half populated configurations DIMM filler panels must be installed in all unoccupied slots ...

Page 126: ...e 132 DIMM FRU Names DIMM FRU IDs are based on the location of the memory riser in the server and the DIMM slot on the memory riser For example the full FRU ID for the top most DIMM slot BOB1 CH1 D0 on the first memory riser CMP0 MR0 is SYS MB CMP0 MR0 BOB1 CH1 D0 TABLE Supported DIMM Configurations DIMM Capacity Number of DIMMs Installed Total Memory Capacity 4 GB 16 half population 64 GB 32 full...
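The naming scheme above can be illustrated with a short Python sketch that joins the memory riser location and the DIMM slot into a full FRU ID. It assumes the path components are separated by slashes (the separators are missing from this copy) and that each CMP has risers MR0 and MR1; the function names are hypothetical. A second helper reproduces the capacity arithmetic of the configuration table.

```python
def dimm_fru_name(cmp_id, riser, bob, channel, dimm):
    """Compose a DIMM FRU ID from riser location and slot position."""
    if cmp_id not in (0, 1) or riser not in (0, 1):
        raise ValueError("expected CMP0 or CMP1 and MR0 or MR1")
    return f"/SYS/MB/CMP{cmp_id}/MR{riser}/BOB{bob}/CH{channel}/D{dimm}"


def total_memory_gb(dimm_capacity_gb, dimm_count):
    """Total memory for a configuration in which every DIMM has the same capacity."""
    return dimm_capacity_gb * dimm_count


if __name__ == "__main__":
    # Top-most slot on the first memory riser, as in the example above.
    print(dimm_fru_name(0, 0, 1, 1, 0))
    # Half population with 4 GB DIMMs.
    print(total_memory_gb(4, 16), "GB")
```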

Page 127: ...ssociated with each CMP must have identical rank classifications Each DIMM includes a printed label identifying its rank classification examples below Use these rank classification labels to identify the architecture of the DIMMs installed in the server as well as any replacement or upgrade DIMMs you intend to install TABLE DIMM FRU Identifiers DIMM FRU Identifier From Top to Bottom on Memory Riser...

Page 128: ...s on the DIMMs For top cover removal instructions see Preparing for Service on page 63 The following table identifies the rank classification labels on supported DIMMs Related Information DIMM Population Rules on page 112 Supported Memory Configurations on page 113 DIMM FRU Names on page 114 Increase Server Memory With Additional DIMMs on page 126 DIMM Configuration Error Messages on page 132 Rank...

Page 129: ...cribes how to identify a faulty DIMM using these buttons and LEDs 1 Consider your first steps Familiarize yourself with DIMM configuration rules See DIMM Population Rules on page 112 Prepare the system for service See Preparing for Service on page 63 2 Press the Fault Remind button on the air divider to identify the memory riser containing the faulty DIMM as shown in the following figure ...

Page 130: ...Action Required LED is on amber one or more of the DIMMs installed on this riser is faulty or misconfigured 3 Remove the memory riser containing the faulty DIMM Place the memory riser on an ESD protected work surface See Remove a Memory Riser on page 120 4 Press the Remind button on the memory riser to identify the faulty DIMM An amber Fault LED will light next to the faulty DIMM ...

Page 131: ...show faulty Command on page 120 Remove a Memory Riser on page 120 No LED Color Description 1 Memory riser remind button Blue Push this button to identify the faulty or misconfigured DIMM s 2 Memory riser power LED Green Amber Indicates that the riser is operating normally Indicates that the riser has a fault 3 DIMM Fault LED Amber Identifies a faulty or misconfigured DIMM 4 DIMM keys Notches that ...

Page 132: ...by customers 1 Consider your first steps Familiarize yourself with DIMM configuration rules See DIMM Population Rules on page 112 Prepare the system for service See Preparing for Service on page 63 If you are replacing a faulty DIMM identify the affected memory riser See Locate a Faulty DIMM DIMM Fault Remind Button on page 117 2 Lift the memory riser straight up to remove the memory riser from th...

Page 133: ...Information About Memory Risers on page 109 Install a Memory Riser on page 121 Install a Memory Riser This is a cold service procedure that can be performed by customers If you are upgrading the server with additional memory ensure that the memory risers are being installed in the correct slots See Memory Riser FRU Names on page 111 1 Push the memory riser into the associated memory riser slot until...

Page 134: ... Returning the Server to Operation on page 193 Verify DIMM functionality See Verify DIMM Functionality on page 130 Related Information About Memory Risers on page 109 Remove a Memory Riser on page 120 Verify DIMM Functionality on page 130 Remove a DIMM A DIMM is a cold service component that can be replaced by a customer ...

Page 135: ...e the memory riser containing the faulty DIMM s See Locate a Faulty DIMM DIMM Fault Remind Button on page 117 Remove the memory riser s See Remove a Memory Riser on page 120 2 Push down on the ejector tabs on each side of the DIMM until the DIMM is released Caution DIMMs and heat sinks on the motherboard might be hot 3 Grasp the top corners of the DIMM and lift it out of its slot 4 Place the DIMM ...

Page 136: ...on page 121 Related Information About DIMMs on page 112 Locate a Faulty DIMM DIMM Fault Remind Button on page 117 Locate a Faulty DIMM show faulty Command on page 120 Install a DIMM on page 124 Verify DIMM Functionality on page 130 DIMM Configuration Error Messages on page 132 Install a DIMM A DIMM is a cold service component that can be replaced by a customer Caution Do not leave DIMM slots empty...

Page 137: ...he DIMM into the connector until the ejector tabs lock the DIMM in place If the DIMM does not easily seat into the connector check the DIMM s orientation 6 Repeat Step 3 through Step 5 until all new DIMMs are installed 7 Finish the installation procedure Install the memory riser s See Install a Memory Riser on page 121 Return the server to operation See Returning the Server to Operation on page 19...

Page 138: ... system for service See Preparing for Service on page 63 Remove the memory risers See Remove a Memory Riser on page 120 2 Unpack the new DIMMs and place them on an antistatic mat 3 At a DIMM slot that is to be upgraded open the ejector tabs and remove the filler panel Do not dispose of the filler panel You may want to reuse it if any DIMMs are removed at another time 4 Ensure that the ejector tabs...

Page 139: ...8 Finish the installation procedure Install the memory risers See Install a Memory Riser on page 121 Return the server to operation See Returning the Server to Operation on page 193 Verify DIMM functionality See Verify DIMM Functionality on page 130 Related Information Memory Fault Handling on page 23 About DIMMs on page 112 Remove a DIMM on page 122 Install a DIMM on page 124 Verify DIMM Function...

Page 140: ... page 165 Caution All memory risers and memory riser filler panels must be installed in order to ensure proper server airflow This is a cold service procedure that can be performed by customers 1 Prepare the system for service See Preparing for Service on page 63 2 Locate the memory riser filler panel you want to remove 3 Lift the filler panel straight up to remove it from the memory module socket...

Page 141: ...l a Memory Riser Filler Panel Caution All memory risers and memory riser filler panels must be installed in order to ensure proper server airflow 1 Align the memory riser filler panel with the empty slot 2 Gently press the memory riser filler panel into the slot 3 Return the server to operation See Returning the Server to Operation on page 193 ...

Page 142: ...the set command to enable the DIMM that was disabled by POST In most cases replacement of a faulty DIMM is detected when the service processor is power cycled In those cases the fault is automatically cleared from the system If show faulty still displays the fault the set command will clear it 4 For a host detected fault perform the following steps to verify the new DIMM a Set the virtual keyswitc...

Page 143: ...point If so go directly to Step e If it remains at the ok prompt go to Step d d If the system remains at the ok prompt type boot e Return the virtual keyswitch to Normal mode f Switch to the system console and type the Oracle Solaris OS fmadm faulty command If any faults are reported refer to the diagnostics instructions described in Oracle ILOM Troubleshooting Overview on page 25 5 Switch to the ...

Page 144: ...M FRU Names on page 114 DIMM Configuration Error Messages on page 132 Oracle Integrated Lights Out Manager ILOM 3 1 Document Collection DIMM Configuration Error Messages This topic includes the following DIMM Configuration Errors System Console on page 133 DIMM Configuration Errors show faulty Command Output on page 135 DIMM Configuration Errors fmadm faulty Output on page 136 show faulty Target P...

Page 145: ...message is displayed In addition to these general memory configuration errors one or more rule specific messages are displayed indicating the type of configuration error detected The following table identifies and explains the various DIMM configuration error messages Note The messages described in this table apply to SPARC T3 2 servers The DIMM configuration requirements for other servers in the S...

Page 146: ... message Not all DIMMs have the same SDRAM capacity All DIMM components must have the same storage capacity all 4 Gbyte or all 8 Gbyte Replace any DIMMs that do not match the intended capacity See DIMM Population Rules on page 112 Not all DIMMs have the same device width All DIMM components must have the same device width Replace any DIMMs that do not match the intended width See DIMM Rank Classif...

Page 147: ...n page 112 DRAM capacity of DIMMs is different across nodes If the CMPs in a server have different DRAM capacities replace some DIMM components until all DRAM capacities are the same See DIMM Population Rules on page 112 Device width of DIMM is different across nodes If the CMPs in a server have DIMMs with different device widths replace some DIMM components until all DIMMs have the same device wi...

Page 148: ...00 faults 0 SP faultmgmt 0 fru_serial_number 00CE011217213B151F faults 0 SP faultmgmt 0 product_serial_number 1140BDY90A faults 0 SP faultmgmt 0 chassis_serial_number 1140BDY90A faults 0 faultmgmtsp fmadm faulty Time UUID msgid Severity FRU SYS MB CMP0 MR0 BOB0 CH0 D0 Part Number 000 0000 Serial Number 00CE011217213B151F Description A FRU has been inserted into a location where it is not supported...

Page 149: ... Information DIMM Configuration Errors System Console on page 133 DIMM Configuration Errors show faulty Command Output on page 135 2012-09-24/17:00:40 2caf416b-4fe0-6509-db02-aa719f5dd543 SPT-8000-PX Major Fault class : fault.component.misconfigured ...

Page 151: ... Install a DVD Drive or Filler Panel on page 141 DVD Drive Overview The SATA DVD drive is mounted in a removable module that is accessed from the system s front panel The DVD module must be removed from the hard drive cage in order to service the hard disk drive backplane Related Information Remove a DVD Drive or Filler Panel on page 140 Install a DVD Drive or Filler Panel on page 141 ...

Page 152: ...ch an antistatic wrist strap b Remove any media from the drive c Power off the server and unplug power cords from the power supplies See Removing Power From the System on page 69 2 Push down on the latch on the top left corner of the DVD drive or filler panel 3 Slide the DVD drive or filler panel out of the server Caution Whenever you remove the DVD drive or filler panel you should replace it with...

Page 153: ...a DVD drive attach an antistatic wrist strap and place the drive on an antistatic mat 2 Slide the DVD drive or filler panel into the front of the chassis until it seats 3 Return the server to operation a Return the server to the normal rack position See Return the Server to the Normal Rack Position on page 197 ...

Page 154: ...b Reinstall the power cords to the power supplies and power on the server See Returning the Server to Operation on page 195 Related Information Remove a DVD Drive or Filler Panel on page 140 ...

Page 155: ...erver System Battery Overview on page 143 Remove the System Battery on page 144 Install the System Battery on page 146 System Battery Overview The system battery maintains system time when the server is powered off and disconnected from AC power If the IPMI logs indicate a battery failure replace the system battery ...

Page 156: ...g this procedure See Cold Service Replaceable by Customer on page 67 for more information about cold service procedures 1 Prepare for servicing a Attach an antistatic wrist strap b Power off the server and unplug power cords from the power supplies See Removing Power From the System on page 69 c Extend the server to maintenance position See Extend the Server to the Maintenance Position on page 72 ...

Page 157: ...Related Information Install the System Battery on page 146 ...

Page 158: ...he server is powered on and connected to the network Otherwise proceed to the next step 4 If the service processor is not configured to use NTP you must reset the ILOM clock using the ILOM CLI or the web interface For instructions see the Oracle Integrated Lights Out Manager ILOM 3 0 Documentation Collection 5 Return the server to operation a Return the server to the normal rack position See Retur...

Page 159: ...ten PCI Express 2 0 slots that accommodate low profile PCIe cards All slots support x8 PCIe cards Two slots are also capable of supporting x16 PCIe cards Slots 4 and 5 x4 electrical interface Slots 0 1 2 7 8 and 9 x8 electrical interface Slots 3 and 6 x8 electrical interface x16 connector To determine the slot in which to install a PCIe card follow these guidelines First consider any cooling consi...
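The slot summary above can be captured as a small lookup table. The Python sketch below is an illustration, not the manual's selection procedure: it models each slot as an (electrical lanes, connector width) pair taken from the list above, uses hypothetical names, and ignores the cooling and other guidelines that the manual says to consider first.

```python
# slot number -> (electrical lanes, connector width), per the slot list above.
# All slots accept x8 cards; slots 3 and 6 also have an x16 connector.
SLOTS = {
    0: (8, 8), 1: (8, 8), 2: (8, 8),
    3: (8, 16),
    4: (4, 8), 5: (4, 8),
    6: (8, 16),
    7: (8, 8), 8: (8, 8), 9: (8, 8),
}


def candidate_slots(card_width, full_speed=False):
    """Slots that physically accept a card of the given lane width.

    With full_speed=True, also require the slot's electrical interface
    to provide at least as many lanes as the card.
    """
    result = []
    for slot, (electrical, connector) in sorted(SLOTS.items()):
        if card_width > connector:
            continue
        if full_speed and card_width > electrical:
            continue
        result.append(slot)
    return result


if __name__ == "__main__":
    print("x16 card fits in:", candidate_slots(16))
    print("x8 card at full width:", candidate_slots(8, full_speed=True))
```

Because slots 3 and 6 are x8 electrically, an x16 card installed there would be limited to eight lanes, which is why the full-speed query returns no slot for a x16 card.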

Page 160: ...Ie Card Filler Panel on page 148 Remove a PCIe Card on page 150 Install a PCIe Card on page 151 Install a PCIe Card Filler Panel on page 155 Remove a PCIe Card Filler Panel Caution This procedure requires that you handle components that are sensitive to static discharge This sensitivity can cause the component to fail To avoid damage follow antistatic practices as described in ESD Measures on page...

Page 161: ...ng the following tasks a Disengage the PCIe card slot crossbar from its locked position and rotate the crossbar to an upright position b Carefully remove the PCIe card filler panel from the card slot Caution Whenever you remove a PCIe card filler panel replace it with another filler panel or a PCIe card otherwise the server might overheat due to improper airflow Related Information Remove a PCIe C...

Page 162: ...dures 1 Prepare for servicing a Attach an antistatic wrist strap b Power off the server and disconnect all power cords from the server power supplies See Removing Power From the System on page 69 c Extend the server to the maintenance position See Extend the Server to the Maintenance Position on page 72 d Remove the top cover See Remove the Server Top Cover on page 76 2 Locate the PCIe card that y...

Page 163: ... Card Filler Panel on page 155 Install a PCIe Card Caution This procedure requires that you handle components that are sensitive to static discharge This sensitivity can cause the component to fail To avoid damage ensure that you follow antistatic practices as described in ESD Measures on page 63 Note If the server has PCIe cards installed that provide bootable devices disable Option ROM on the PC...

Page 164: ...you are not replacing an existing PCIe card and need information about deciding into which slot the card should be installed refer to PCIe Card Configuration Rules on page 147 4 Connect any internal cables to the PCIe card If you are replacing a faulty PCIe card re connect any cables you disconnected when removing the card For SAS HBA PCIe cards see Cable an Internal SAS HBA PCIe Card on page 15...

Page 165: ...cover RAID configurations refer to the LSI MegaRAID SAS Software User s Guide which is available at the following location http://www.lsi.com/support/sun Related Information Remove a PCIe Card Filler Panel on page 148 Remove a PCIe Card on page 150 Install a PCIe Card Filler Panel on page 155 Cable an Internal SAS HBA PCIe Card After installing an optional internal SAS HBA PCIe card into the server ...

Page 166: ...eled DISK4 7 and connect it to the bottom HBA port 4 Continue with the PCIe card installation and return the server to operation See Step 5 in Install a PCIe Card on page 151 Related Information PCIe Card Configuration Rules on page 147 Install a PCIe Card on page 151 The SAS HBA PCIe card documentation ...

Page 167: ...SD Measures on page 63 1 Ensure that the server is powered off and all power cords are disconnected from the server power supplies See Removing Power From the System on page 69 2 Attach an antistatic wrist strap unpack the PCIe card and place it on an antistatic mat 3 Install the PCIe filler panel into the card slot opening and return the PCIe card slot crossbar to its closed and locked position 4...

Page 168: ...he Normal Rack Position on page 197 c Reconnect all power cords to the server power supplies See Connect Power Cords to the Server on page 198 d Power on the server See Power On the Server on page 198 Related Information Remove a PCIe Card Filler Panel on page 148 Remove a PCIe Card on page 150 Install a PCIe Card on page 151 ...

Page 169: ...ures on page 63. This is a cold service procedure that must be performed by qualified service personnel. The system must be completely powered down before performing this procedure. See Cold Service Replaceable by Authorized Service Personnel on page 67 for more information about this category of service procedures. 1. Prepare for servicing: a. Attach an antistatic wrist strap. b. Power off the server and ...

Page 170: ...ront of the server. 5. Remove the fan board by completing the following tasks: a. Loosen the three captive screws connecting the front memory riser guide to the motherboard. b. Remove the two screws on each side of the outside of the chassis that hold the fan board in place, and unplug the fan board and power cables from the motherboard. c. Remove the front memory riser guide by pulling it up and out of the ch...

Page 171: ...move the fan board cable and power cables from the faulty fan board unit and plug them into the fan board on the replacement fan board unit. 3. Reinstall the fan board unit by completing the following tasks: a. Insert the fan board into the chassis, moving it down and toward the front. b. Reposition the front memory riser guide, routing the fan board and power cable through the riser guide ...

Page 172: ...1. 6. Return the server to operation: a. Install the top cover. See Install the Top Cover on page 196. b. Return the server to the normal rack position. See Return the Server to the Normal Rack Position on page 197. c. Reinstall the power cords to the power supplies and power on the server. See Returning the Server to Operation on page 195. Note: The product serial number used for service entitlement and warra...

Page 173: ...show faulty command. 2. Perform one of the following tasks based on your verification results: If the previous steps did not clear the fault, see Detecting and Managing Faults on page 15 for information about the tools and methods you can use to diagnose component faults. If the previous steps indicate that no faults have been detected, then the component has been replaced successfully. No further action...
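The verification decision described above can be sketched in Python, assuming the text printed by the ILOM show faulty command has already been captured to a string. The table layout in the samples, the FRU name /SYS/EXAMPLE, and the helper names has_faults and next_step are illustrative assumptions, not output or tooling taken from this manual; the sketch only relies on fault entries appearing under /SP/faultmgmt/.

```python
# Hypothetical captured output; the column layout and FRU name are assumptions.
FAULTY_SAMPLE = """\
Target              | Property     | Value
--------------------+--------------+---------------
/SP/faultmgmt/0     | fru          | /SYS/EXAMPLE
"""

# Assumed shape of the output when nothing is faulted: header rows only.
CLEAN_SAMPLE = """\
Target              | Property     | Value
--------------------+--------------+---------------
"""

FAULT_PREFIX = "/SP/faultmgmt/"  # ILOM target prefix used for fault entries


def has_faults(show_faulty_output):
    """Return True if any line of the captured output names a fault target."""
    return any(
        line.strip().startswith(FAULT_PREFIX)
        for line in show_faulty_output.splitlines()
    )


def next_step(show_faulty_output):
    """Map the verification result to the follow-up named in the procedure."""
    if has_faults(show_faulty_output):
        return "Fault still reported: see Detecting and Managing Faults."
    return "No faults detected: the component has been replaced successfully."


if __name__ == "__main__":
    print(next_step(FAULTY_SAMPLE))
    print(next_step(CLEAN_SAMPLE))
```

A check like this only summarizes what the service processor reports; the manual's own fault-management procedures remain the reference for diagnosing and clearing faults.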

Page 174: ...162 SPARC T3 2 Server Service Manual December 2013 ...

Page 175: ...pics explain how to remove and install the motherboard: Motherboard Overview on page 164; Remove the Motherboard on page 164; Install the Motherboard on page 167; Reactivate RAID Volumes on page 170; Verify Motherboard Functionality on page 173 ...

Page 176: ...t be compatible. After replacing the motherboard, the host firmware on the motherboard might be incompatible with the service processor firmware on the service processor that you transferred to the new motherboard. In this case, the system firmware must be loaded as described in Install the Motherboard on page 167. Related Information: Understanding Component Replacement Categories on page 66; Remove the...

Page 177: ...ee Removing Power From the System on page 69. c. Remove the server from the rack. See Remove the Server From the Rack on page 75. d. Remove the top cover. See Remove the Server Top Cover on page 76. 2. Remove the System Configuration PROM from the motherboard so you can reinstall it on the new motherboard. 3. Remove all memory risers and filler panels. See Remove a Memory Riser on page 120. 4. Remove the Syste...

Page 178: ...therboard to the power supply backplane. See Remove the Power Supply Backplane on page 189. 8. Remove all PCIe cards from the server. See Remove a PCIe Card on page 150. 9. Position the HDD end of the cables off to the side using the tab on the top of the plastic power supply cover. 10. Remove the motherboard by completing the following tasks: a. Loosen the captive screw in the corner near the fans that sec...

Page 179: ... grasping the handle on the motherboard and sliding it toward the back of the chassis. 5. Reinsert and tighten the four bus bar screws that secure the motherboard to the power supply backplane. See Install the Power Supply Backplane on page 191. Note: Using a No. 2 screwdriver, tighten the bus bar screws until the power supply backplane and the motherboard are securely fastened to the bus bars. 6. Reinstall the ...

Page 180: ...1. Tighten the captive screw in the corner near the fans that secures the motherboard to the chassis. 12. On the replacement motherboard, install the System Configuration PROM that you removed from the old motherboard. 13. Install the top cover. See Install the Top Cover on page 196. 14. Return the server to the normal rack position. See Return the Server to the Normal Rack Position on page 197. 15. Reinstall...

Page 181: ...g in to the service processor through the NET MGT port. c. Download the system firmware. Follow the firmware download instructions in the Oracle ILOM documentation. Note: You can load any supported system firmware version, including the firmware version that had been installed prior to replacing the motherboard. 18. If necessary, reactivate any RAID volumes that existed prior to replacing the motherboard. I...

Page 182: ...ILOM prompt, disable auto-boot so that the system will not boot the OS when the system powers on. 3. Power on the server. See Power On the Server on page 198. 4. At the OpenBoot PROM prompt, use the show-devs command to list the device paths on the server. You can also use the devalias command to locate device paths specific to your server.
-> set /HOST/bootmode script="setenv auto-boot? false"
ok show-devs
/pci@4...

Page 183: ...olumes Where inactive_volume is the name of the RAID volume that you are activating. For example: Note: For more information on configuring hardware RAID on the server, refer to the SPARC T3 Series Servers Administration Guide. 8. Use the unselect-dev command to unselect the SCSI device.
ok select scsi
ok show-volumes
ok show-volumes
Volume 0 Target 389 Type RAID1 (Mirroring) WWID 03b2999bca4dc677 Optimal ...

Page 184: ...00/pci@2/pci@0/pci@e/scsi@0
FCode Version 1.00.54, MPT Version 2.00, Firmware Version 5.00.17.00
Target a
  Unit 0 Removable Read Only device TEAC DV-W28SS-R 1.0C
  SATA device PhyNum 3
Target b
  Unit 0 Disk SEAGATE ST914603SSUN146G 0868 286739329 Blocks, 146 GB
  SASDeviceName 5000c50016f75e4f SASAddress 5000c50016f75e4d PhyNum 1
Target 389 Volume 0
  Unit 0 Disk LSI Logical Volume 3000 583983104 Blocks, 298 ...
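When many targets are listed, a device listing like the one above can be summarized with a short script. The following Python sketch is only an illustration: it assumes the listing has been captured as text with one "Target" heading line per device followed by indented detail lines, and it assumes 512-byte blocks when converting block counts to decimal gigabytes. The function names and the abbreviated sample are hypothetical.

```python
import re

# Abbreviated sample in the assumed layout (one "Target" heading per device).
SAMPLE = """\
Target a
  Unit 0 Removable Read Only device TEAC DV-W28SS-R 1.0C
  SATA device PhyNum 3
Target b
  Unit 0 Disk SEAGATE ST914603SSUN146G 0868 286739329 Blocks
  SASDeviceName 5000c50016f75e4f SASAddress 5000c50016f75e4d PhyNum 1
Target 389 Volume 0
  Unit 0 Disk LSI Logical Volume 3000 583983104 Blocks
"""

TARGET_RE = re.compile(r"^Target\s+(\w+)(?:\s+Volume\s+(\d+))?\s*$")
BLOCKS_RE = re.compile(r"(\d+)\s+Blocks")
BLOCK_SIZE = 512  # assumption: 512-byte blocks


def parse_targets(text):
    """Return one dict per target: its ID, RAID volume number, block count."""
    targets = []
    current = None
    for line in text.splitlines():
        heading = TARGET_RE.match(line.strip())
        if heading:
            current = {
                "target": heading.group(1),
                "volume": heading.group(2),  # None for plain disks
                "blocks": None,
            }
            targets.append(current)
            continue
        blocks = BLOCKS_RE.search(line)
        if blocks and current is not None:
            current["blocks"] = int(blocks.group(1))
    return targets


def size_gb(blocks):
    """Convert a block count to whole decimal gigabytes (truncated)."""
    return blocks * BLOCK_SIZE // 10**9


if __name__ == "__main__":
    for entry in parse_targets(SAMPLE):
        kind = "RAID volume" if entry["volume"] is not None else "device"
        size = "" if entry["blocks"] is None else f", {size_gb(entry['blocks'])} GB"
        print(f"Target {entry['target']}: {kind}{size}")
```

Entries that carry a volume number are the RAID volumes; after a motherboard replacement those are the ones to confirm are active.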

Page 185: ...e show faulty command. 2. Perform one of the following tasks based on your verification results: If the previous steps did not clear the fault, see Detecting and Managing Faults on page 15 for information about the tools and methods you can use to diagnose component faults. If the previous steps indicate that no faults have been detected, then the component has been replaced successfully. No further acti...

Page 187: ...ribe service procedures for the service processor in the server: Service Processor Firmware and Configuration on page 176; Remove the Service Processor on page 176; Install the Service Processor on page 179; Verify Service Processor Functionality on page 181 ...

Page 188: ...M backup utility. Refer to the Oracle ILOM documentation for instructions on backing up and restoring the Oracle ILOM configuration. After replacing the service processor, the new service processor firmware component must be compatible with the existing host firmware component. If the firmware components are incompatible, load new system firmware as described in Install the Service Processor on page 17...

Page 189: ...or note the current version before removing the service processor. Replacing the service processor is a cold service procedure that must be performed by qualified service personnel. The system must be completely powered down before performing this procedure. See Cold Service Replaceable by Authorized Service Personnel on page 67 for more information about this category of service procedures. The amber...

Page 190: ...s you will need to reinstall it on the new motherboard. a. Grasp the service processor by the two grasp points and lift up to disengage the service processor from the connectors on the motherboard. b. Lift the service processor up and away from the motherboard. Related Information: Service Processor Firmware and Configuration on page 176; Install the Service Processor on page 179 ...

Page 191: ...processor tab on the motherboard. b. Press the service processor straight down until it is fully seated in its socket. 2. Return the server to an operational condition: a. Install the top cover. See Install the Top Cover on page 196. b. Return the server to the normal rack position. See Return the Server to the Normal Rack Position on page 197. c. Reinstall the power cords to the power supplies. See Connect Po...

Page 192: ...ation for network configuration instructions. b. Download the system firmware. Follow the firmware download instructions in the Oracle ILOM documentation. Note: You can load any supported system firmware version, including the firmware version that was installed prior to replacing the service processor. c. If you created a backup of your Oracle ILOM configuration, use the ILOM restore utility to restore th...

Page 193: ...ow faulty command to verify that the fault has been cleared. See Managing Faults With ILOM on page 22 for more information on using the show faulty command. 3. Perform one of the following tasks based on your verification results: If the previous steps did not clear the fault, see Detecting and Managing Faults on page 15 for information about the tools and methods you can use to diagnose component faul...

Page 195: ...onnel. The system must be completely powered down before performing this procedure. See Cold Service Replaceable by Authorized Service Personnel on page 67 for more information about this category of service procedures. 1. Prepare for servicing: a. Attach an antistatic wrist strap. b. Power off the server and unplug power cords from the power supplies. See Removing Power From the System on page 69. c. Remove...

Page 196: ...ly air divider by lifting it up and away from the power supplies. 5. Remove the hard disk drive backplane by completing the following tasks: a. Unplug the two SAS cables, power cables, and ribbon cable from the hard disk drive backplane, and push up on the wire tab in the upper corner of the hard disk drive backplane. b. Swing the hard disk drive backplane back and out of the chassis. Related Information: In...

Page 197: ...ook and press the hard disk drive backplane to the front until it snaps into place. 4. Reconnect the power cable, ribbon data cable, and SAS cables at their original locations. Note: The mini SAS plug must be inserted into the upper mini SAS connector on the disk backplane. This short cable connects the DVD to its USB bridge on the motherboard. The longer SAS cable connects drive bays 4 and 5 to a storage d...

Page 198: ...server to the normal rack position. See Return the Server to the Normal Rack Position on page 197. c. Power on the server. See Returning the Server to Operation on page 195. Note: The product serial number used for service entitlement and warranty coverage might need to be reprogrammed on the disk backplane by authorized service personnel with the correct product serial number located on the chassis EZ ...

Page 199: ...on on using the show faulty command. 2. Perform one of the following tasks based on your verification results: If the previous steps did not clear the fault, see Detecting and Managing Faults on page 15 for information about the tools and methods you can use to diagnose component faults. If the previous steps indicate that no faults have been detected, then the component has been replaced successfully. N...

Page 201: ... before servicing the power distribution board. This is a cold service procedure that must be performed by qualified service personnel. The system must be completely powered down before performing this procedure. See Cold Service Replaceable by Authorized Service Personnel on page 67 for more information about this category of service procedures. 1. Prepare for servicing: a. Attach an antistatic wrist st...

Page 202: ...o the motherboard. 6. Remove the power supply backplane by completing the following tasks: a. Remove the screw that holds the power supply backplane cover in place and remove the power supply cover. b. Remove the four bus bar screws securing the motherboard to the power supply backplane and remove the motherboard to gain access to the bus bars. This will also involve disconnecting all cables that connect...

Page 203: ...supply cage and connect the AC cables to the AC connectors on the power supply backplane. Ensure that each AC cable is connected to the appropriate connector: the AC cable on the right must be connected to the AC connector on the right, and the AC cable on the left must be connected to the AC connector on the left. 3. Insert the power supply backplane into position by completing the following tasks: a. E...

Page 204: ...econnect the ribbon cable from the motherboard to the power supply backplane. 5. Reinstall the air divider by sliding it into the chassis. 6. Reinstall the memory risers or filler panels. See Install a Memory Riser on page 121. 7. Push the power supplies all the way back into the chassis. See Install a Power Supply on page 106. 8. Return the server to operation: a. Install the top cover. See Install the Top Co...

Page 205: ...on using the show faulty command. 2. Perform one of the following tasks based on your verification results: If the previous steps did not clear the fault, see Detecting and Managing Faults on page 15 for information about the tools and methods you can use to diagnose component faults. If the previous steps indicate that no faults have been detected, then the component has been replaced successfully. No f...

Page 207: ...lain how to return the server to operation after you have performed service procedures: Install the Top Cover on page 196; Return the Server to the Normal Rack Position on page 197; Connect Power Cords to the Server on page 198; Power On the Server on page 198 ...

Page 208: ...is. Set the cover down so that it is slightly forward of the rear of the server, by about 1 inch (25.4 mm). 2. Slide the top cover toward the rear of the chassis until the rear cover lip engages with the rear of the chassis. 3. To close the top cover, press down on the cover with both hands until both latches engage ...

Page 209: ...e fully extended position by pushing the release tabs on the side of each rail. 2. While pushing on the release tabs, slowly push the server into the rack. Ensure that the cables do not get in the way. 3. Reconnect the cables to the back of the server. If the cable management arm (CMA) is in the way, disconnect the left CMA release and swing the CMA open. 4. Reconnect the CMA. Swing the CMA closed and latch it...

Page 210: ...ervice processor boots. The SP OK/Fault LED is illuminated solid green when the service processor has successfully booted. After the service processor has booted, the Power OK LED on the front panel begins flashing slowly, indicating that the host is in standby power mode. 2. Power on the server using one of the following steps: Press and release the recessed Power button on the server front panel. Log in...

Page 211: ...d the cover installed. Severe damage to server components can occur if the server is operated without adequate cooling mechanisms. Related Information: Access the Service Processor Oracle ILOM on page 24; Power Off the Server Service Processor Command on page 69; Power Off the Server Power Button Standby Mode on page 70. Figure Legend: 1 Power OK LED; 2 Power Button; 3 SP OK/Fault LED ...

Page 213: ... Glossary. B: BMC, baseboard management controller. C: CMA, cable management arm; CMP, chip-level multiprocessor. D: DHCP, Dynamic Host Configuration Protocol; DTE, data terminal equipment. E: ESD, electrostatic discharge ...

Page 214: ...d replaceable unit. H: HBA, host bus adapter. I: ILOM, Oracle Integrated Lights Out Manager; IP, Internet Protocol. N: NET MGT, network management port; NIC, network interface card or controller. O: Oracle Solaris OS, Oracle Solaris Operating System. P: POST, power-on self-test ...

Page 215: ...gable. S: SAS, serial attached SCSI; SER MGT, serial management port; SP, service processor; SSD, solid-state drive; SSH, Secure Shell. U: UI, user interface; UUID, Universal Unique Identifier. W: WWID, world wide identifier, a unique number that identifies a SAS target ...

Page 217: ...el with powering down 67 components disabled automatically by POST 55 displaying using showcomponent command 56 configuring how POST runs 48 PCIe cards 147 D DB 15 video connector location of 3 default Oracle ILOM password 24 diag level parameter 46 diag mode parameter 46 diag trigger parameter 46 diag verbosity parameter 46 diagnostics low level 45 running remotely 22 DIMMs classification labels ...

Page 218: ...H detected checking for 40 filler panels installing DVD drive 141 hard disk drives 90 memory riser 129 PCIe card 155 removing DVD drive 140 hard disk drives 84 memory riser 128 PCIe card 148 fmadm command 42 fmadm faulty command 131 136 fmdump command 40 front panel features location of 2 FRU ID PROMs 23 FRU information displaying 27 G graceful shutdown defined 70 H hard disk drive backplane FRU n...

Page 219: ...sting with POST 49 memory riser filler panels installing 129 removing 128 memory risers FRU names 109 installing 121 location of 6 physical layout 109 population rules 110 removing 120 message buffer checking the 37 message identifier 40 messages POST fault 50 motherboard about 164 FRU name 9 installing 167 location of 6 reactivate RAID volumes 170 removing 164 verifying function of replaced 173 N...

Page 220: ...or filler panel 140 fan board 157 fan modules 96 hard disk drive backplane 183 hard disk drive filler panels 84 hard disk drives 86 memory riser filler panels 128 memory risers 120 motherboard 164 PCIe card filler panels 148 PCIe cards 150 power supplies 104 power supply backplane 189 service processor 176 system battery 144 top cover 76 replaceable component locations 5 RJ 45 serial port location...

Page 221: ...g 37 system status LEDs locations of 3 T top cover installing 196 removing 76 troubleshooting by checking Oracle Solaris OS log files 18 using POST 19 using SunVTS 18 using the show faulty command 18 U Universal Unique Identifier UUID 40 USB ports FRU name 11 location of front 2 rear 3 V verifying function of replaced DIMMs 130 fan board 161 fan modules 99 hard disk drive backplane 187 hard disk d...
