Chapter 2

Sun Fire T2000 Server Diagnostics

FIGURE 2-7

Flowchart of ALOM Variables for POST Configuration

Summary of Contents for Netra T2000

Page 1: ...Sun Microsystems Inc www sun com Submit comments about this document at http www sun com hwdocs feedback Sun Fire T2000 Server Service Manual Part No 819 2548 10 October 2005 Revision A ...

Page 2: ...NVALID Copyright 2005 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, United States. All rights reserved. Sun Microsystems, Inc. has intellectual property rights relating to the technology described in this document. In particular, and without limitation, these intellectual property rights may include one or more of the U.S. patents listed at http www sun com pat...

Page 3: ...Reliability Availability and Serviceability 5 Hot Pluggable and Hot Swappable Components 6 Power Supply Redundancy 6 Fan Redundancy 6 Environmental Monitoring 7 Error Correction and Parity Checking 7 Predictive Self Healing 7 Chassis Identification 9 Additional Service Related Information 10 2 Sun Fire T2000 Server Diagnostics 11 Overview of Sun Fire T2000 Server Diagnostics 12 Using LEDs to Ident...

Page 4: ...onment Command 27 To Run the showfru Command 29 Running POST 31 Controlling How POST Runs 31 To Change POST Parameters 34 Reasons to Run POST 35 Routine Sanity Check of the Hardware 35 Diagnosing the System Hardware 35 To Run POST 35 Using the Solaris Predictive Self Healing Feature 40 To Use the fmdump Command to Identify Faults 41 Collecting Information From Solaris OS Files and Commands 43 To C...

Page 5: ...t Swappable and Hot Pluggable FRUs 55 Devices That Are Hot Swappable and Hot Pluggable 56 Hot Swapping a Fan 56 To Remove a Fan 57 To Replace a Fan 58 Hot Swapping a Power Supply 58 To Remove a Power Supply 58 To Replace a Power Supply 60 Hot Swapping the Rear Blower 61 To Remove the Rear Blower 61 To Replace the Rear Blower 61 Hot Plugging a Hard Drive 63 To Remove a Hard Drive 63 To Replace a Ha...

Page 6: ...emoving and Replacing FRUs 74 To Remove PCI E and PCI X Cards 75 To Replace PCI Cards 77 To Remove DIMMs 77 To Replace DIMMs 79 To Remove the System Controller 82 To Replace the System Controller Board 83 To Remove the Motherboard Assembly 84 To Replace the Motherboard Assembly 88 To Remove the Power Distribution Board 90 To Replace the Power Distribution Board 92 To Remove the LED Board 93 To Rep...

Page 7: ...over and Front Bezel 103 To Replace the Top Cover 104 To Reinstall Server Chassis in the Rack 104 To Return the Server to the Normal Rack Position 105 To Apply Power to the Server 107 5 Adding New Components and Devices 109 Adding Hot Pluggable and Hot Swappable Devices 110 To Add a Hard Drive to the Server 110 To Add a USB Device 111 Adding Components Inside the Chassis 113 To Add DIMMs 113 To Ad...

Page 8: ...viii Sun Fire T2000 Server Service Manual October 2005 ...

Page 9: ... drives and memory to the server This manual is written for technicians service personnel and system administrators who service and repair computer systems The person qualified to use this manual Can open a system chassis identify and replace internal components Understands the Solaris Operating System and the command line interface Has superuser privileges for the system being serviced Understand...

Page 10: ...for monitoring and diagnosing the Sun Fire T2000 server Chapter 3 explains how to remove and replace hot swappable and hot pluggable field replaceable units FRUs Chapter 4 describes how to remove and replace the FRUs that cannot be hot swapped Chapter 5 explains how to add new components such as hard drives memory and PCI cards to the Sun Fire T2000 server Appendix A provides an illustrated breakd...

Page 11: ...erview Overview of the features of this server 819 2543 Sun Fire T2000 Server Getting Started Guide Information about where to find documentation to get your system installed and running quickly 819 2542 Sun Fire T2000 Server Installation Guide Detailed rackmounting cabling power on and configuration information 819 2546 Sun Fire T2000 Server Administration Guide How to perform administrative task...

Page 12: ...files and directories on screen computer output Edit your login file Use ls a to list all files You have mail AaBbCc123 What you type when contrasted with on screen computer output su Password AaBbCc123 Book titles new words or terms words to be emphasized Replace command line variables with real names or values Read Chapter 6 in the User s Guide These are called class options You must be superuse...

Page 13: ...he use of or reliance on any such content goods or services that are available on or through such sites or resources Contacting Sun Technical Support If you have technical questions about this product that are not answered in this document go to http www sun com service contacting Sun Welcomes Your Comments Sun is interested in improving its documentation and welcomes your comments and suggestions...

Page 15: ...un Fire T2000 Server Overview This chapter provides an overview of the features of the Sun Fire T2000 server The following topics are covered Sun Fire T2000 Server Features on page 2 Chassis Identification on page 9 ...

Page 16: ... the Sun Fire T2000 server The UltraSPARC T1 processor is based on chip multithreading CMT technology that is optimized for highly threaded transactional processing The UltraSPARC T1 processor improves throughput while using less power and dissipating less heat than conventional processor designs Depending on the model purchased the processor has four six or eight UltraSPARC cores Each core equate...

Page 17: ...essor components such as L1 cache L2 cache memory access crossbar DDR2 memory controllers and a JBus I O interface have been carefully tuned for optimal performance FIGURE 1 2 Motherboard and UltraSPARC T1 Multicore Processor UltraSPARC T1 multicore processor ...

Page 18: ...rnal hard disk drives 1 4 SFF SAS drives 2 5 inch form factor Other internal peripherals 1 slimline DVD drive USB ports 4 USB 1 1 ports 2 in front and 2 in rear Cooling 3 hot swappable and redundant system fans and 1 blower unit PCI interfaces 3 PCI Express PCI E slots for low profile cards supports 1x 4x and 8x width cards 2 PCI X slots for 64 bit 133 MHz low profile cards Note One PCI X slot is ...

Page 19: ...herefore ALOM firmware and software continue to function when the server operating system goes offline or when the server is powered off ALOM monitors the following Sun Fire T2000 server components CPU temperature conditions Hard drive status Enclosure thermal conditions Fan speed and status Power supply status Voltage levels Faults detected by POST Power On Self Test Solaris Predictive Self Heali...

Page 20: ...lies and the rear blower Using the proper software commands you can install or remove these components while the system is running Hot plug and hot swap technology significantly increases the system s serviceability and availability by providing the ability to replace hard drives fan units rear blower and power supplies without service disruption Power Supply Redundancy The Sun Fire T2000 server f...

Page 21: ...o the system controller SC console and are logged in the ALOM log file Additionally some FRUs such as power supplies provide LEDs that indicate a failure within the FRU Error Correction and Parity Checking The UltraSPARC T1 multicore processor provides parity protection on its internal cache memories including tag parity and data parity on the D cache and I cache The internal 3MB L2 cache has pari...

Page 22: ...rors and automatically and silently diagnoses the underlying problem Once a problem is diagnosed a set of agents automatically responds by logging the event and if necessary takes the faulty component offline By automatically diagnosing problems business critical applications and essential system services can continue uninterrupted in the event of software failures or major hardware component fail...

Page 23: ...3 Sun Fire T2000 Server Front Panel FIGURE 1 4 Sun Fire T2000 Server Rear Panel Hard drives DVD drive Drive 0 Indicators and buttons Drive 2 Drive 1 Drive 3 2 3 USB ports Power Power PCI E slot SC serial mgt SC net mgt Indicators PCI X slots TTYA serial GBE ports 2 0 1 3 0 1 USB ports port PCI E slots port port Slot 0 Slot 2 Slot 0 Slot 1 Slot 1 supply 1 supply 0 ...

Page 24: ...on Release Notes: The Solaris OS Release Notes contain important information about the Solaris OS. The release notes are available online at http://www.sun.com/documentation. SunSolve Online: Provides a collection of support resources. Depending on the level of your service contract, you have access to Sun patches, the Sun System Handbook, the SunSolve knowledge base, the Sun Support Forum, and additional doc...

Page 25: ... service personnel and system administrators who service and repair computer systems The following topics are covered Overview of Sun Fire T2000 Server Diagnostics on page 12 Using LEDs to Identify the State of Devices on page 16 Using ALOM For Diagnosis and Repair Verification on page 22 Running POST on page 31 Using the Solaris Predictive Self Healing Feature on page 40 Collecting Information Fr...

Page 26: ...urately predict component failures and mitigate many serious problems before they occur Log files and console messages Provide the standard Solaris OS log files and investigative commands that can be accessed and displayed on the device of your choice SunVTS An application that exercises the system provides hardware validation and discloses possible faulty components with recommendations for repai...

Page 27: ...d 6 Do the Solaris logs indicate a faulty FRU 7 Does POST report any faulty devices 8 Does SunVTS report any faulty devices 11 Perform recom mended corrective actions If needed contact Sun for Support Numbers in this flow chart correspond to the Action numbers in Table 2 1 3 Enter the message ID into the Sun Knowl edge Article web site for recommended actions 10 Verify the repair 2 Is a fault mess...

Page 28: ...n page 40 4 Analyze the suggested actions In some cases fault related messages are identified with suggested actions If the suggested action recommends replacing a FRU go to Action 9 If the suggested action does not recommend replacing a FRU perform the suggested action Contact Sun for additional support if needed Sun Support information http www sun com service contacting 5 Do any of the fault LE...

Page 29: ...air Various commands and utilities can be used to verify the functionality of the system components Two useful commands are The ALOM showfaults command The ASR showcomponents command If the FRU is blacklisted you can manually remove it from the black list with the enablecomponent command If the fault is cleared and the component is not blacklisted the repair is verified well enough to boot the ser...

Page 30: ...TABLE 2 5 Hard Drive LEDs TABLE 2 3 These LEDs provide a quick visual check of the state of the system Front and Rear Panel LEDs The six front panel LEDs FIGURE 2 2 are located in the upper left corner of the server chassis Three of these LEDs are also provided on the rear panel FIGURE 2 3 FIGURE 2 2 Front Panel LEDs Locator LED button Power OK LED Top Fan LED Rear FRUFault LED Over Temp LED Servi...

Page 31: ...e I am Service Required LED (Amber): If on, indicates that service is required. The ALOM showfaults command provides details about any faults that cause this indicator to be lit. Power OK LED (Green): The LED provides the following indications. Off: The system is unavailable. Either it has no power or ALOM is not running. Steady on: Indicates that the system is powered on and is running in its normal operating ...

Page 32: ...vice Rear FRU FAULT LED Amber Provides the following indications Off Indicates a steady state no service action is required Steady on Indicates a failure of a rear access FRU a power supply or the rear blower Use the FRU LEDs to determine which FRU requires service OverTemp LED Amber Provides the following operational temperature indications Off Indicates a steady state no service action is requir...

Page 33: ...d in the Sun Fire T2000 server chassis FIGURE 2 4 Hard Drive LEDs TABLE 2 3 Hard Drive LEDs LED Color Description OK to Remove Blue On The drive is ready for hot plug removal Off Normal operation Unused Amber Activity Green On Drive is receiving power Solidly lit if drive is idle Flashes while the drive processes a command Off Power is off Activity unused OK to Remove ...

Page 34: ...wer Supply LEDs
TABLE 2-4 Power Supply LEDs
LED | Color | Description
Power OK | Green | On: Normal operation. DC output voltage is within normal limits. Off: Power is off.
Failure | Amber | On: Power supply has detected a failure. Off: Normal operation.
AC OK | Green | On: Normal operation. Input power is within normal limits. Off: No input voltage, or input voltage is below limits.
Figure callouts: Power OK, Failure, AC OK ...

Page 35: ...back of the blower unit and visible from the rear of the server TABLE 2 6 TABLE 2 5 Fan LEDs LED Color Description Fan LEDs Amber On This fan is faulty Off Normal operation Note When a fan fault is detected the front panel Top Fan LED is lit TABLE 2 6 Blower Unit LED LED Color Description Blower Unit LED Amber On The blower unit is faulty Off Normal operation Note When a blower fault is detected t...

Page 36: ...server s standby power Therefore ALOM firmware and software continue to function when the server operating system goes offline or when the server is powered off Note Refer to the Sun Fire T2000 Server Advanced Lights Out Manager ALOM Guide for comprehensive ALOM information Faults detected by ALOM POST and the Solaris Predictive Self healing PSH technology are forwarded to the ALOM for fault handl...

Page 37: ...r certain types of faults without a FRU replacement, or if ALOM was unable to automatically detect the FRU replacement. ALOM does not automatically detect hard drive replacement. Many environmental faults can automatically recover. A temperature that is exceeding a threshold may return to normal limits. An unplugged power supply can be plugged in, and so on. Recovery of environmental faults is automati...

Page 38: ...terminal directly to the serial management port Use the telnet command to connect to ALOM through an Ethernet connection on the network management port Connect an external modem to the network management port and dial in to the modem Note Refer to the Sun Fire T2000 Server Advanced Lights Out Manager ALOM Guide for instructions on configuring and connecting to ALOM Switching Between the System Con...

Page 39: ...ver removefru Indicates if it is OK to perform a hot swap of a power supply reset y Generates a hardware reset on the host server The y option enables you to skip the confirmation question resetsc y Reboots the system controller The y option enables you to skip the confirmation question setkeyswitch normal stby diag locked Sets the virtual keyswitch setlocator on off Turns the Locator LED on the s...

Page 40: ...rboard MB 2 Use the Sun message ID to obtain more information about the fault In a browser go to the Predictive Self Healing Knowledge Article web site http www sun com msg and enter the Sun message ID in the lookup field showkeyswitch Displays the status of the virtual keyswitch showlocator Displays the current state of the Locator LED as either on or off showlogs b lines e lines g lines v Displa...

Page 41: ...ent command The output differs according to your system s model and configuration Example sc showenvironment Environmental Status System Temperatures Temperatures in Celsius Sensor Status Temp LowHard LowSoft LowWarn HighWarn HighSoft HighHard PDB T_AMB OK 23 10 5 0 45 50 55 MB T_AMB OK 26 10 5 0 50 55 60 MB CMP0 T_TCORE OK 44 10 5 0 85 95 100 MB CMP0 T_BCORE OK 45 10 5 0 85 95 100 IOBD IOB TCORE ...

Page 42: ... 0 86 0 93 0 95 MB V_VTTR OK 0 87 0 84 0 86 0 93 0 95 MB V_ 3V3STBY OK 3 33 3 13 3 16 3 53 3 59 MB V_VCORE OK 1 30 1 20 1 24 1 36 1 39 IOBD V_ 1V5 OK 1 48 1 27 1 35 1 65 1 72 IOBD V_ 1V8 OK 1 78 1 53 1 62 1 98 2 07 IOBD V_ 3V3MAIN OK 3 38 2 80 2 97 3 63 3 79 IOBD V_ 3V3STBY OK 3 33 2 80 2 97 3 63 3 79 IOBD V_ 1V OK 1 11 0 93 0 99 1 21 1 26 IOBD V_ 1V2 OK 1 17 1 02 1 08 1 32 1 38 IOBD V_ 5V OK 5 09...

Page 43: ...mation about the FRUs in the server Use this command to see information about an individual FRU or for all the FRUs Note By default the output of the showfru command for all FRUs is very long Current sensors Sensor Status IOBD I_USB0 OK IOBD I_USB1 OK FIOBD I_USB OK Power Supplies Supply Status Underspeed Overtemp Overvolt Undervolt Overcurrent PS0 OK OFF OFF OFF OFF OFF PS1 OK OFF OFF OFF OFF OFF...
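The temperature rows in the showenvironment example list six thresholds per sensor (LowHard, LowSoft, LowWarn, HighWarn, HighSoft, HighHard). As a rough illustration of how a reading relates to those columns, here is a minimal Python sketch. It is not ALOM's own logic: the function name is hypothetical, the inclusive comparisons are an assumption, and the negative low thresholds used in the usage note assume that the extraction dropped minus signs from the printed values.

```python
def classify_temperature(temp, low_hard, low_soft, low_warn,
                         high_warn, high_soft, high_hard):
    """Return the most severe threshold band that a reading has reached.

    Assumption: a reading equal to a threshold counts as having crossed it.
    """
    # Check the most severe band first so it takes priority.
    if temp <= low_hard or temp >= high_hard:
        return "hard"
    if temp <= low_soft or temp >= high_soft:
        return "soft"
    if temp <= low_warn or temp >= high_warn:
        return "warn"
    return "ok"
```

For example, a reading of 26 checked against thresholds of -10, -5, 0, 50, 55, 60 falls inside all limits, so the sketch returns "ok".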

Page 44: ...R Description ASSY Sun Fire T2000 CPU Board ManR Manufacture Location Sriracha Chonburi Thailand ManR Sun Part No 5016843 ManR Sun Serial No NC00OD ManR Vendor Celestica ManR Initial HW Dash Level 06 ManR Initial HW Rev Level 02 ManR Shortname T2000_MB SpecPartNo 885 0483 04 SEGMENT FL Configured_LevelR Configured_LevelR UNIX_Timestamp32 WED OCT 12 18 24 28 2005 Configured_LevelR Sun_Part_No 54108...

Page 45: ...viously disabled Devices can be manually enabled or disabled using ASR commands see Managing Components with Automatic System Recovery ASR Commands on page 44 Controlling How POST Runs The server can be configured for normal extensive or no POST execution You can also control the level of tests that run the amount of POST output that is displayed and which reset events trigger POST by using ALOM v...

Page 46: ...et Only run POST for the first power on This is the default error_reset Runs POST if fatal errors are detected all_reset Runs POST after any reset diag_verbosity none No POST output is displayed min POST output displays functional tests with a banner and pinwheel normal POST output displays all test and informational messages max POST displays all test informational and some debugging messages All...

Page 47: ...Chapter 2 Sun Fire T2000 Server Diagnostics 33 FIGURE 2 7 Flowchart of ALOM Variables for POST Configuration ...

Page 48: ...gnostic preset values
diag_mode | normal | off | service | normal
setkeyswitch (1) | normal | normal | normal | diag
diag_level | min | n/a | max | max
diag_trigger | power on reset, error reset | none | all resets | all resets
diag_verbosity | normal | n/a | max | max
(1) The setkeyswitch parameter, when set to diag, overrides all the other ALOM POST variables.
Description of POST execution: This is the default POST configuration and provides a rea...
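The preset table above can be read as a lookup: diag_mode selects a column, and a keyswitch setting of diag overrides that choice. The Python sketch below is only a model of the table for illustration. The function and dictionary names are hypothetical, and the strings are descriptive labels copied from the table rather than literal values to type at the sc prompt.

```python
# Preset columns transcribed from the table above; None stands for "n/a".
PRESETS = {
    "normal": {"diag_level": "min",
               "diag_trigger": "power on reset, error reset",
               "diag_verbosity": "normal"},
    "off": {"diag_level": None,
            "diag_trigger": "none",
            "diag_verbosity": None},
    "service": {"diag_level": "max",
                "diag_trigger": "all resets",
                "diag_verbosity": "max"},
}

# Column for keyswitch = diag, as listed in the table.
KEYSWITCH_DIAG = {"diag_level": "max",
                  "diag_trigger": "all resets",
                  "diag_verbosity": "max"}


def effective_post_config(diag_mode, keyswitch="normal"):
    """Return the preset values that apply for the given settings."""
    if keyswitch == "diag":
        # Per the table footnote, keyswitch diag overrides the other variables.
        return dict(KEYSWITCH_DIAG)
    if diag_mode not in PRESETS:
        raise ValueError(f"unknown diag_mode: {diag_mode!r}")
    return dict(PRESETS[diag_mode])
```

Calling effective_post_config("off", keyswitch="diag") returns the maximum-coverage column, which mirrors the override described in the footnote.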

Page 49: ...y configured to run POST in minimum mode for all power on or error generated resets This enables the system to initialize quickly and still have hardware checkups to ensure a healthy system Diagnosing the System Hardware You can use POST as an initial diagnostic tool for the system hardware In this case configure POST to run in diagnostic service mode for maximum test coverage and verbose output T...

Page 50: ...key to abort SC Alert SC Request to Power Off Host SC Alert Host system has shut down Powering host on at MON JAN 10 02 52 13 2000 SC Alert SC Request to Power On Host sc console SC Alert Host System has Reset Note some output omitted 0 0 0 0 Copyright 2005 Sun Microsystems Inc All rights reserved SUN PROPRIETARY CONFIDENTIAL Use is subject to license terms 0 0 VBSC selecting POST MAX Testing 0 0 ...

Page 51: ...L2 Scrub Tags 0 0 Test Memory 0 0 Scrub 00000000 00600000 00000001 00000000 on Memory Channel 0 1 2 3 Rank 0 Stack 0 0 0 Scrub 00000001 00000000 00000002 00000000 on Memory Channel 0 1 2 3 Rank 1 Stack 0 3 0 IMMU Functional 7 0 IMMU Functional 7 0 DMMU Functional 0 0 IMMU Functional 0 0 DMMU Functional 0 0 Print Mem Config 0 0 Caches Icache is ON Dcache is ON 0 0 Bank 0 4096MB 00000000 00000000 00...

Page 52: ...tax c the core number s the strand number Warning and informational messages use the following syntax INFO or WARNING message 0 0 IO Bridge Quick Read Only of CSR and ID 0 0 0 0 fire 1 JBUSID 00000080 0f000000 0 0 IO Bridge unit 1 Config MB bridges 0 0 Config port A bus 2 dev 0 func 0 tag IOBD PCI SWITCH0 0 0 Config port A bus 3 dev 1 func 0 tag IOBD GBE0 0 0 INFO Master Abort for probe device IOB...

Page 53: ...mmands to display and control disabled components See Managing Components with Automatic System Recovery ASR Commands on page 44 7 2 7 2 ERROR TEST Data Bitwalk 7 2 H W under test MB CMP0 CH2 R0 D0 S0 MB CMP0 CH2 R0 D0 7 2 Repair Instructions Replace items in order listed by H W under test above 7 2 MSG Pin 149 failed on MB CMP0 CH2 R0 D0 J1601 7 2 END_ERROR 7 2 Decode of Dram Error Log Reg Channe...
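When a long POST log has to be searched for failures, the ERROR ... END_ERROR blocks shown above can be collected with a small script. The Python sketch below assumes a line layout of the form `core:strand>text`, with `ERROR: TEST = ...`, `H/W under test = ...`, and `MSG = ...` fields. That punctuation is a reconstruction, since the excerpt above has lost its separators, so adjust the patterns to match real console output. All names are hypothetical.

```python
import re

# Assumed prefix layout: "<core>:<strand>>" followed by the message text.
PREFIX = re.compile(r"^\s*(\d+):(\d+)\s*>\s*(.*)$")


def parse_post_errors(lines):
    """Collect ERROR ... END_ERROR blocks from POST output lines."""
    errors = []
    current = None
    for raw in lines:
        match = PREFIX.match(raw)
        if not match:
            continue  # not a POST line in the assumed layout
        core, strand = int(match.group(1)), int(match.group(2))
        text = match.group(3).strip()
        if text.startswith("ERROR:"):
            # Start a new error record; the test name follows the "=".
            current = {"core": core, "strand": strand,
                       "test": text.split("=", 1)[-1].strip(),
                       "hw": None, "msg": None}
        elif current is None:
            continue  # ordinary output outside an error block
        elif text.startswith("H/W under test"):
            current["hw"] = text.split("=", 1)[-1].strip()
        elif text.startswith("MSG"):
            current["msg"] = text.split("=", 1)[-1].strip()
        elif text.startswith("END_ERROR"):
            errors.append(current)
            current = None
    return errors
```

Each returned record names the failing test and the hardware under test, which is the information the repair instructions refer to.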

Page 54: ...uishes the problem across any set of systems. When possible, the fault manager daemon initiates steps to self-heal the failed component and take the component offline. The daemon also logs the fault to the syslogd daemon and provides a fault notification with a message ID (MSGID). You can use the message ID to get additional information about the problem from Sun's knowledge article database. The predictive ...

Page 55: ...e fmdump command with v for verbose output In this example a fault is displayed indicating the following details Date and time of the fault Apr 24 06 54 08 2005 Universal Unique Identifier UUID that is unique for every fault lce22523 lc80 6062 e61d f3b39290ae2c Sun message identifier SUNW4V 8000 6H that can be used to obtain additional fault information Faulted FRU FRU hc component MB that in this...

Page 56: ...le levels Automated Response The fault manager will attempt to remove the affected CPU from service Impact System performance may be affected Suggested Action for System Administrator Schedule a repair procedure to replace the affected CPU the identity of which can be determined using fmdump v u EVENT_ID Details The Message ID SUN4U 8000 6H indicates diagnosis has determined that a CPU is faulty T...

Page 57: ...ts of the var adm messages file To Check the Message Buffer 1 Log in as superuser 2 Issue the dmesg command The dmesg command displays the most recent messages generated by the system To View System Message Log Files The error logging daemon syslogd automatically records various system warnings errors and faults in message files These messages can alert you to system problems such as a device that...
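Because syslogd writes warnings, errors, and faults into the message files as plain text, a short filter can help when the file is long. The following Python sketch is one possible approach, not a Sun-provided tool. The function name and the keyword list are assumptions, and real messages may use other wording.

```python
def find_alerts(lines, keywords=("warning", "error", "fault")):
    """Return the lines that mention any keyword, ignoring case."""
    lowered = [word.lower() for word in keywords]
    return [line.rstrip("\n") for line in lines
            if any(word in line.lower() for word in lowered)]


# Example use on a Solaris host (requires read access to the file):
#     with open("/var/adm/messages") as log:
#         for entry in find_alerts(log):
#             print(entry)
```

The filter is deliberately broad. It is meant to narrow down what to read, not to replace reading the surrounding entries.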

Page 58: ... O bus. The database that contains the list of disabled components is called the ASR blacklist (asr db). In most cases, POST and ALOM automatically manage the disabling of faulty components and automatically enable them when the faulty FRU is replaced. In some situations, it is necessary to manually manage the blacklist. Example: A component appears faulty and is automatically disabled. The problem is due to...

Page 59: ...cycle is required after disabling or enabling a component. If the status of a component is changed with power on, there is no effect on the system until the next reset or powercycle. TABLE 2-10 ASR Commands. Command: showcomponent. Description: Displays system components and their current state. (Note: The showcomponent command may not report all blacklisted DIMMs.) Command: enablecomponent asrkey. Description: Removes a component from ...

Page 60: ... MB CMP0 P13 MB CMP0 P14 MB CMP0 P15 MB CMP0 P16 MB CMP0 P17 MB CMP0 P18 MB CMP0 P19 MB CMP0 P20 MB CMP0 P21 MB CMP0 P22 MB CMP0 P23 MB CMP0 P28 MB CMP0 P29 MB CMP0 P30 MB CMP0 P31 MB CMP0 CH0 R0 D0 MB CMP0 CH0 R0 D1 MB CMP0 CH0 R1 D0 MB CMP0 CH0 R1 D1 MB CMP0 CH1 R0 D0 MB CMP0 CH1 R0 D1 MB CMP0 CH1 R1 D0 MB CMP0 CH1 R1 D1 MB CMP0 CH2 R0 D0 MB CMP0 CH2 R0 D1 MB CMP0 CH2 R1 D0 MB CMP0 CH2 R1 D1 MB ...
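The ASR keys in this listing follow a regular pattern of short prefixes with optional index numbers, such as MB, CMP0, CH2, R0, D1. The Python sketch below splits a key into those parts, which can be handy when sorting or grouping showcomponent output. It accepts either spaces (as in the excerpt above) or slashes as separators. The slash form is an assumption about the original notation, and the function name is hypothetical.

```python
import re


def parse_asr_key(key):
    """Split an ASR key into {prefix: index} pairs, in order of appearance."""
    fields = {}
    for part in re.split(r"[/\s]+", key.strip()):
        if not part:
            continue
        match = re.fullmatch(r"([A-Z]+)(\d*)", part)
        if match is None:
            raise ValueError(f"unexpected key segment: {part!r}")
        name, number = match.groups()
        # Segments without a number (such as MB) map to None.
        fields[name] = int(number) if number else None
    return fields
```

For instance, the key used in the disablecomponent example, MB CMP0 CH3 R1 D1, yields CH 3, R 1, and D 1, with MB carrying no index.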

Page 61: ... that the ASR command takes effect. To Run the enablecomponent Command: The enablecomponent command enables a disabled component by removing it from the ASR blacklist. 1. At the sc prompt, enter the enablecomponent command. 2. After receiving confirmation that the enablecomponent command is complete, reset the server so that the ASR command takes effect. sc disablecomponent MB CMP0 CH3 R1 D1 sc SC Aler...

Page 62: ... Checking Whether SunVTS Software Is Installed on page 48 Exercising the System Using SunVTS Software on page 50 Checking Whether SunVTS Software Is Installed This procedure assumes that the Solaris OS is running on the Sun Fire T2000 server and that you have access to the Solaris command line To Check Whether SunVTS Software Is Installed 1 Check for the presence of SunVTS packages using the pkgin...

Page 63: ...erating System DVDs From the Sun Download Center http www sun com oem products vts The SunVTS 6 0 PS3 software and future compatible versions are supported on the Sun Fire T2000 server SunVTS installation instructions are described in the SunVTS User s Guide Package Description SUNWvts SunVTS framework SUNWvtsr SunVTS Framework root SUNWvtsts SunVTS for tests SUNWvtsmn SunVTS man pages ...

Page 64: ...er s Guide SunVTS software can be run in several modes This procedure assumes that you are using the default mode This procedure also assumes that the Sun Fire T2000 server is headless that is it is not equipped with a monitor capable of displaying bit mapped graphics In this case you access the SunVTS GUI by logging in remotely from a machine that has a graphics display Finally this procedure des...

Page 65: ...e If you have installed SunVTS software in a location other than the default opt directory alter the path in the preceding command accordingly where display system is the name of the machine through which you are remotely logged in to the Sun Fire T2000 server The SunVTS GUI is displayed FIGURE 2 8 FIGURE 2 8 SunVTS GUI opt SUNWvts bin sunvts display display system 0 ...

Page 66: ...r test category name Tests are enabled when checked and disabled when not checked TABLE 2 11 lists tests that are especially useful to run on a Sun Fire T2000 server TABLE 2 11 Useful SunVTS Tests to Run on a Sun Fire T2000 Server SunVTS Tests FRUs Exercised by Tests cmttest cputest fputest iutest l1dcachetest dtlbtest and l2sramtest indirectly mptest and systest memory DIMMS CPU motherboard diskt...

Page 67: ...s all status and error messages To view these click the Log button or select Log Files from the Reports menu This opens a log window from which you can choose to view the following logs Information Detailed versions of all the status and error messages that appear in the test messages area Test Error Detailed error messages from individual tests VTS Kernel Error Error messages pertaining to SunVTS...

Page 69: ...ppable and hot pluggable field replaceable units FRUs in the Sun Fire T2000 Server The following topics are covered Devices That Are Hot Swappable and Hot Pluggable on page 56 Hot Swapping a Fan on page 56 Hot Swapping a Power Supply on page 58 Hot Swapping the Rear Blower on page 61 Hot Plugging a Hard Drive on page 63 ...

Page 70: ...In a Sun Fire T2000 server the chassis mounted hard drives can be hot swappable depending on how they are configured Hot Swapping a Fan Three hot swappable fans are located under the fan door Two working fans are required to provide adequate cooling for the Sun Fire T2000 server If a fan fails replace it as soon as possible to ensure system availability The following LEDs are lit when a fan fault ...

Page 71: ... Server to the Maintenance Position on page 69 FIGURE 3 1 Removing a Fan 2 Unpackage the replacement fan and place it near the server 3 Lift the latch on the top of the fan door Removing a Fan on page 57 and lift the fan door open The fan door is spring loaded and you must hold it in the open position 4 Identify the faulty fan A lighted LED on the top of a fan FIGURE 3 1 indicates that the fan is ...

Page 72: ...appable power supplies enable you to remove and replace a power supply without shutting the server down provided that the other power supply is online and working The following LEDs are lit when a power supply fault is detected Front and rear Service Required LEDs Rear FRU Fault LED on the front of the server Amber Failure LED on the faulty power supply If a power supply fails and you do not have ...

Page 73: ... operation For instructions on how to access the sc prompt refer to the Sun Fire T2000 Server Advanced Lights Out Manager ALOM Guide Example Where PSn is the power supply identifier for the power supply you plan to remove either PS0 or PS1 3 Gain access to the rear of the server where the faulty power supply is located sc removefru y PSn Are you sure you want to remove PS0 y n y PS0 is safe to rem...

Page 74: ...supply latch to the right. 7. Pull the power supply out of the chassis. To Replace a Power Supply: 1. Align the replacement power supply with the empty power supply bay. 2. Slide the power supply into the bay until it is fully seated. 3. Reconnect the power cord to the power supply. 4. Close the CMA, inserting the end of the CMA into the rear left rail bracket. 5. Verify that the amber LED on the replaced power sup...

Page 75: ...nit is located. 2. Release the cable management arm tab (FIGURE 3-3) and swing the cable management arm out of the way so you can access the power supply. 3. Unscrew the two thumbscrews (FIGURE 3-4) that secure the rear blower to the chassis. FIGURE 3-4 Removing the Rear Blower. 4. Grasp the thumbscrews and slowly slide the blower out of the chassis, keeping the blower level as you remove it. To Replace the Rear B...

Page 76: ...FIGURE 3 5 Replacing the Blower Unit 3 Tighten the two thumbscrews to secure the blower FN2 to the chassis 4 Verify that the Rear Blower and Service Required LEDs are not lit 5 Close the CMA inserting the end of the CMA into the rear left rail bracket FN2 ...

Page 77: ...owing situations inhibit the ability to perform hot plugging of a drive The hard drive provides the operating system and the operating system is not mirrored on another drive The hard drive cannot be logically isolated from the online operations of the server If your drive falls into these conditions you must shut the system down before you replace the hard drive See To Shut the System Down on pag...

Page 78: ...slot To Replace a Hard Drive 1 Align the replacement drive to the drive slot The hard drive is physically addressed according to the slot in which it is installed See FIGURE 3 6 It is important to install a replacement drive in the same slot as the drive that was removed 2 Slide the drive into the bay until it is fully seated 3 Close the latch to lock the drive in place 4 Perform administrative ta...

Page 79: ...ormation on page 66 Common Procedures for Parts Replacement on page 67 Removing and Replacing FRUs on page 74 Common Procedures for Finishing Up on page 103 For a list of FRUs see Appendix A Field Replaceable Units on page 119 Note Never attempt to run the system with the cover removed The cover must be in place for proper air flow The cover interlock switch immediately shuts the system down when ...

Page 80: ...ke sure that the voltage and frequency of your power source match the voltage and frequency inscribed on the equipment s electrical rating label Follow the electrostatic discharge safety practices as described in this section Safety Symbols The following symbols might appear in this book note their meanings Caution There is a risk of personal injury and equipment damage To avoid personal injury an...

Page 81: ...r removing server components attach an antistatic strap to your wrist and then to a metal area on the chassis Do this after you disconnect the power cords from the server Following this practice equalizes the electrical potentials between you and the server Use an Antistatic Mat Place ESD sensitive components such as the motherboard memory and other PCB cards on an antistatic mat Common Procedures...

Page 82: ...t 1 Log in as superuser or equivalent Depending on the nature of the problem you might want to view the system status the log files or run diagnostics before you shut down the system Refer to the Sun Fire T2000 Server Administration Guide for log file information 2 Notify affected users Refer to your Solaris system administration documentation for additional information 3 Save any open files and q...

Page 83: ...nance position Note Removing the server from the rack is recommended for all cold swappable FRU replacement procedures except the DIMMs PCI cards and the system controller 1 Optional Issue the following command from the ALOM sc prompt to locate the system that requires maintenance Once you have located the server press the Locator LED button to turn it off 2 Check to see that no cables will be dam...

Page 84: ... Caution The server weighs approximately 40 lb 18 kg Two people are required to dismount and carry the chassis 1 Disconnect all the cables and power cords from the server 2 Extend the server to the maintenance position as described in To Extend the Server to the Maintenance Position on page 69 3 Press the metal lever FIGURE 4 2 that is located on the inner side of the rail to disconnect the CMA fr...

Page 85: ...ly 40 lb 18 kg The next step requires two people to dismount and carry the chassis 4 From the front of the server pull the release tabs forward and pull the server forward until it is free of the rack rails The release tabs are located on each rail about midway on the server 5 Set the server on a sturdy work surface ...

Page 86: ...d installation Place ESD sensitive components such as the printed circuit boards on an antistatic mat The following items can be used as an antistatic mat Antistatic bag used to wrap a Sun replacement part Sun ESD mat part number 250 1088 Disposable ESD mat shipped with some replacement parts or optional system components 2 Attach an Antistatic Wrist Strap When servicing or removing server compone...

Page 87: ...zel and Top Front Cover The following field replaceable units FRUs require the removal of the top front cover and front bezel Motherboard SAS disk backplane LED board Front I O board Fan power board DVD 1 Remove the top cover as described in the previous procedure 2 Lift the fan cover latch FIGURE 4 3 and open the fan cover 3 Loosen the captive screw near the right most fan that secures the bezel ...

Page 88: ...from the chassis Removing and Replacing FRUs This section provides procedures for replacing the following field replaceable parts FRUs inside the server chassis To Remove PCI E and PCI X Cards on page 75 and To Replace PCI Cards on page 77 To Remove DIMMs on page 77 and To Replace DIMMs on page 79 To Remove the System Controller on page 82 and To Replace the System Controller Board on page 83 To R...

Page 89: ...1 To locate these FRUs refer to Appendix A Field Replaceable Units on page 119 To Remove PCI E and PCI X Cards Use this procedure to remove the optional PCI E and PCI X cards from the server 1 Perform the procedures described in Common Procedures for Parts Replacement on page 67 2 Locate the PCI card that you want to remove To locate the PCI card slots refer to FIGURE 4 5 and FIGURE 4 6 The PCI ca...

Page 90: ... E and PCI X Card Slots 4 Make note of and remove any cables that are attached to the card 5 Rotate the PCI hold down bracket 90 degrees so it no longer covers the PCI card FIGURE 4 7 FIGURE 4 7 PCI Card and Hold down Bracket PCI E slots 0 1 2 PCI X slots 0 1 PCI hold down bracket ...

Page 91: ...the PCI hold down bracket 90 degrees to lock the card in place 6 Perform the procedures described in Common Procedures for Finishing Up on page 103 To Remove DIMMs Caution This procedure requires that you handle components that are sensitive to static discharges that can cause the component to fail To avoid this problem ensure that you follow antistatic practices as described in To Perform Electro...

Page 92: ...mes that are displayed in faults to socket numbers that identify the location of the DIMM on the motherboard TABLE 4 1 DIMM Names and Socket Numbers DIMM Name Used in Messages Socket No DIMM No CH0 R1 D1 J0901 DIMM 1 CH0 R0 D1 J0701 DIMM 2 CH0 R1 D0 J0801 DIMM 3 CH0 R0 D0 J0601 DIMM 4 CH1 R0 D1 J1401 DIMM 5 Front of board ...

Page 93: ... in the connector This ensures that the DIMM is oriented correctly 4 Push the DIMM into the connector until the ejector tabs lock the DIMM in place 5 Perform the procedures described in Common Procedures for Finishing Up on page 103 CH1 R1 D1 J1201 DIMM 6 CH1 R1 D0 J1301 DIMM 7 CH1 R0 D0 J1101 DIMM 8 CH2 R1 D1 J1901 DIMM 16 CH2 R0 D1 J1701 DIMM 15 CH2 R1 D0 J1801 DIMM 14 CH2 R0 D0 J1601 DIMM 13 CH...
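The mapping in TABLE 4 1 from a DIMM name in a fault message to a socket can be sketched as a small lookup. This is only an illustration: the dictionary holds just the twelve rows visible in this excerpt (the table is cut off after the CH2 entries), the function name `locate_dimm` is hypothetical, and the assumption that the name appears inside a longer FRU string such as `MB CMP0 CH0 R0 D0` (with or without slash separators) is taken from the showfaults sample elsewhere in this summary.

```python
import re

# Rows of TABLE 4 1 visible in this excerpt: DIMM name -> (socket, DIMM number).
# The remaining rows are truncated in the source and deliberately left out.
DIMM_SOCKETS = {
    "CH0/R1/D1": ("J0901", 1),
    "CH0/R0/D1": ("J0701", 2),
    "CH0/R1/D0": ("J0801", 3),
    "CH0/R0/D0": ("J0601", 4),
    "CH1/R0/D1": ("J1401", 5),
    "CH1/R1/D1": ("J1201", 6),
    "CH1/R1/D0": ("J1301", 7),
    "CH1/R0/D0": ("J1101", 8),
    "CH2/R1/D1": ("J1901", 16),
    "CH2/R0/D1": ("J1701", 15),
    "CH2/R1/D0": ("J1801", 14),
    "CH2/R0/D0": ("J1601", 13),
}


def locate_dimm(fru_name):
    """Return (socket, DIMM number) for a DIMM name found in a fault message.

    Accepts either space-separated or slash-separated names (an assumption,
    since the separators are not preserved in this excerpt).
    """
    match = re.search(r"CH(\d+)\W+R(\d+)\W+D(\d+)", fru_name)
    if match is None:
        raise ValueError("no CHx Rx Dx name found in %r" % fru_name)
    key = "CH{}/R{}/D{}".format(*match.groups())
    if key not in DIMM_SOCKETS:
        raise KeyError("%s is not among the table rows visible in this excerpt" % key)
    return DIMM_SOCKETS[key]
```

For example, `locate_dimm("MB CMP0 CH0 R0 D0")` returns the socket and DIMM number listed for CH0 R0 D0 in the table. Always confirm the location against the printed table and the board silkscreen before removing a DIMM.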

Page 94: ...lted in the DIMM being disabled such as the following Run the enablecomponent command to enable the FRU 7 Perform the following steps to verify that there are no faults a Set the virtual keyswitch to diag mode so that POST will run in service mode sc showfaults v ID Time FRU Fault 0 SEP 09 11 09 26 MB CMP0 CH0 R0 D0 Host detected fault MSGID SUN4U 8000 2S UUID 7ee0e46b ea64 6565 e684 e996963f7b86 ...

Page 95: ...ed faults or not the system might boot or the system might remain at the ok prompt If the system is at the ok prompt type boot d Issue the Solaris OS fmadm faulty command No memory or DIMM faults should be displayed If faults are reported return to the Diagnostic Flow Chart on page 13 for an approach to diagnosing the fault sc poweron sc console 0 0 POST Passed all devices 0 0 0 0 DEMON Diagnostic...

Page 96: ...oving the System Controller Card 4 Grasp the top corners of the card and pull it out of the socket 5 Place the system controller card on an antistatic mat 6 Remove the system configuration PROM FIGURE 4 10 from the system controller and place it on an antistatic mat The system controller contains the persistent storage for the host ID and Ethernet MAC addresses of the system as well as the ALOM co...

Page 97: ...ly 4 Ensure that the ejector levers are open 5 Holding the bottom edge of the system controller parallel to its socket carefully align the system controller so that each of its contacts is centered on a socket pin Ensure that the system controller is correctly oriented A notch along the bottom of the system controller corresponds to a tab on the socket 6 Push firmly and evenly on both ends of the ...

Page 98: ...onents that are sensitive to static discharges that can cause the component to fail To avoid this problem ensure that you follow antistatic practices as described in To Perform Electrostatic Discharge ESD Prevention Measures on page 72 FIGURE 4 11 Motherboard Assembly 1 Perform the procedures described in Common Procedures for Parts Replacement on page 67 2 Remove all cables from the rear of the s...

Page 99: ...RE 4 12 Disconnect the hard drive data cables and carefully pull them through the interior wall of the chassis The SAS hard drive and the cable marked P8 pass through a cut out in the interior wall of the chassis Before removing the motherboard assembly by lifting it over the interior wall ensure that these cables are out of the way The SAS hard drive cables can readily be folded back over the int...

Page 100: ...he flexible cable in place These screws must be installed at the factory and they must not be removed FIGURE 4 13 Location of the Screws in the Motherboard Assembly 8 Slide the motherboard assembly forward to disengage the connectors at the rear of the motherboard assembly from the cutouts in the rear of the chassis Flexible cable do not remove flex cable screws 1 2 3 4 5 6 7 8 9 10 Bus bar screws...

Page 101: ...s wall and lift it out of the chassis FIGURE 4 14 Caution Do not lift the motherboard assembly over the front fan housing to remove it from the chassis because doing so can damage the assembly FIGURE 4 14 Removing the Motherboard Assembly from the Server Chassis 10 Place the motherboard assembly on an antistatic mat ...

Page 102: ...the component to fail To avoid this problem ensure that you follow antistatic practices as described in To Perform Electrostatic Discharge ESD Prevention Measures on page 72 1 Unpackage the replacement motherboard assembly and place it on an antistatic mat 2 Tilt the motherboard assembly over the interior wall into the chassis FIGURE 4 15 by reversing the procedure you used to remove the assembly ...

Page 103: ...d washers FIGURE 4 16 Do not fully tighten these screws until all screws are loosely installed 7 Tighten the two bus bar screws to secure the bus bar to the motherboard assembly 8 Reinstall the system controller board in the motherboard assembly See To Replace the System Controller Board on page 83 9 Reinstall all DIMMs in the motherboard assembly in the slots from which they were removed See To R...

Page 104: ...Procedures for Finishing Up on page 103 To Remove the Power Distribution Board 1 Perform the procedures described in Common Procedures for Parts Replacement on page 67 Caution The system supplies power to the power distribution board even when the system is powered off To avoid personal injury or damage to the system you must disconnect power cords before servicing the power distribution board 2 D...

Page 105: ... latches on the DVD cable and disconnect it Disconnect the cable marked P7 Disconnect the blower power cable from the power distribution board 4 Remove the two screws that secure the power distribution board to the bus bar FIGURE 4 18 FIGURE 4 18 Location of Bus Bar Screws on the Power Distribution Board and the Motherboard Assembly Bus bar screws Power distribution board mounting screw ...

Page 106: ...ibution Board Caution The system supplies power to the power distribution board even when the system is powered off To avoid personal injury or damage to the system you must disconnect all power cords before servicing the power distribution board 1 Loosely fit the power distribution board onto the locator pins in the chassis and slide the board toward rear of the chassis 2 Secure the power distrib...

Page 107: ...escribed in Common Procedures for Finishing Up on page 103 Note After replacing the power distribution board and powering on the system you must run the setcsn command on the ALOM console to set the electronically readable chassis serial number For details refer to the Sun Fire T2000 Server Lights Out Management ALOM Guide To Remove the LED Board 1 Perform the procedures described in Common Proced...

Page 108: ...ont I O board 5 Remove the LED board from the chassis and place it on an antistatic mat To Replace the LED Board 1 Install the LED board in the chassis 2 Slide the board to the left to connect it to the front I O board 3 Secure the LED board to the chassis using two M3x6 flat head screws FIGURE 4 21 4 Replace all three fans See To Replace a Fan on page 58 5 Perform the procedures described in Comm...

Page 109: ...age 67 2 Remove all three fans See To Remove a Fan on page 57 3 Remove the screw that secures the fan power board to the chassis FIGURE 4 22 4 Slide the fan power board to the right to disengage it from the front I O board 5 Remove the fan power board from the front fan bay and place the board on an antistatic mat FIGURE 4 22 Removing the Fan Power Board ...

Page 110: ...dures described in Common Procedures for Finishing Up on page 103 To Remove the Front I O Board 1 Perform the procedures described in Common Procedures for Parts Replacement on page 67 2 Remove all three fans See To Remove a Fan on page 57 3 Disengage the fan power board from the front I O board see Step 3 and Step 4 in To Remove the Fan Power Board on page 95 4 Remove the fan guard to gain access...

Page 111: ...assis FIGURE 4 24 FIGURE 4 24 Removing the Front I O Board 9 Place the front I O board on an antistatic mat To Replace the Front I O Board 1 Unpackage the front I O board and place it on an antistatic mat 2 Tip the front I O board downwards and slightly forward and push it into place aligning the board with the screw hole in the exterior wall of the chassis When the board is fully seated both conn...

Page 112: ...58 8 Perform the procedures described in Common Procedures for Finishing Up on page 103 To Remove the DVD Drive 1 Perform the procedures described in Common Procedures for Parts Replacement on page 67 2 Release the spring latch that secures the DVD in place in the chassis FIGURE 4 25 FIGURE 4 25 Removing the DVD Drive 3 Push the DVD drive out of the chassis from the rear and remove it from the cha...

Page 113: ...the chassis See To Remove the DVD Drive on page 98 3 Remove all hard drives from the chassis See To Remove a Hard Drive on page 63 Note the slot in which each drive belongs 4 Disconnect the SAS power cable from the power cable plug 5 Make a note of which SAS data cable is plugged into which slot and disconnect the four SAS data cables from the SAS disk backplane 6 Remove the five screws that secur...

Page 114: ... assembly with five M4x8 pan head screws FIGURE 4 27 Do not tighten the screws until all five of them are in place 4 Connect the SAS power cable from the power cable connector 5 Connect the four SAS data cables to the replacement SAS disk backplane making sure that you connect the cables in the same positions on the replacement SAS disk backplane 6 Reinstall all four hard disk drives in the slots ...

Page 115: ...g a small flat head screwdriver carefully pry the battery FIGURE 4 28 from the system controller FIGURE 4 28 Removing the Battery From the System Controller To Replace the Battery on the System Controller 1 Unpackage the replacement battery 2 Press the new battery into the system controller FIGURE 4 29 with the positive side facing upward away from the card FIGURE 4 29 Replacing the Battery in the...

Page 116: ...escribed in Common Procedures for Finishing Up on page 103 5 Use the ALOM setdate command to set the day and time Use the setdate command before you power on the host system For details about this command refer to the Sun Fire T2000 Server Advanced Lights Out Management ALOM Guide ...

Page 117: ...hassis 2 Being careful not to catch the cover on the intrusion switch slide the front top cover forward until it snaps into place FIGURE 4 30 FIGURE 4 30 Replacing the Top Front Cover 3 Position the bezel on the front of the chassis and snap it into place 4 Open the fan door 5 Tighten the captive screw to secure the front bezel to the chassis Intrusion switch ...

Page 118: ...until it latches into place To Reinstall Server Chassis in the Rack If you removed the server chassis from the rack perform these steps Caution The server weighs approximately 40 lb 18 kg Two people are required to carry the chassis and install it in the rack 1 On the rack ensure that the rails are extended 2 Place the ends of the chassis mounting brackets into the slide rails Returning the Server...

Page 119: ...ace The server is now in the extended maintenance position To Return the Server to the Normal Rack Position If you extended the server to the maintenance position use this procedure to return the server to the normal rack position 1 Release the slide rails from the fully extended position by pushing the release levers on the side of each rail FIGURE 4 32 ...

Page 120: ...ush the server into the rack Make sure the cables do not get in the way 3 Reconnect the CMA into the back of the rail assembly Note Refer to the Sun Fire T2000 Server Installation Guide for detailed CMA installation instructions a Insert the smaller extension into the clip located at the end of the mounting bracket FIGURE 4 33 ...

Page 121: ...l extension will click into place 4 Reconnect the cables to the back of the server If the CMA is in the way disconnect the left CMA release and swing the CMA open To Apply Power to the Server Reconnect both power cords to the power supplies Note As soon as the power cords are connected standby power is applied and depending on the configuration of the firmware the system might boot ...

Page 123: ...d Devices This chapter describes how to add new components and devices to the Sun Fire T2000 server The following topics are covered Adding Hot Pluggable and Hot Swappable Devices on page 110 Adding Components Inside the Chassis on page 113 ...

Page 124: ... Server Hard drives are physically addressed according to the slot in which installed 1 Remove the blank panel from the chassis a On the blank panel push the latch release button b Grasp the latch and pull the blank panel out 2 Align the disk drive to the drive bay slot See FIGURE 5 1 For additional details see To Remove a Hard Drive on page 63 3 Slide the hard drive into the bay until it is fully...

Page 125: ...You can connect up to 126 devices to each of the two USB controllers (each controller provides two connectors), for a total of 252 USB devices. The USB ports on the server support USB 1.1 devices. Note: There are many USB devices on the market. Read the product documentation for your USB device for additional installation requirements and instructions that are not covered here. Plug a standard USB device ...

Page 126: ...FIGURE 5 2 Adding a USB Device Front USB ports Rear USB ports ...

Page 127: ... 5 1 to plan the memory configuration of your server. There are 16 slots that hold industry-standard DDR-2 memory DIMMs, providing up to 32 GB of memory. The Sun Fire T2000 server accepts the following DIMM sizes: 512 MB, 1 GB, and 2 GB. The Sun Fire T2000 supports two ranks of eight DIMMs each. At minimum, rank 0 must be fully populated with eight DIMMs of the same capacity. DIMMs can be added eight at a time ...
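The population rules on this page lend themselves to a quick sanity check before ordering or installing memory. The sketch below is a hypothetical helper, not a Sun tool: it encodes only what the excerpt states (three DIMM sizes, two ranks of eight, rank 0 fully populated with one capacity, additions in groups of eight). The requirement that rank 1, when populated, also uses a single capacity is an assumption, because the source text is cut off at that point.

```python
SLOTS_PER_RANK = 8                     # two ranks of eight DIMMs each
ALLOWED_SIZES_MB = (512, 1024, 2048)   # 512 MB, 1 GB, 2 GB


def _check_rank(label, sizes_mb, required):
    """Validate one rank and return its capacity in MB."""
    sizes_mb = list(sizes_mb)
    if not sizes_mb and not required:
        return 0                       # an empty rank 1 is allowed
    if len(sizes_mb) != SLOTS_PER_RANK:
        raise ValueError("%s needs exactly %d DIMMs, got %d"
                         % (label, SLOTS_PER_RANK, len(sizes_mb)))
    if len(set(sizes_mb)) != 1:
        # Stated for rank 0 in the source; assumed here for rank 1 as well.
        raise ValueError("%s must use DIMMs of the same capacity" % label)
    if sizes_mb[0] not in ALLOWED_SIZES_MB:
        raise ValueError("%s uses an unsupported DIMM size: %d MB"
                         % (label, sizes_mb[0]))
    return sum(sizes_mb)


def total_memory_mb(rank0_sizes_mb, rank1_sizes_mb=()):
    """Return total memory in MB for a configuration that passes the checks."""
    return (_check_rank("rank 0", rank0_sizes_mb, required=True)
            + _check_rank("rank 1", rank1_sizes_mb, required=False))
```

With all 16 slots holding 2 GB DIMMs, `total_memory_mb([2048] * 8, [2048] * 8)` gives 32768 MB, which matches the 32 GB maximum quoted above.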

Page 128: ...FIGURE 5 3 Memory DIMM Layout Front of board ...

Page 129: ... the connector until the ejector tabs lock the DIMM in place 6 Repeat Step 3 through Step 5 for each additional DIMM 7 Perform the procedures described in Common Procedures for Parts Replacement on page 67 TABLE 5 1 DIMM Names and Socket Numbers DIMM Name Socket Number Rank 0 DIMMs CH0 R0 D0 J0601 CH0 R0 D1 J0701 CH1 R0 D1 J1401 CH1 R0 D0 J1101 CH2 R0 D1 J1701 CH2 R0 D0 J1601 CH3 R0 D1 J2201 CH3 R...

Page 130: ...ct documentation for your device for additional installation requirements and instructions that are not covered here FIGURE 5 4 Location of PCI E and PCI X Card Slots 1 Perform all of the procedures in Common Procedures for Parts Replacement on page 67 2 Rotate the PCI hold down bracket located on the edge of the chassis 90 degrees so that the chassis edge can accept the card You might need to loo...

Page 131: ...evices 117 5 Rotate the PCI hold down brackets to the closed position and secure the screw on the bracket 6 Install any cables that go to the PCI card 7 Perform the procedures described in Common Procedures for Parts Replacement on page 67 ...

Page 133: ...APPENDIX A Field Replaceable Units FIGURE A 1, FIGURE A 2, and TABLE A 1 list the locations of the field replaceable units (FRUs) in the Sun Fire T2000 server ...

Page 134: ...FIGURE A 1 Field Replaceable Units Part 1 ...

Page 135: ...FIGURE A 2 Field Replaceable Units Part 2 ...

Page 136: ...move the System Controller on page 82 This board implements the system controller subsystem The SC board contains a PowerPC Extended Core and a communications processor that controls the host power and monitors host system events power and environmental It holds a socketed EEPROM for storing the system configuration all Ethernet MAC addresses and the host ID This board only draws power from the 3 ...

Page 137: ... and LEDs that are displayed on the front bezel of the box LEDBD 11 Front I O board To Remove the Front I O Board on page 96 Front I O board FIOBD 12 Fan power board To Remove the Fan Power Board on page 95 Houses the connectors and 3 amber LEDs for the Fan assemblies FANBD 13 Fans To Remove a Fan on page 57 Fans 0 1 and 2 FN0 FN1 FN2 14 SAS disk backplane To Remove the SAS Disk Backplane on page ...

Page 138: ...DD2 HDD3 16 DVD drive To Remove the DVD Drive on page 98 DVD ROM CD drive DVD Not shown PCI E and PCI X cards To Remove PCI E and PCI X Cards on page 75 Optional add on cards PCIE0 PCIE1 PCIE2 PCIX0 PCIX1 1 The FRU name is used in system messages TABLE A 1 Sun Fire T2000 Server FRU List Continued Item No FRU Replacement Instructions Description FRU Name1 ...
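Because the FRU names in TABLE A 1 are the names used in system messages, the table can be read as a lookup from a reported name to the matching removal procedure. The sketch below is a hypothetical illustration that includes only the rows fully visible in this excerpt; rows that are truncated here (for example the hard drive and SAS disk backplane entries) are left out rather than guessed, and the function name `procedure_for` is not from the manual.

```python
# FRU name (as used in system messages) -> (procedure title, manual page).
# Only rows that are complete in this excerpt are listed.
FRU_PROCEDURES = {
    "FIOBD": ("To Remove the Front I/O Board", 96),
    "FANBD": ("To Remove the Fan Power Board", 95),
    "FN0": ("To Remove a Fan", 57),
    "FN1": ("To Remove a Fan", 57),
    "FN2": ("To Remove a Fan", 57),
    "DVD": ("To Remove the DVD Drive", 98),
    "PCIE0": ("To Remove PCI-E and PCI-X Cards", 75),
    "PCIE1": ("To Remove PCI-E and PCI-X Cards", 75),
    "PCIE2": ("To Remove PCI-E and PCI-X Cards", 75),
    "PCIX0": ("To Remove PCI-E and PCI-X Cards", 75),
    "PCIX1": ("To Remove PCI-E and PCI-X Cards", 75),
}


def procedure_for(fru_name):
    """Return (procedure title, page) for a FRU name, or None if the row
    is not among those visible in this excerpt."""
    return FRU_PROCEDURES.get(fru_name.strip().upper())
```

A `None` result means only that the row is missing from this excerpt, not that the FRU has no replacement procedure; consult the full table in that case.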
