007-5841-001

6: Basic Troubleshooting and Chassis Service

Accessing the Inside of the Chassis

1. Grasp the two handles on either side and pull the unit straight out until it locks (you will hear 
a “click”). 

2. Next, depress the two buttons on the top of the chassis to release the top cover and at the 
same time, push the cover away from you until it stops. You can then lift the top cover from 
the chassis to gain full access to the inside of the server.

Note: Normally you would power down the system before installing or removing internal 
components, but it may be necessary to leave system power on to determine which fan has failed. 

System Fans

Five 8-cm hot-swap fans provide the cooling for the system. The chassis top cover must be 
properly installed and make a good seal so that cooling air circulates properly through the 
chassis and cools the components. 

System Fan Failure

Fan speed is controlled by system temperature via a BIOS setting. If a fan fails, the remaining fans 
will ramp up to full speed and the overheat/fan fail LED on the control panel will flash. The system 
can continue to run with a failed fan, but replace any failed fan as soon as possible with one of the 
same type and model. To determine which fan has failed, remove the top chassis cover while the 
system is still running. After identifying the failed fan, remove power from the system by 
unplugging the server’s power cords (the unit can have more than one).
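
Because the server includes IPMI, fan sensor readings may offer an additional way to narrow down which fan has failed before opening the chassis. The following is a minimal sketch, not a procedure from this guide: it assumes fan readings are available as text in the pipe-separated layout printed by `ipmitool sdr type Fan`, and the sensor names, IDs, and values in SAMPLE are hypothetical and will differ on a real system.

```python
def parse_fan_sdr(text):
    """Parse pipe-separated fan sensor lines into (name, status, reading) tuples.

    Assumed layout per line: name | sensor id | status | entity | reading
    """
    fans = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        name, _sensor_id, status, _entity, reading = fields[:5]
        fans.append((name, status, reading))
    return fans


def failed_fans(text):
    """Return names of fans whose status is not 'ok' or that report 0 RPM."""
    bad = []
    for name, status, reading in parse_fan_sdr(text):
        if status.lower() == "ns":
            continue  # no sensor reading: treated here as an unpopulated header
        rpm = reading.split()[0] if reading else ""
        if status.lower() != "ok" or rpm == "0":
            bad.append(name)
    return bad


# Hypothetical sample data for illustration only.
SAMPLE = """\
FAN1             | 41h | ok  | 29.1 | 5400 RPM
FAN2             | 42h | ok  | 29.2 | 5400 RPM
FAN3             | 43h | cr  | 29.3 | 0 RPM
FAN4             | 44h | ok  | 29.4 | 5500 RPM
FANA             | 45h | ns  | 29.5 | No Reading
"""

if __name__ == "__main__":
    print(failed_fans(SAMPLE))
```

A sensor-based result is only a hint. Confirm the failed fan visually with the top cover removed, as described above, before replacing it.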

Replacing System Fans

This section describes how to remove or install a system fan.

Summary of Contents for Rackable C2110G-RP5

Page 1: ...SGI Rackable C2110G RP5 System User Guide 007 5841 001 ...

Page 2: ...ATTRIBUTIONS SGI and the SGI logo are registered trademarks and Rackable is a trademark of Silicon Graphics International in the United States and or other countries worldwide Intel Intel QuickPath Interconnect QPI and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries Fusion MPT Integrated RAID MegaRAID and LSI Logic are ...

Page 3: ...007 5841 001 iii Record of Revision Version Description 001 May 2012 First release ...

Page 5: ...omputer systems This guide may be useful to installers and system administrators looking for overview information on the server Chapter Descriptions The following topics are covered in this guide Chapter 1 Introduction Provides an overview of the server s components Chapter 2 Server Installation Provides a quick setup checklist to get the server operational Chapter 3 System Interface Describes sev...

Page 6: ...oting and Chassis Service Describes some basic steps required to troubleshoot your system Additional sections in this chapter are intended to guide you through basic component remove and replace procedures Appendix A BIOS Error Codes Provides a brief listing of BIOS error code information Appendix B System Specifications Describes system component environmental specifications and compliance ...

Page 7: ... comprehensive set of online books release notes man pages and other information You can also view man pages by typing man title on a command line SGI systems include a set of Linux man pages formatted in the standard UNIX man page style Important system configuration files and commands are documented on man pages These are found online on the internal system disk or DVD CD and are displayed using...

Page 8: ...pace font denotes literal items such as commands files routines path names signals messages and programming language structures variable The italic typeface denotes variable entries and words or concepts being defined Italic typeface is also used for book titles user input This bold fixed space font denotes literal items that the user enters in interactive sessions Output is shown in nonbold fixed...

Page 9: ...tter of the manual In printed manuals the document number is located at the bottom of each page You can contact SGI in any of the following ways Send e mail to the following address techpubs sgi com Contact your customer service representative and ask that an incident be filed in the SGI incident tracking system Send mail to the following address SGI Technical Publications 46600 Landing Parkway Fr...

Page 11: ...4 IPMI 4 Other Features 4 Server Chassis Features 5 System Power 5 Serial ATA Subsystems 5 Front Control Panel 5 Serverboard and GPU Subsystem 5 GPU Features 6 Cooling System 6 2 Server Installation 9 Unpack the System 9 Prepare for Setup 9 Choose a Setup Location 9 System Warnings and Precautions 10 Server Precautions 11 Rack Mounting Considerations 11 Ambient Operating Temperature 11 Reduced Air...

Page 12: ...er to the System 18 3 System Interface 19 Overview 19 Control Panel Buttons 20 Control Panel LEDs 21 Power Fail LED 21 Overheat Fan Fail UID LED 21 NIC1 22 NIC2 22 HDD 22 Power 23 Drive Carrier LEDs 23 4 System Safety 25 Electrical Safety Precautions 25 Serverboard Battery 26 ESD Precautions 26 Mainboard Replaceable Soldered in Fuses 27 General Safety Precautions 27 5 System and Serverboard Inform...

Page 13: ... 39 Basic Troubleshooting Procedures 39 If the System Does Not Power Up 39 System Powers Up But Will Not Boot 40 No Video After System Power Up 40 Memory Errors 40 Chassis Service Information 41 Static Sensitive Devices 41 Precautions 41 Unpacking 42 Control Panel 42 Drive Bay Installation Removal 42 Accessing the Drive Bays 42 Removing Hard Drives or Carriers from the Chassis 43 The Hard Drive Ba...

Page 14: ...ng System Fans 50 Remove Replace a Fan 51 Mid chassis Fans 53 Front and GPU Fans 54 Install Replace a PCIe Expansion Card 56 Install Replace a Full height PCIe Card 56 Install Replace a Low profile PCIe Card 59 A BIOS Error Codes 61 B System Operating and Regulatory Overview 63 Operating Environment 63 System Input Requirements 63 Power Supply 64 Regulatory Compliance 64 ...

Page 15: ...dition to the serverboard and chassis various hardware components may be included with the system as listed Five 8 cm chassis fans One internal air shroud One passive 1U CPU heatsink and one passive 2U CPU heatsink Riser cards as follows One riser for two PCIe x16 cards left front side internal GPU cards One riser for two PCIe x16 cards right front side internal GPU cards One riser for one PCIe x4...

Page 16: ...D drive Check with your SGI sales or service representative for information on optional external CD DVD drive units Figure 1 1 Rackable C2110G RP5 Server Front and Rear Views PCI expansion slots 2 1 System reset System LEDs Main power Ten disk drive bays VGA port Ethernet ports USB ports IPMI LAN ...

Page 17: ... total number of pins is 84 The 20 data lanes are divided onto four quadrants of 5 lanes each The basic unit of transfer is the 80 bit flit which is transferred in two clock cycles four 20 bit transfers two per clock The 80 bit flit has 8 bits for error detection 8 bits for link layer header and 64 bits for data QPI bandwidths are advertised by computing the transfer of 64 bits 8 bytes of data eve...

Page 18: ...ts A dedicated external IPMI LAN port is also included Onboard Graphics Controller The dual processor serverboard features an integrated Matrox G200 video controller providing a 16MB DDR2 graphics interface through the system VGA connector The Matrox video controller in the 2U server features low power consumption high reliability and superior longevity IPMI IPMI Intelligent Platform Management In...

Page 19: ... to enable the hot swap capability of RAID drives Certain RAID levels require use of optional hardware to support RAIDed hard disk drives in the server Front Control Panel The control panel on the C2110G RP5 server provides you with system monitoring and control LEDs that indicate system power HDD activity network activity system overheat and a system overheat fan fail UID LED A main power button ...

Page 20: ...hances performance and reduces data transfers by keeping larger data sets in local memory attached directly to the GPU Integrates the GPU subsystem with the C2110G RP5 server s monitoring and management capabilities such as IPMI Onboard L1 and L2 caches that accelerate algorithms and sparse matrix multiplication Provides faster context switching concurrent kernel execution and improved thread bloc...

Page 21: ...P1 P0 P0 0 4 0 3 0 2 0 1 1 4 1 3 1 2 1 1 QPI 8G LANE6 SLOT 1 SLOT 5 SLOT 6 E5 2600 Series PCH C602 C604 PCI E X16 G3 DMI2 LANE5 LANE1 2 3 4 SPI SIO W83527 DMI2 BMC WPCM450 PCI E X16 PCI E X16 G3 PCI SATA3 E5 2600 Series 8 SNB CORE DDR III 8 SNB CORE DDR III DMI2 DMI2 2 3 1 1 2 3 QPI 8G 4GB s PCI E X16 VGA PCI E X8 G3 PCI E X16 PCI E X16 SAS 6 9 0 5 0 1 PCI E X8 G3 PCI E X16 G3 PCI E X16 G3 COM Por...

Page 23: ...agnetic fields are generated Place the server rack near a grounded power outlet Refer also to System Warnings and Precautions on page 10 Prepare for Setup The shipping container should include two sets of rail assemblies two rail mounting brackets and the mounting screws that you will use to install the system into a rack Read this section in its entirety before you begin the installation procedur...

Page 24: ...th the full weight of the rack resting on them Failure to do so can result in serious injury or death Warning Attach stabilizers to the rack in single rack installations Failure to do so can result in serious injury or death Couple racks together in multiple rack installations Failure to do so can result in serious injury or death Warning Be sure the rack is stable before extending a component fro...

Page 25: ...splay work place devices under the German government ordinance for work with visual display units Rack Mounting Considerations Use the guidelines provided in the following subsections to properly install the server in a rack Ambient Operating Temperature If installed in a closed or multi unit rack assembly the ambient operating temperature of the rack environment may be greater than the ambient te...

Page 26: ...to Supply Power to the System on page 18 There are a variety of rack units on the market which may mean the assembly procedure will differ slightly You should also refer to the installation instructions that came with the rack unit you are using Note that this system s rail kit is designed to fit a rack between 26 in and 33 5 in deep Separate the Sections of the Rack Rails The chassis package incl...

Page 27: ...Install the System into a Rack 007 5841 001 13 Figure 2 2 Separating the Rack Rails ...

Page 28: ... rail See the placement example in Figure 2 3 3 Hang the hooks of the front of the outer rail onto the slots on the front of the rack Use screws to secure the outer rails to the rack 4 Pull out and adjust both the short and long brackets to the proper distance so that the rail can fit snugly into the rack reference Figure 2 4 on page 15 5 Hang the hooks of rear portion of the outer rail into the s...

Page 29: ...nsions In most cases the inner rails are pre attached to the chassis and do not interfere with normal use of the chassis if you decide not to use a server rack The inner rail extension is attached to the inner rail to mount the chassis in the rack Using the Rail Locking Tabs Both chassis rails have a locking tab which serves two functions 2 3 4 Extending the Rails Quick Release Tab Separating the ...

Page 30: ... server weighs up to 57 lbs 25 9 kg Always use proper lifting techniques when your move the server Always get the assistance of another qualified person when you install the sever in a location above your shoulders Failure to do so may result in serious personal injury or damage to the equipment 1 Extend the outer rails as shown in Figure 2 5 on page 17 2 Align the inner rails of the chassis with ...

Page 31: ...Install the System into a Rack 007 5841 001 17 Figure 2 5 Installing the Server in the Rack ...

Page 32: ...a power strip or power distribution unit PDU within the rack An optionally available uninterruptible power supply UPS can ensure continued operation in case of a failure of the regular power source After all power connections are verified push the power on button on the front of the server when you wish to power on the unit ...

Page 33: ...the drive carriers and power supplies to keep you constantly informed of the overall status of the system See Figure 3 1 for an example of the front control panel These LEDs provide constant information on the system and on the overall health of system components Figure 3 1 System Front Control Panel Indicator Components 2 1 ...

Page 34: ...f the chassis a reset button and a power on off button Use the reset button to reboot the system as shown in Figure 3 2 Figure 3 2 System Reset Button Figure 3 3 shows the main power button which is used to apply or turn off the main system power Turning off system power with this button removes the main power but keeps standby power supplied to the system Figure 3 3 System Power On Button ...

Page 35: ... is operating normally Figure 3 4 Power Fail LED Overheat Fan Fail UID LED When the red overheat fan UID LED flashes shown in Figure 3 5 it indicates a fan failure When on continuously it indicates an overheat condition which may be caused by cables obstructing the airflow in the system or the ambient room temperature being too warm Check the routing of the cables and make sure all fans are presen...

Page 36: ...port see Figure 3 6 Figure 3 6 LAN1 Network Activity NIC1 LED NIC2 When flashing the NIC2 LED indicates network activity on the LAN2 port see Figure 3 7 Figure 3 7 LAN2 Network Activity NIC2 LED HDD The HDD LED indicates hard drive activity when flashing see Figure 3 8 Figure 3 8 Hard Drive Activity LED 1 2 ...

Page 37: ...lowing two paragraphs Green When illuminated the green LED on the drive carrier indicates drive activity A connection to the drive backplane enables this LED to blink on and off when that particular drive is being accessed Please refer to Chapter 6 for instructions on replacing failed drives Red When this LED is flashing it indicates that a raided drive is rebuilding A solidly lit red LED indicate...

Page 39: ...irst power down the operating system and then unplug the power cords The unit can have more than one power supply cord Disconnect two power supply cords before servicing to avoid electrical shock When working around exposed electrical circuits another person who is familiar with the power off controls should be nearby to switch off the power if necessary Use only one hand when working with powered...

Page 40: ...a public landfill Dispose of used batteries according to the manufacturer s instructions and in compliance with the regulations set up by your local hazardous waste management agency ESD Precautions Caution This server contains electronic components and printed circuit boards which are susceptible to electrostatic discharge ESD damage ESD is generated by two objects with different electrical charg...

Page 41: ...omes with self resetting PTC Positive Temperature Coefficient fuses on the serverboard they must be replaced by trained service technicians only The new fuse must be the same or equivalent as the one replaced Contact your technical support organization for details and support General Safety Precautions Follow these rules to ensure general safety Keep the area around the Rackable C2110G RP5 system ...

Page 42: ...nductors that can create short circuits and harm you if they come into contact with printed circuit boards or areas where power is present After accessing the inside of the system close the system back up and secure it to the rack unit with the retention screws after ensuring that all connections have been made ...

Page 43: ... a node board the MAC Ethernet address changes If you are using such a product you or your service representative must request a new license key after replacement of a node board Contact your local customer support office http www sgi com support supportcenters html Caution Install the chassis cover after you have completed accessing the components inside the server to maintain proper airflow and ...

Page 44: ...When handling chips or modules avoid touching the pins Store PCIe cards or other boards and components in antistatic bags when not in use Make sure your computer chassis provides a conductive path between the power supply the case the mounting fasteners and the node board to chassis ground Unpacking Caution System options are shipped in antistatic packaging to avoid electrostatic discharge damage ...

Page 45: ...ails The 2U C2110G RP5 system chassis has one node board The C2110G RP5 board is configured with two processors When configured with two processors the following rules apply Both processor sockets must have identical revisions core voltage and bus core speed The stepping between the processors on the board must be identical See Figure 5 2 on page 34 for CPU locations on the serverboard note that t...

Page 46: ... support representative PCIe Expansion Slots Two PCI Express PCIe slots with the following features One PCIe Gen 3 0 x8 low profile card in x16 slot One PCIe Gen 2 0 x4 standard card in x16 slot System Health Monitoring Onboard voltage monitors Fan status monitor with firmware software on off and speed control Watch Dog Environmental temperature monitoring via BIOS Power up mode control for recove...

Page 47: ...and 10 software RAID are supported using controllers on the system serverboard Note RAID 5 and RAID 6 functionality is supported with the use of optional hardware Two 2 USB Universal Serial Bus 2 0 ports rear external type A Two 2 LAN ports supported by an onboard Intel Ethernet controller for 10 100 1000Base T One 1 dedicated RJ 45 IPMI LAN port One 1 VGA port supported by an onboard Matrox G200 ...

Page 48: ...B1 J29 JPBR1 JPME1 JWP1 JPG1 FAN2 FAN1 FANF FAND FANH FANC FANG FANE FAN4 FAN3 FANA FANB T SGPIO5 T SGPIO1T SGPIO2 I SATA3 I SATA4 I SATA5 JF1 JTPM1 JPW2 USB 0 1 IPMI LAN PCH Slot6 PCI E 2 0 x4 CPU1 Slot1 PCI E 3 0 x8 PCI E 3 0 X16 CPU2 Slot4 PCI E 3 0 X16 CPU2 Slot 3 PCI E 3 0 X16 CPU1Slot1PCI E 3 0 X16 P2 DIMME P2 DIMMF P2 DIMMG P2 DIMMH P1 DIMMD P1 DIMMC P1 DIMMB P1 DIMMA CPU2 BIOS JPW6 PHY I S...

Page 49: ...m Disk Drive Locations Drive Configurations The disk drive configurations supported in the Rackable C2110G RP5 server are outlined in the paragraphs that follow Note that some configurations are dependent on use of optional hardware to support RAID configurations The supported disk drive configurations are as follows JBOD This non RAID disk array supports any number of drives between one and 10 Th...

Page 50: ...k striping parity distribution by creating enough parity data to handle two disk failures You can lose a disk and have an unrecoverable error URE during reconstruction and still reconstruct your system data Calculations for the RAID 6 parity stripes are more complicated than those for RAID 5 virtually doubling the workload for the processor on the RAID controller and exacting a performance penalty...

Page 51: ...ds will fit and function in the internal PCIe GPU slots contact your SGI sales or service representative for information on approved GPU cards Power Supply Functional Rating The C2110G RP5 server default configuration is two rear installed 1800 watt power supplies The second power supply acts as a redundant power unit for the server The supplies are auto ranging and can operate from either 100 140...

Page 53: ...r up when the front power button is pushed use the following checklist to identify common sources for the problem Make sure that both ends of each system power cable are firmly connected to the power supply and the corresponding power source s or power distribution unit PDU Check to see if the power fail LED is lit on the front of the unit This LED should be off if the system is operating normally...

Page 54: ...ed Listen for a BIOS beep code error message one long beep plus 8 short beeps indicates a video error This beep code message could indicate a video memory error or other video malfunction contact your service provider If using an optional PCIe video card check the back of the card for LED activity or a fault indicator Try opening the system reseating the PCIe card and rebooting see the section Ins...

Page 55: ...ng or installing any internal hardware components Tools Required The only tool you will need to install components and perform maintenance is a Phillips screwdriver Static Sensitive Devices Electrostatic discharge ESD can damage electronic components To prevent damage to any printed circuit boards PCBs it is important to handle them very carefully The following measures are generally sufficient to...

Page 56: ... from JF1 on the serverboard to the Control Panel PCB printed circuit board Make sure the red wire plugs into pin 1 on both connectors Pull all excess cabling out of the airflow path The LEDs inform you of system status See Chapter 3 for details on the LEDs and the control panel buttons Details on JF1 can be found in Chapter 5 Drive Bay Installation Removal This section describes hard drive instal...

Page 57: ...erating system you use must have RAID support to enable the hot swap capability of the hard drives The backplane is preconfigured so no jumper switch configuration is required Caution Be careful when working around the drive backplane Do not touch the backplane with your fingers or any metal objects and make sure no ribbon cables touch the backplane or obstruct the holes which aid in proper airflo...

Page 58: ...rive to the hard drive carrier 2 Insert a new replacement hard drive into the carrier with the PCB side facing down and the connector end toward the rear of the carrier 3 Align the hard drive in the disk drive carrier so that the mounting holes of the carrier are aligned with the mounting holes of the drive Note that there are holes in the carrier which are marked SATA to aid in correct installati...

Page 59: ...ier oriented so that the release button is on the bottom When the carrier reaches the rear of the drive bay the handle will retract 6 Using your thumb push against the upper part of the hard drive handle until the assembly clicks into the locked fully seated position see Figure 6 3 on page 47 for an example Note Your operating system must have RAID support to enable the hot plug capability of the ...

Page 60: ...46 007 5841 001 6 Basic Troubleshooting and Chassis Service Figure 6 2 Drive Carrier Attachment to Dummy Drive Blank ...

Page 61: ...on Example Power Supply The system offers a redundant power supply assembly consisting of two 1800 watt power modules Each power supply module has an auto switching capability which enables it to automatically sense and operate at a 100V 240V input voltage at 50 or 60Hz ...

Page 62: ...acing a Power Supply You do not need to shut down the system to replace a failed power supply unit The backup power supply module will keep the system up and running while you replace the failed unit Replace with the same model Removing the Power Supply 1 First unplug the AC power cord from the failed power supply module 2 Depress the locking tab on the power supply module 3 Pull it straight out u...

Page 63: ...Power Supply 007 5841 001 49 Figure 6 4 Power Supply Remove Replace Example ...

Page 65: ... the example in Figure 6 5 on page 51 2 Remove the screws securing the fan housing to the floor of the chassis See the illustrations below to determine the location of the screws for the fan that is being removed Set these screws aside for later use Figure 6 5 Front System Cooling Fan Access Example ...

Page 66: ...n housing up and out of the chassis see Figure 6 6 for the GPU fan extraction example See Figure 6 7 on page 53 for the mid chassis fan extraction location 4 Disconnect the wiring to the fan assembly and lift it away from the server Figure 6 6 Remove Replace GPU Fan Example ...

Page 67: ...drawings as a guide to disassemble the fan housing by removing the front and rear fan guards Mid chassis Fans 1 Replace the failed fan with an identical 8 cm 12 volt fan 2 Disassemble the mid chassis fan housing by removing the front and rear fan guards 3 Reassemble the fan housing around the replacement fan as follows ...

Page 68: ...fan guards to the left and right side clips Figure 6 8 Mid chassis Fan Assembly Example Front and GPU Fans 1 Replace the failed fan with an identical 8 cm 12 volt fan 2 Clip the front and rear fan guards into the left and right side clips 3 Reconnect the fan s power connector before securing it in the chassis with the four screws ...

Page 69: ...System Fans 007 5841 001 55 Figure 6 9 Front GPU Fan Assembly Example ...

Page 70: ... the server 1 Remove the chassis cover and disconnect both the power cables from the server 2 Confirm that you have the correct size and type of PCIe expansion card 3 Remove the full depth PCIe slot cover at the rear of the chassis 4 Undo the three screws holding the PCIe bracket assembly to the system chassis and remove the assembly carefully disconnect the riser card from the serverboard during ...

Page 71: ...ack into the system serverboard slot See Figure 6 12 on page 58 for an example 7 Secure the expansion card bracket to the chassis with the three screws removed in step 4 8 Connect cables to the add on card as necessary replace the system cover and plug in the power cords prior to rebooting the server Figure 6 11 Remove Replace Full height PCIe Card in Bracket ...

Page 72: ...58 007 5841 001 6 Basic Troubleshooting and Chassis Service Figure 6 12 Full height PCIe Bracket Assembly Remove Replace Example ...

Page 73: ...le PCIe slot cover at the rear of the chassis and slide it sideways to remove from the chassis 4 Select the appropriate riser connector for your low profile card Note that the riser card uses a x16 connector 5 Align the PCIe card with the rear slot opening and the riser connector then simultaneously slide the rear bracket into place as you insert the PCIe connector 6 Secure the rear bracket in the...

Page 74: ...60 007 5841 001 6 Basic Troubleshooting and Chassis Service Figure 6 13 Low profile PCIe Card Remove Replace Example ...

Page 75: ...atal error occurs you should consult with your system manufacturer for possible repairs These fatal errors are usually communicated through a series of audible beeps The numbers on the fatal error list see Table A 1 correspond to the number of beeps for the corresponding error Table A 1 BIOS Error Codes Beep Code Error Message Description 1 beep Refresh Circuits have been reset Ready to power up 5...

Page 77: ...o 35º C 32º to 95º F Non operating Temperature 40º to 70º C 40º to 158º F Operating Relative Humidity 8 to 90 non condensing Non operating Relative Humidity 5 to 95 non condensing System Input Requirements AC Input Voltage 180 240 VAC Rated Input Current 1000W 100 120V 12 10A 1200W 120 140V 12 10A 1800W 200 240V 10 8 5A Rated Input Frequency 50 60 Hz Power Supply Rated Output Power 1800W Rated Out...

Page 78: ... 2 3 3 CISPR 22 Class A Electromagnetic Immunity EN 55024 CISPR 24 EN 61000 4 2 EN 61000 4 3 EN 61000 4 4 EN 61000 4 5 EN 61000 4 6 EN 61000 4 8 EN 61000 4 11 Safety CSA EN IEC UL 60950 1 Compliant UL or CSA Listed USA and Canada CE Marking Europe California Best Management Practices Regulations for Perchlorate Materials This Perchlorate warning applies only to products containing CR Manganese Dio...
