Sun Fire X4640 Server Diagnostics Guide

Part No: 821–0472
December 2010, Rev A

Summary of Contents for SUN Fire X4640

Page 2: ... on use and disclosure. Except as provided by your license agreement or by law, you may not copy, reproduce, translate, broadcast, modify, patent, transmit, distribute, exhibit, perform, publish, or display the software, even in part, in any form or by any means. Furthermore, it is prohibited to reverse engineer the software, ...

Page 3: ...the Outside of the Server 12; How to Inspect the Inside of the Server 12; Troubleshooting DIMM Problems 15; DIMM Fault LEDs 15; DIMM Population Rules 17; How to Isolate and Correct DIMM ECC Errors 17; Identifying Correctable DIMM Errors (CEs) 19; Identifying BIOS DIMM Error Messages 21; Using the ILOM to Monitor the Host 23; Viewing the ILOM Sensor Readings 23; Viewing the ILOM System Event Log 26; Clearing th...

Page 4: ... a Snapshot With the ILOM Web Interface 37; How To Create a Snapshot With the ILOM Command Line Interface 39; Resetting the SP 41; How to Reset the ILOM SP Using the Web Interface 41; How to Reset the ILOM SP Using the Command Line Interface 42; Index 43 ...

Page 5: ...and firmware. Standalone software common across multiple types of hardware. This includes the Hardware Management Pack and Hardware Management Connectors. Get Software and Firmware Downloads: Go to http://support.oracle.com. Sign in to My Oracle Support. At the top of the page, click the Patches and Updates tab. In the Patch Search box, click Product or Family (Advanced Search). In the Product field, type a full...

Page 6: ...re your comments, go to http://www.oracle.com/goto/docfeedback. Change History: The following changes have been made to the documentation set. October 2009: initial publication. January 2010: two documents revised. Service Manual: revised DIMM population rules and addressed illustration issues. Product Notes: revised software information and fixed bugs. April 2010: one document revised. Installation Guide: Revised...

Page 7: ...Troubleshoot system problems: "Troubleshooting the Server" on page 11. Troubleshoot DIMM problems: "Troubleshooting DIMM Problems" on page 15. Use ILOM to monitor the host: "Using the ILOM to Monitor the Host" on page 23. Use SunVTS to diagnose server problems: "Using SunVTS Diagnostics Software" on page 33. Create a data collector snapshot: "Creating a Data Collector Snapshot" on page 37. Reset the service processor...

Page 9: ...age 11. 2. Investigate any power-on problems: "How to Troubleshoot Power Problems" on page 11. 3. Perform external and internal visual inspection: "How to Inspect the Outside of the Server" on page 12 and "How to Inspect the Inside of the Server" on page 12. 4. Troubleshoot DIMM problems: "Troubleshooting DIMM Problems" on page 15. 5. View BIOS event logs and POST messages: Sun Fire X4640 Server Service...

Page 10: ...ce Processor ILOM. You can use the Integrated Lights Out Manager (ILOM) to diagnose system problems by doing the following: view component information to determine component status; view the ILOM system event log. For more information on using the ILOM to diagnose system issues, see "Using the ILOM to Monitor the Host" on page 23. SunVTS Diagnostics: SunVTS is the Sun Validation Test Suite, which provides a...

Page 11: ...ntly installed or moved. How long the server exhibited symptoms. The duration or frequency of the problem. Document the server settings before you make any changes. If possible, make one change at a time in order to isolate potential problems. In this way, you can maintain a controlled environment and reduce the scope of troubleshooting. Take note of the results of any change you make. Include any errors o...

Page 12: ...e of the Server. Prepare the server for service. See "Preparing the Server for Service and Operation" in the Sun Fire X4640 Server Service Manual. Choose a method for shutting down the server from main power mode to standby power mode. Graceful shutdown: Use a ballpoint pen or other nonconducting stylus to press and release the Power button on the front panel. This causes Advanced Configuration and Power Inte...

Page 13: ...e X4640 Server Service Manual. Inspect the internal status indicator LEDs, which can indicate component malfunction. Note: The server must be in standby power mode to view the internal LEDs. For the LED locations and descriptions of their behavior, see "Troubleshooting DIMM Problems" on page 15. Note: You can hold down the Locate button on the server back panel or front panel for 5 seconds to initiate a pus...

Page 14: ...r. To restore main power mode to the server (all components powered on), use a ballpoint pen or other nonconducting stylus to press and release the Power button on the server front panel. When main power is applied to the full server, the Power OK LED next to the Power button blinks intermittently until BIOS POST finishes. If the problem with the server is not evident, you can try viewing the power-on self...

Page 15: ...n Rules" on page 17; "How to Isolate and Correct DIMM ECC Errors" on page 17; "Identifying Correctable DIMM Errors (CEs)" on page 19; "Identifying BIOS DIMM Error Messages" on page 21. DIMM Fault LEDs: In the Sun Fire X4640 servers, eight DIMM slots are on each removable CPU module. The DIMM fault LEDs in the DIMM slot ejector levers indicate which DIMM pair has failed. These DIMM fault LEDs can be lit for up to o...

Page 16: ...ector. The CPU fault LED indicates which CPU module contains the faulty DIMM. To light the fault LED from the capacitor, push the small button on the CPU module labelled FAULT REMIND BUTTON. The DIMM ejector levers contain LEDs that can indicate a faulty DIMM. ...

Page 17: ...any unpainted metal surface. The system's printed circuit boards and hard disk drives contain components that are extremely sensitive to static electricity. If you have not already done so, shut down your server to standby power mode and remove the cover. Refer to the Sun Fire X4640 Server Service Manual. Inspect the CPU fault LEDs for each CPU module. The CPU fault LED will be lit on the CPU module tha...

Page 18: ...ts 1 and 0. b. Reinstall the DIMM from slot 1 into slot 0. c. Reinstall the DIMM from slot 0 into slot 1. Reinstall the CPU module that has the DIMM problem. Refer to the Sun Fire X4640 Server Service Manual. Reconnect AC power cords to the server. Power on the server and run the diagnostics test again. Review the log file. If the error now appears in CPU0, slot 0 (the opposite of the original error in slot 1)...

Page 19: ...installed, the problem is with the DIMMs. Return both DIMMs (the pair) to the Support Center for replacement. If the error remains with the original CPU, there is a problem with that CPU module. Identifying Correctable DIMM Errors (CEs): CEs rarely occur; therefore, during a short POST, the BIOS might not be able to catch a CE to log it in the server's IPMI SEL (system event logs). Memory Correctable Errors are u...
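
The decision described on page 19 can be sketched as a small function. This is a minimal illustration only, assuming the truncated sentence refers to moving the suspect DIMM pair to a different CPU module and rerunning diagnostics; the function name, argument names, and return labels are hypothetical and do not come from the guide.

```python
def isolate_fault(original_cpu: str, moved_to_cpu: str, error_cpu: str) -> str:
    """Classify the result of moving a suspect DIMM pair to another CPU module.

    original_cpu: module that first reported the error (e.g. "CPU0")
    moved_to_cpu: module that now holds the suspect DIMM pair
    error_cpu:    module that reports the error when diagnostics are rerun
    All names and labels here are illustrative assumptions.
    """
    if original_cpu == moved_to_cpu:
        raise ValueError("the DIMM pair must be moved to a different CPU module")
    if error_cpu == moved_to_cpu:
        # The error followed the DIMMs: the pair is suspect.
        return "dimm_pair"
    if error_cpu == original_cpu:
        # The error stayed behind: the CPU module is suspect.
        return "cpu_module"
    # Any other outcome is outside what the guide describes.
    return "inconclusive"


if __name__ == "__main__":
    print(isolate_fault("CPU0", "CPU1", "CPU1"))
    print(isolate_fault("CPU0", "CPU1", "CPU0"))
```

The sketch deliberately returns "inconclusive" for any result the guide does not cover rather than guessing at a cause.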

Page 20: ... system events as follows: a. A Machine Check error message pops up on the task bar. b. Manually go into the Event Viewer's System Events to view errors. Access the Event Viewer through this menu path: Start > Administration Tools > Event Viewer > System events list. c. View individual errors by right-clicking on the event and selecting Properties to see details of the error. d. Save the complete logs through this m...

Page 21: ...x03: NODE n DIMMs Manufacturer Mismatch. The DIMM manufacturer is not supported or recognized. Memory 0x04: NODE n single DIMM slot is left unpopulated. The DIMM slot z of processor y is left unpopulated while its pairing slot has a DIMM installed. In addition, the following error message is displayed to the screen only (not in the SEL): NODE n Memory Configuration Mismatch. The following conditions cause t...

Page 23: ...n Sun ILOM 3.0 Supplement for the Sun Fire X4640 Server for more information about the sensors. This section contains the following procedures: "How to Use the ILOM Web Interface to View the Sensor Readings" on page 23; "How to Use the ILOM Command Line Interface to View the Sensor Readings" on page 25. How to Use the ILOM Web Interface to View the Sensor Readings: To view sensor readings, you need the Read O...

Page 24: ...s appear. Note: If the server is powered off, many components will have no readings. In the Sensor Readings page, do the following: a. Locate the name of the sensor you want to view. ...

Page 25: ...tion, continue with "Using SunVTS Diagnostics Software" on page 33. How to Use the ILOM Command Line Interface to View the Sensor Readings: To view sensor readings, you need the Read Only (o) role enabled. Log in to the ILOM CLI. Type the following commands to navigate to the sensor target and then to view the sensor properties: cd target, then show. For example, on some server platforms, you can specify the following...

Page 26: ... access them, see Sun ILOM 3.0 Supplement for the Sun Fire X4640 Server. Viewing the ILOM System Event Log: This section contains the following procedures: "How to View the System Event Log Using the ILOM Web Interface" on page 26; "How to View the System Event Log With the ILOM Command Line Interface" on page 28. How to View the System Event Log Using the ILOM Web Interface: Events are notifications that occu...

Page 27: ...Default password: changeme. From the System Monitoring tab, select Event Logs. The System Event Logs page appears. View the Event Log page in one of the following ways. Page through entries: Use the page navigation controls at the top and bottom of the table to navigate forward and back through the available data in the table. ...

Page 28: ...eters button pushed. Severity: Debug, Down, Critical, Major, or Minor. Date/Time: The day and time the event occurred. If the Network Time Protocol (NTP) server is enabled to set the ILOM time, the ILOM clock uses Universal Coordinated Time (UTC). Description: A description of the event. Note: The ILOM event log accumulates many types of events, including copies of IPMI entries. Clearing the ILOM event log clears all...

Page 29: ...18:19:27 2009 Audit Log minor 66251 Open Session object session type value www error; 96872 Fri Aug 7 18:14:47 2009 Audit Log minor root Close Session object session type value www success; 96871 Fri Aug 7 17:07:39 2009 Audit Log minor root Open Session object session type value shell success; 96870 Fri Aug 7 16:52:03 2009 Audit Log minor root Open Session object session type value www success; 96869 ...
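
Entries like the ones on page 29 can be split into fields for filtering or sorting. The following is a minimal sketch, assuming each entry has the layout "ID, weekday, month, day, HH:MM:SS, year, class, type, severity, details" with colons in the time field; the regular expression, the function name, and the field names are illustrative assumptions, not an ILOM-defined format.

```python
import re
from datetime import datetime

# Month abbreviations are mapped by hand so parsing does not depend on locale.
MONTHS = {m: i + 1 for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split())}

# Assumed entry layout (hypothetical):
# <id> <weekday> <month> <day> <HH:MM:SS> <year> <class> <type> <severity> <details>
ENTRY_RE = re.compile(
    r"(?P<id>\d+)\s+\w{3}\s+(?P<mon>\w{3})\s+(?P<day>\d{1,2})\s+"
    r"(?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2})\s+(?P<year>\d{4})\s+"
    r"(?P<cls>\S+)\s+(?P<type>\S+)\s+(?P<severity>\S+)\s*(?P<details>.*)"
)


def parse_entry(line):
    """Return a dict of fields for one event log entry, or None if it does not match."""
    match = ENTRY_RE.match(line.strip())
    if match is None or match.group("mon") not in MONTHS:
        return None
    return {
        "id": int(match.group("id")),
        "time": datetime(int(match.group("year")), MONTHS[match.group("mon")],
                         int(match.group("day")), int(match.group("h")),
                         int(match.group("m")), int(match.group("s"))),
        "class": match.group("cls"),
        "type": match.group("type"),
        "severity": match.group("severity"),
        "details": match.group("details"),
    }


if __name__ == "__main__":
    sample = ("96872 Fri Aug 7 18:14:47 2009 Audit Log minor "
              "root Close Session object session type value www success")
    print(parse_entry(sample))
```

Lines that do not fit the assumed layout return None, so a caller can skip them instead of failing on unexpected output.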

Page 30: ...is enabled to set the ILOM time, the ILOM clock uses Universal Coordinated Time (UTC). Description: A description of the event. To dismiss the event log (stop displaying the log), press the q key. Clearing the Faults from the System Event Log: This section contains the following procedures: "How to Clear Faults From the System Event Log Using the ILOM Web Interface" on page 30; How to Clear Faults From the Syste...

Page 31: ...mps in the event log are related to the service processor clock settings. If the clock settings change, the change is reflected in the time stamps. When the service processor reboots, the SP clock is set to Thu Jan 1 00:00:00 UTC 1970. The SP reboots as a result of the following: a complete system unplug/replug power cycle; an IPMI command, for example, mc reset cold; a command-line interface (CLI) command, fo...
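
Because the SP clock restarts at Thu Jan 1 00:00:00 UTC 1970 after a reboot, events logged before the clock is set again carry time stamps near that date. The sketch below shows one way to flag them. The cutoff date and function names are hypothetical choices for illustration, not values defined by ILOM.

```python
from datetime import datetime, timezone

# Value the guide gives for the SP clock after a reboot.
SP_RESET_TIME = datetime(1970, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

# Hypothetical cutoff: any stamp earlier than this is treated as "clock not yet set".
PLAUSIBLE_AFTER = datetime(2000, 1, 1, tzinfo=timezone.utc)


def logged_before_clock_set(stamp: datetime) -> bool:
    """True if the time stamp looks like it was taken from a freshly reset SP clock."""
    return stamp < PLAUSIBLE_AFTER


def seconds_since_sp_reset(stamp: datetime) -> float:
    """Seconds between the reset value of the SP clock and the given stamp."""
    return (stamp - SP_RESET_TIME).total_seconds()


if __name__ == "__main__":
    early = datetime(1970, 1, 1, 0, 5, 0, tzinfo=timezone.utc)
    print(logged_before_clock_set(early), seconds_since_sp_reset(early))
```

For a flagged entry, the second function gives a rough idea of how long after the SP reboot the event occurred.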

Page 32: ...system sets the host's RTC. The BIOS does not consider time zones. Solaris and Linux software respect time zones and set the system clock to UTC. Therefore, after the OS adjusts the RTC, the time set by the BIOS is UTC. When the user sets the RTC using the host BIOS Setup screen. Continuously through NTP, if NTP is enabled on the SP. NTP jumping is enabled to recover quickly from an erroneous update from t...

Page 33: ...at contains SunVTS software. SunVTS provides a comprehensive diagnostic tool that tests and validates Sun hardware by verifying the connectivity and functionality of most hardware controllers and devices on Sun platforms. SunVTS software can be tailored with modifiable test instances and processor affinity features. The following tests are available in SunVTS: Processor, Memory, Disk, Graphics, Media, Iop...

Page 34: ...tics. With the server powered on, insert the bootable diagnostics CD into the CD/DVD drive. Reboot the server, but press F2 during the start of the reboot so that you can change the BIOS setting for boot-device priority. When the BIOS Main menu appears, navigate to the BIOS Boot menu. Instructions for navigating within the BIOS screens are printed on the BIOS screens. On the BIOS Boot menu screen, select B...

Page 35: ...ins informative messages that are generated when you start and stop the SunVTS test sessions. The log file path name is /var/sunvts/logs/sunvts.info. This file is not created until a SunVTS test session runs. Solaris system message log: a log of all the general Solaris events logged by syslogd. The path name of this log file is /var/adm/messages. To view a log file: a. Click the Log button. The log file wind...

Page 36: ...edia device. When you use the Bootable Diagnostics CD, the server boots from the CD. Therefore, the test log files are not on the server's hard disk drive, and they will be deleted when you power cycle the server. ...

Page 37: ...tion contains the following procedures: "How To Create a Snapshot With the ILOM Web Interface" on page 37; "How To Create a Snapshot With the ILOM Command Line Interface" on page 39. How To Create a Snapshot With the ILOM Web Interface: Caution: Customers should not run this utility unless requested to do so by Sun Services. To collect SP data using the Service Snapshot utility, you need the Admin (a) role enable...

Page 38: ...ight reset the system. Custom: Allows you to choose one or more of the following data sets: ILOM Data, Hardware Data, Basic OS Data, Diagnostic Data. (Optional) Check the Enabled check box to collect only log files from the data set. (Optional) Check the Enabled check box to encrypt the output file. Select one of the following methods to transfer the output file: Browser, SFTP, FTP. ...

Page 39: ...ot utility, you need the Admin (a) role enabled. Log in to the ILOM CLI. Type the following commands: set /SP/diag/snapshot dataset=data and set /SP/diag/snapshot dump_uri=URI, where data and URI are one of the following. Value: data. Option: normal. Specifies that ILOM, operating system, and hardware information is collected. Option: full. Specifies that all data is collected (full collection). Note: Using this option migh...

Page 40: ... be one of these transfer methods: SFTP or FTP. For example, to store the snapshot information in the directory named data on the host, define the URI as follows: ftp://joe:mypasswd@host_ip_address/data. The directory data is relative to the user's login, so the directory would probably be /home/joe/data. ...
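
The URI layout from the example on page 40 can be assembled programmatically before it is typed at the ILOM CLI. This is a minimal sketch, assuming the layout protocol://user:password@host/directory and the target path /SP/diag/snapshot (punctuation that the extracted text has lost); the function names are hypothetical, and the sketch does not escape special characters in the user name or password.

```python
def build_dump_uri(protocol: str, user: str, password: str,
                   host: str, directory: str) -> str:
    """Build a snapshot destination URI in the form protocol://user:password@host/directory.

    Only the two transfer methods named in the guide for the CLI are accepted.
    """
    if protocol not in ("sftp", "ftp"):
        raise ValueError("protocol must be 'sftp' or 'ftp'")
    # The directory is relative to the user's login directory on the host.
    return f"{protocol}://{user}:{password}@{host}/{directory.lstrip('/')}"


def snapshot_commands(dataset: str, dump_uri: str) -> list:
    """Return the two CLI command strings; the /SP/diag/snapshot path is an assumption."""
    return [
        f"set /SP/diag/snapshot dataset={dataset}",
        f"set /SP/diag/snapshot dump_uri={dump_uri}",
    ]


if __name__ == "__main__":
    uri = build_dump_uri("ftp", "joe", "mypasswd", "host_ip_address", "data")
    for command in snapshot_commands("normal", uri):
        print(command)
```

Note that the resulting command line contains the password in clear text, so it is best generated at the point of use rather than stored.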

Page 41: ... Web Interface" on page 41; "How to Reset the ILOM SP Using the Command Line Interface" on page 42. How to Reset the ILOM SP Using the Web Interface: To reset the SP, you need the Reset and Host Control (r) role enabled. After updating the ILOM/BIOS firmware, you must reset the ILOM SP. Log in to the ILOM SP web interface. Select Maintenance > Reset SP. The Reset service processor page appears. Click the Reset SP b...

Page 42: ...le enabled. After updating the ILOM/BIOS firmware, you must reset the ILOM SP. Log in to the ILOM CLI. Type the following command: reset /SP. The ILOM reboots. The command-line interface is unavailable while the ILOM reboots. ...

Page 43: ...ng 17 19 DIMM fault LEDs 15 DIMM population rules 17 DIMM troubleshooting 15 21 E emergency shutdown 12 externally inspecting the server 12 F fan sensor readings 23 32 finding your product on My Oracle Support support oracle com 5 6 G gathering service visit information 11 graceful shutdown 12 guidelines for troubleshooting 11 I ILOM description 10 sensor readings 23 32 system event log 26 time st...

Page 44: ...ngs 23 32 using the ILOM command line interface 25 26 using the ILOM web interface 23 25 Service Processor ILOM description 10 service visit information gathering 11 shutdown procedure 12 snapshot creating with the ILOM command line interface 39 40 creating with the ILOM web interface 37 39 SP SEL time stamps 31 SunVTS description 10 SunVTS diagnostics software 33 36 documentation 34 introduction ...
