If you encounter this situation, verify the following items:

• That a satask_result.html file is in the root directory on the USB key. If the file does not exist, then the following problems are possible:
  – The USB key is not formatted with the correct file system type. Use any USB key that is formatted with a FAT32, EXT2, or EXT3 file system on its first partition; NTFS, for example, is not a supported type. Reformat the key or use a different key.
  – The USB port is not working. Try the key in the other USB port.
  – The node is not operational. Check the node status using the LEDs. See “Procedure: Understanding the system status using the LEDs” on page 198.

• If there is a satask_result.html file, check the first entry in the file. If there is no entry that matches the time the USB key was used, it is possible that the USB port is not working or the node is not operational. Check the node status using the LEDs. See “Procedure: Understanding the system status using the LEDs” on page 198.

• If there is a status output for the time the USB key was used, then the satask.txt file was not found. Check that the file was named correctly. The satask.txt file is automatically deleted after it has been processed.
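The checks above can be approximated on a workstation before you return to the node. The following Python sketch is illustrative only and is not part of the product: the function name diagnose_usb_key and the hint keys are hypothetical, and the rule that a leftover satask.txt file means the command was not processed is inferred from the statement that the file is deleted after processing. The sketch cannot verify the file system type, the USB port, or the node state.

```python
from pathlib import Path

# Hypothetical hint keys; the wording of each hint follows the checklist above.
HINTS = {
    "result-file-missing": (
        "satask_result.html is not in the root directory: check that the first "
        "partition is FAT32, EXT2, or EXT3, try the other USB port, and check "
        "the node status LEDs."
    ),
    "command-file-not-processed": (
        "satask.txt is still on the key. The file is deleted after it has been "
        "processed, so it has probably not been processed yet (inferred)."
    ),
    "check-first-entry": (
        "Open satask_result.html and compare the first entry with the time the "
        "key was used; also confirm the command file was named satask.txt."
    ),
}


def diagnose_usb_key(root: Path) -> str:
    """Return a hint key based on which files exist in the USB key root."""
    if not (root / "satask_result.html").is_file():
        return "result-file-missing"
    if (root / "satask.txt").is_file():
        return "command-file-not-processed"
    return "check-first-entry"


if __name__ == "__main__":
    import sys

    # Pass the mount point of the USB key as the first argument.
    key_root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    print(HINTS[diagnose_usb_key(key_root)])
```

The mount point of the USB key depends on your workstation; pass it as the only argument.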

Procedure: Resetting superuser password

You can reset the superuser password to the default password of passw0rd by using a USB key command action.

You can use this procedure to reset the superuser password if you have forgotten
the password. This command runs differently depending on whether you run it on
a node canister that is active in a clustered system.

Note: If a node canister is not in active state, the superuser password is still required to log on to the service assistant.

It is possible to configure your system so that resetting the superuser password
with the USB key command action is not permitted. If your system is configured
this way, there is no work-around. Contact the person who knows the password.

To use a USB key to reset the superuser password, see “USB key and Initialization
tool interface” on page 176.

If the node canister is active in a clustered system, the password for superuser is
changed on the clustered system. If the node canister is not in active state, the
superuser password for the node canister is changed. If the node canister joins a
clustered system later, the superuser password is reset to that of the clustered
system.
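The scope rule in the preceding paragraph can be modeled in a few lines. This Python sketch is a toy model for illustration only: the class and method names (ClusteredSystem, NodeCanister, reset_superuser_password, join) are hypothetical and do not correspond to any product interface. It shows only which password is changed by a reset and what happens when a node canister later joins a clustered system.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_PASSWORD = "passw0rd"  # default superuser password named in this procedure


@dataclass
class ClusteredSystem:
    superuser_password: str


@dataclass
class NodeCanister:
    superuser_password: str
    # Set only while the node canister is active in a clustered system.
    system: Optional[ClusteredSystem] = None

    def reset_superuser_password(self) -> str:
        """Apply the reset and return which password was changed."""
        if self.system is not None:
            self.system.superuser_password = DEFAULT_PASSWORD
            return "clustered system"
        self.superuser_password = DEFAULT_PASSWORD
        return "node canister"

    def join(self, system: ClusteredSystem) -> None:
        """On joining, the node takes the clustered system's password."""
        self.system = system
        self.superuser_password = system.superuser_password


if __name__ == "__main__":
    node = NodeCanister(superuser_password="forgotten")
    print(node.reset_superuser_password())  # node is not active in a system
    node.join(ClusteredSystem(superuser_password="cluster-secret"))
    print(node.superuser_password)
```

In this model, a reset on a node canister that is not active does not survive a later join, which matches the behavior described above.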

Procedure: Identifying which enclosure or canister to service

Use this procedure to identify which enclosure or canister must be serviced.

Because of the differences between the enclosures, you must be able to distinguish
between the control enclosures and the expansion enclosures when you service the
system. Be aware of the following differences:

• The model type that is shown on the labels. Model types 2076-112, 2076-124, 2076-312, and 2076-324 are control enclosures. Model types 2076-212 and 2076-224 are expansion enclosures.
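If you keep an inventory of enclosure labels, the model-type rule above can be encoded as a lookup. This Python sketch is illustrative: the function name enclosure_kind is hypothetical, and the model lists contain only the model types named in this section, so any other value is reported as unknown.

```python
# Model types taken from the text above; other models are not covered here.
CONTROL_MODELS = {"2076-112", "2076-124", "2076-312", "2076-324"}
EXPANSION_MODELS = {"2076-212", "2076-224"}


def enclosure_kind(model_type: str) -> str:
    """Classify a label value such as '2076-124' as control or expansion."""
    model = model_type.strip()
    if model in CONTROL_MODELS:
        return "control"
    if model in EXPANSION_MODELS:
        return "expansion"
    return "unknown"


if __name__ == "__main__":
    for label in ("2076-124", "2076-212", "2076-999"):
        print(label, enclosure_kind(label))
```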

Storwize V7000 Unified: Problem Determination Guide Version

Summary of Contents for Storwize V7000

Page 1: ...IBM Storwize V7000 Unified Version 1 3 Machine Type 2073 and 2076 Problem Determination Guide GA32 1057 04...

Page 2: ...IBM Environmental Notices and User Guide which is provided on a DVD This edition applies to IBM Storwize V7000 Unified Version 1 3 and to all subsequent releases and modifications until otherwise ind...

Page 3: ...x procedures 49 Chapter 4 File module 51 General file module procedures 51 Rebooting a file module 51 Removing a file module to perform a maintenance action 51 Removing and replacing file module compo...

Page 4: ...Removing and replacing parts 208 Preparing to remove and replace parts 209 Replacing a node canister 209 Replacing an expansion canister 211 Replacing an SFP transceiver 212 Replacing a power supply...

Page 5: ...Commission FCC statement 277 Industry Canada compliance statement 278 Avis de conformit la r glementation d Industrie Canada 278 Australia and New Zealand Class A Statement 278 European Union Electro...

Page 6: ...vi Storwize V7000 Unified Problem Determination Guide Version...

Page 7: ...AID M5000 advanced feature key and M5014 adapter 122 30 ServeRAID M1000 advanced feature key and M1015 adapter 123 31 ServeRAID M5000 advanced feature key and M5014 adapter 124 32 Releasing the batter...

Page 8: ...viii Storwize V7000 Unified Problem Determination Guide Version...

Page 9: ...battery LEDs 46 21 Status of volume 59 22 State of drives 60 23 SMART ASC ASCQ error codes and messages 65 24 Error code information 76 25 Originating role information 76 26 Ethernet role and port re...

Page 10: ...x Storwize V7000 Unified Problem Determination Guide Version...

Page 11: ...ausing moderate or minor personal injury C001 DANGER A danger notice indicates the presence of a hazard that has the potential of causing death or serious personal injury D002 2 Locate IBM Systems Saf...

Page 12: ...r Installation dieses Produkts die Sicherheitshinweise lesen Prima di installare questo prodotto leggere le Informazioni sulla Sicurezza Les sikkerhetsinformasjonen Safety Information f r du installer...

Page 13: ...onfiguration of this product during an electrical storm v Connect all power cords to a properly wired and grounded electrical outlet v Connect to properly wired outlets any equipment that will be atta...

Page 14: ...han 100 C 212 F v Repair or disassemble Dispose of the battery as required by local ordinances or regulations Statement 3 CAUTION When laser products such as CD ROMs DVD drives fiber optic devices or...

Page 15: ...serlaite Appareil A Laser de Classe 1 Statement 4 18 kg 39 7 lb 32 kg 70 5 lb 55 kg 121 2 lb CAUTION Use safe practices when lifting Statement 5 CAUTION The power control button on the device and the...

Page 16: ...h one of these parts contact a service technician Statement 26 CAUTION Do not place any object on top of rack mounted devices This node is suitable for use on an IT power distribution system whose max...

Page 17: ...nd pressure Attention Depending on local conditions the sound pressure can exceed 85 dB A during service operations In such cases wear appropriate hearing protection Safety and environmental notices x...

Page 18: ...xviii Storwize V7000 Unified Problem Determination Guide Version...

Page 19: ...ts menu items Bold monospace Text in bold monospace represents command names Italics Text in italics is used to emphasize a word In command syntax it is used for variables for which you supply actual...

Page 20: ...describes verifying your order becoming familiar with the hardware components and meeting environmental requirements The second chapter describes installing the hardware and attaching data cables and...

Page 21: ...is multilingual document provides information about the IBM warranty for machine type 2073 Part number 00L4547 IBM License Agreement for Machine Code This multilingual guide contains the License Agree...

Page 22: ...You can also order publications The publications center displays prices in your local currency You can access the IBM Publications Center through the following website www ibm com e business linkweb...

Page 23: ...Page table or illustration numbers that you are commenting on A detailed description of any information that should be changed About this guide xxiii...

Page 24: ...xxiv Storwize V7000 Unified Problem Determination Guide Version...

Page 25: ...s Ethernet capability These are the control enclosure models v Machine type and model 2076 112 which can hold up to 12 3 5 inch drives v Machine type and model 2076 124 which can hold up to 24 2 5 inc...

Page 26: ...trol enclosures have Ethernet ports Fibre Channel ports and USB ports Expansion enclosures do not have any of these ports v The number of LEDs on the power supplies Control enclosure power supplies ha...

Page 27: ...sword might be needed to perform some recovery procedures For security reasons the root password must be changed from its default value of Passw0rd using the chrootpwd CLI command If you lose the root...

Page 28: ...f an enclosure unless instructed to do so If you power off an expansion enclosure you cannot read or write to the drives in that enclosure or to any other expansion enclosure that is attached to it fr...

Page 29: ...ol node canister in the system Download this file regularly to your management workstation to protect the data This file must be used if there is a serious failure that requires you to restore your sy...

Page 30: ...tion in a release plus any issues that have been resolved Update your code regularly if the release notes indicate an issue that you might be exposed to Keep your records up to date Record the locatio...

Page 31: ...pe and the serial number of the enclosure or file module that has the problem The machine type is always 2076 for a control enclosure or 2073 for a file module If the problem does not relate to a spec...

Page 32: ...8 Storwize V7000 Unified Problem Determination Guide Version...

Page 33: ...indicator to see the type of error that is causing the poor health status Select an error type and you are shown the critical errors in the event log First try to fix the critical errors under the Bl...

Page 34: ...state is Active but the CTDB state is not Active see Checking CTDB health on page 160 otherwise see Checking the GPFS file system mount on each file module on page 162 If you have lost access to the f...

Page 35: ...he satask_results html Take the recommended action reboot the node and restart the procedure File module code CD not loading v Check CD for blemishes and clean the problem CD v Reboot the server and t...

Page 36: ...ble serves as a legend for defining the precise action to follow The action legend defines the action that is correlated with each action key Table 5 Installation error code actions Action key Action...

Page 37: ...results txt that could have caused this then refer to File module to control enclosure on page 27 for help troubleshooting the file module to control enclosure management connection Installation error...

Page 38: ...A6 Called with invalid host name A 0AA7 Error sending password A 0AA8 A node name was not provided A 0AA9 Invalid management IP address A 0AAB Invalid RSA IP address A 0AAC Invalid IP for management n...

Page 39: ...e keys on remote host A 0ACD Unable to read in shared user key A 0ACE Unable to read in shared host key A 0ACF Unable to open authorized keys file for reading A 0AD0 Unable to open temp file for writi...

Page 40: ...rom mmlscluster A 0B31 There was an error while attempting to enable CTDB A 0B32 Unable to query current GPFS settings mmlsconfig A 0B33 Unable to open settings file Check logs for more details A 0B34...

Page 41: ...r restoring cluster configuration on node A 0B87 There was an error while adding nodes to the GPFS cluster A 0B88 There was an error while configuring GPFS licensing A 0B89 There was an error while co...

Page 42: ...umber A 0BB0 Unable to open pxeboot data file A 0BB1 Unable to update pxeboot data file for node A 0BB2 Unable to set file permissions A 0BB3 Unable to find node serial in pxeboot data file A 0BB4 Nod...

Page 43: ...V7000 stalled Contact your next level of support 01D6 Storwize V7000 stalled_non_redundant H 01DA GPFS cluster is unhealthy Refer to Checking the GPFS file system mount on each file module on page 162...

Page 44: ...sy Setup Wizard failure DNS errors can cause Easy Setup Wizard to fail with no clear error messages The Easy Setup Wizard process can fail if there are issues with the DNS information entered into the...

Page 45: ...uestion Note If the GUI does not load complete these steps 3 Are you able to initiate an ssh connection to either file node and log in to either file node v Yes a Run the CLI command lsnode and determ...

Page 46: ...connectivity and system reports nothing wrong there might be an issue with the port configuration of your network that is not detected in any of the previous steps The internal management services use...

Page 47: ...overall health status indicators perform the following steps 1 Log on to the management GUI 2 Navigate to Monitoring System Details 3 From the System Details page use the navigation tree on the left t...

Page 48: ...problems Host to file modules connectivity This procedure is used to troubleshoot Ethernet network connectivity between the host and the file modules These network paths are used for all system reque...

Page 49: ...actions to check the port status each time until it is corrected or connected 1 Verify that each end of the cable is securely connected 2 Verify that the port on the Ethernet switch or hub is configu...

Page 50: ...Gbps external network connection Yes Management service and optional file access 4 Built in Ethernet port 4 1 Gbps external network connection No Optional management optional service optional file acc...

Page 51: ...line interface CLI operations to the control enclosure This procedure is used to troubleshoot Ethernet network connectivity between the file modules and the control enclosure These network paths are...

Page 52: ...on the configuration node This is an indication that the node you are currently logged in to is not the configuration node for the system Log out and login using ssh to the other service IP Then issu...

Page 53: ...9 115 160 221 9 115 160 222 9 115 160 220 255 255 248 0 9 115 167 254 EFSSG1000I The command completed successfully You may receive the following error lsnwmgt EFSSG0026I Cannot execute commands becau...

Page 54: ...nister v Follow the hardware replacement procedures for a file module Fibre Channel connectivity between file modules and control enclosure This procedure is used to troubleshoot Fibre Channel connect...

Page 55: ...l port 2 3 Fibre Channel slot 2 port 2 7 Lower canister Fibre Channel port 2 The Storwize V7000 control enclosure contains an upper and lower inverted canister ifs00033 3 4 PCI 3 4 PCI 2 1 3 4 5 6 7 8...

Page 56: ...ways goes to the upper canister and port 2 goes to the lower canister Use the table for correlating the error code with the physical connections and follow the procedures after the table for enabling...

Page 57: ...his state indicates a good connection status Slow flashing amber LED This state indicates a good connection at the Fibre Channel port but a broken connection at the Storwize V7000 node canister This b...

Page 58: ...en the server is turned off provided that the server is still connected to power and the power supply is operating correctly Before you work inside the server to view light path diagnostics LEDs read...

Page 59: ...he light path diagnostics panel is pulled out of the server v Light path diagnostics LEDs remain lit only while the server is connected to power Look at the system service label on the top of the serv...

Page 60: ...Enclosure manager heartbeat LED DIMM 10 18 error LEDs 12v channel error LEDs indicate an overcurrent condition Refer to the procedure Solving power problems in the Troubleshooting the System x3650 in...

Page 61: ...x3650 in the IBM Storwize V7000 Unified Information Center for more information 4 If a voltage regulator has failed replace the system board CNFG A hardware configuration error has occurred This LED...

Page 62: ...level of support FAN A fan has failed is operating too slowly or has been removed The TEMP LED might also be lit 1 Reseat the failing fan which is indicated by a lit LED near the fan connector on the...

Page 63: ...system log replace the failing DIMM which is indicated by the lit DIMM latch on the system board the DIMM LED is underneath the DIMM latch NMI A non maskable interrupt has occurred or the NMI button...

Page 64: ...ED to be lit This condition can also be caused by a room temperature that is too high 1 Check the error log If a fan has failed replace it 2 Make sure that the room temperature is not too high 3 Once...

Page 65: ...technical information hints tips and new device drivers or to submit a request for information Power supply LEDs Description Action Notes AC DC Error Off Off Off No ac power to the server or a proble...

Page 66: ...f a power channel error LED on the system board is not lit replace the power supply See the documentation that comes with the power supply for instructions 3 If a power channel error LED on the system...

Page 67: ...supply unit If failure is still present replace the enclosure chassis Off Off Off Off No ac power to the enclosure Turn on power Off Off Off On The ac power is on but power supply unit is not seated...

Page 68: ...t does not fix the problem replace the enclosure chassis Flashing X X X No canister is operational Both canisters are either off or not seated correctly Turn off the switch on both power supply units...

Page 69: ...an 10 minutes try reseating the canister Go to Procedure Reseating a node canister on page 206 If the state persists follow the hardware replacement procedure for the node canister Table 19 shows the...

Page 70: ...e managed using the service assistant Flashing On Code is active Node state is service The node canister cannot become active in a clustered system Several problems can exist hardware problem a proble...

Page 71: ...ng the SAN volume events and the file system volume events from the control enclosure v A File tab for monitoring the NAS events from the Storwize V7000 file modules When you click the Block tab a Nex...

Page 72: ...management GUI If you suspect a problem use the management GUI first to diagnose and resolve the problem Use the views that are available in the management GUI to verify the status of the system the h...

Page 73: ...fter all the alerts are fixed check the status of your system to ensure that it is operating as intended If you encounter problems logging on the management GUI or connecting to the management GUI see...

Page 74: ...you are not able to complete the actions at this time click Cancel until you return to the previous panel Click Cancel until you are returned to the Next Recommended Actions panel When you return to t...

Page 75: ...IBM Storwize V7000 Unified file module to perform maintenance The procedure that you follow differs slightly depending on whether you must unplug the power cables If you receive an alert event that r...

Page 76: ...described in Removing and replacing parts on page 81 Attention You can replace only one of the disk drives in the file module If you must replace both disk drives contact your next level of support 6...

Page 77: ...39 PM 1 3 0 2 02 SUSPEND active SUSPEND_MAINTENANCE 1 17 12 4 39 PM 4 Pull the file module out from the rack on its rails 5 Locate and use the service ladder if necessary to perform the maintenance a...

Page 78: ...ft a heavy object To avoid straining the muscles in your back lift by standing or by pushing up with your leg muscles v Make sure that you have an adequate number of properly grounded electrical outle...

Page 79: ...might cause the file module to halt which could result in the loss of data To avoid this potential problem always use an electrostatic discharge wrist strap or other grounding system when working insi...

Page 80: ...ng cold weather Heating reduces indoor humidity and increases static electricity Returning a device or component When returning a device or component follow all packaging instructions and use any supp...

Page 81: ...tivity of the LEDs remains the same go to step 4 If the activity of the LEDs changes return to step 1 4 Make sure that the hard disk drive backplane is correctly seated When it is correctly seated the...

Page 82: ...rn off the server b Reseat the SAS controller c Reseat the backplane signal cable backplane power cable and SAS expander card if the server has 12 drive bays d Reseat the hard disk drive e Turn on the...

Page 83: ...Target ID 6 5 Current operation None Physical disk I Os Not quiesced Drive Information Total number of drives found 2 Target on ID 5 Device is a Hard disk Enclosure 1 Slot 1 Connector ID 1 Target ID 5...

Page 84: ...not spin up its block size is incorrect or its media is removable Failed FLD The drive was part of a logical drive or was a hot spare drive and it has failed It has been taken offline Standby SBY This...

Page 85: ...ID 1 Target ID 5 State Ready RDY Size in MB in sectors 286102 585937500 Manufacturer IBM ESXS Model Number XXXXXXXXXXXX Firmware Revision XXXX Serial No XXXXXXXXXXXXXXXXXXXX Drive Type SAS Protocol SA...

Page 86: ...er for information on launching the LSI configuration tool Target on ID 5 Device is a Hard disk Enclosure 1 Slot 1 Connector ID 1 Target ID 5 State Out of Sync OSY Size in MB in sectors 286102 5859375...

Page 87: ...ART ASC ASCQ error codes and messages on page 64 Mirror Information NOTICE The mirror is not created configured Drive Information Total number of drives found 2 Target on ID 4 Device is a Hard disk En...

Page 88: ...ot quiesced Drive Information Total number of drives found 2 Target on ID 6 Device is a Hard disk Enclosure 1 Slot 0 Connector ID 0 Target ID 6 State Online ONL Size in MB in sectors 286102 585937500...

Page 89: ...T NOT READY FORMAT IN PROGRESS 04 05 LOGICAL UNIT NOT READY REBUILD IN PROGRESS 04 06 LOGICAL UNIT NOT READY RECALCULATION IN PROGRESS 04 07 LOGICAL UNIT NOT READY OPERATION IN PROGRESS 04 09 LOGICAL...

Page 90: ...4 COMPRESSION CHECK MISCOMPARE ERROR 0C 05 DATA EXPANSION OCCURRED DURING COMPRESSION 0C 06 BLOCK NOT COMPRESSIBLE 0C 0B AUXILIARY MEMORY WRITE ERROR 0C 0C WRITE ERROR UNEXPECTED UNSOLICITED DATA 0C 0...

Page 91: ...00 RANDOM POSITIONING ERROR 15 01 MECHANICAL POSITIONING ERROR 15 02 POSITIONING ERROR DETECTED BY READ OF MEDIUM 16 00 DATA SYNCHRONIZATION MARK ERROR 16 01 DATA SYNC ERROR DATA REWRITTEN 16 02 DATA...

Page 92: ...LED 20 02 ACCESS DENIED NO ACCESS RIGHTS 20 03 ACCESS DENIED INVALID MGMT ID KEY 20 08 ACCESS DENIED ENROLLMENT CONFLICT 20 09 ACCESS DENIED INVALID LU IDENTIFIER 20 0A ACCESS DENIED INVALID PROXY TOK...

Page 93: ...ED 29 04 DEVICE INTERNAL RESET 29 05 TRANSCEIVER MODE CHANGED TO SINGLE ENDED 29 06 TRANSCEIVER MODE CHANGED TO LVD 29 07 I_T NEXUS LOSS OCCURRED 2A 00 PARAMETERS CHANGED 2A 01 MODE PARAMETERS CHANGED...

Page 94: ...VAILABLE 32 01 DEFECT LIST UPDATE FAILURE 34 00 ENCLOSURE FAILURE 35 00 ENCLOSURE SERVICES FAILURE 35 01 UNSUPPORTED ENCLOSURE FUNCTION 35 02 ENCLOSURE SERVICES UNAVAILABLE 35 03 ENCLOSURE SERVICES TR...

Page 95: ...DIFIED 3F 09 SPARE DELETED 3F 0A VOLUME SET CREATED OR MODIFIED 3F 0B VOLUME SET DELETED 3F 0C VOLUME SET DEASSIGNED 3F 0D VOLUME SET REASSIGNED 3F 0E REPORTED LUNS DATA HAS CHANGED 3F 0F ECHO BUFFER...

Page 96: ...APPED COMMANDS NN TASK TAG 4E 00 OVERLAPPED COMMANDS ATTEMPTED 53 00 MEDIA LOAD OR EJECT FAILED 53 02 MEDIUM REMOVAL PREVENTED 55 01 SYSTEM BUFFER FULL 55 02 INSUFFICIENT RESERVATION RESOURCES 55 03 I...

Page 97: ...OLLER IMPENDING FAILURE GENERAL HARD DRIVE FAILURE 5D 21 CONTROLLER IMPENDING FAILURE DRIVE ERROR RATE TOO HIGH 5D 22 CONTROLLER IMPENDING FAILURE DATA ERROR RATE TOO HIGH 5D 23 CONTROLLER IMPENDING F...

Page 98: ...6 SERVO IMPENDING FAILURE START UNIT TIMES TOO HIGH 5D 47 SERVO IMPENDING FAILURE CHANNEL PARAMETRICS 5D 48 SERVO IMPENDING FAILURE CONTROLLER DETECTED 5D 49 SERVO IMPENDING FAILURE THROUGHPUT PERFORM...

Page 99: ...ATION RETRY COUNT 5D FF FAILURE PREDICTION THRESHOLD EXCEEDED FALSE 5E 00 LOW POWER CONDITION ON 5E 01 IDLE CONDITION ACTIVATED BY TIMER 5E 02 STANDBY CONDITION ACTIVATED BY TIMER 5E 03 IDLE CONDITION...

Page 100: ...using just the ID For an ID of 66012FC for example just search on 66012FC If you get an error code similar to 01E0 use wildcards and search on 01E0 With 66012FC you could use wildcards and search on 6...

Page 101: ...file module specific hardware code code 0 2 4 go to Table 28 v For the originating file module specific software code code 1 3 5 go to Table 29 on page 78 v For the storage enclosure hardware code co...

Page 102: ...4 SoFS 5 winbind 6 multipathd 7 nscd 8 sshd 9 httpd A vsftpd B nmbd C nfsd D cpu E multipath disk Table 30 Storage enclosure hardware code Code 6 C Originating specific software code in sequence ABBC...

Page 103: ...le and the management node Error code and message 210000x The Ethernet port0 has failed and is unresponsive 210001x A network failure was detected between Ethernet port 0 mgmt0sl0 and the management n...

Page 104: ...EFSSP1002C do not search on the entire string but remove the EFS and search on SP1002C or again use wildcards and search on a value such as 1002 The format of system messages is cnnnnx The elements cn...

Page 105: ...r 1 CRUs are your responsibility If IBM installs a Tier 1 CRU at your request you will be charged for the installation Service agreements can be purchased so that you can ask IBM to replace these Remo...

Page 106: ...the ServeRAID SAS controller from the SAS riser card on page 117 Installing a ServeRAID SAS controller in the SAS riser card on page 118 Removing a hot swap hard disk drive on page 119 Installing a h...

Page 107: ...ge 151 Removing the 240 VA safety cover on page 153 Installing the 240 VA safety cover on page 154 Setting the machine type model and serial number on page 155 Removing the cover The following procedu...

Page 108: ...xtended periods of time over 30 minutes with the cover removed might damage file modulefile module components 7 If you are instructed to return the cover follow all packaging instructions and use any...

Page 109: ...with a heavy metal battery or a battery with heavy metal components be aware of the following environmental consideration Batteries and accumulators that contain heavy metals must not be disposed of w...

Page 110: ...left and right side latches and pull the server out of the rack enclosure until both slide rails lock 5 Remove the cover as described in Removing the cover on page 83 6 Disconnect any internal cables...

Page 111: ...For information on disposal of batteries outside the United States go to www ibm com ibm environment products index shtml or contact your local waste disposal facility In the United States IBM has est...

Page 112: ...ecycled at end of life The label on the battery may also include a chemical symbol for the metal concerned in the battery Pb for lead Hg for mercury and Cd for cadmium Users of batteries and accumulat...

Page 113: ...r business partner Statement 2 CAUTION When replacing the lithium battery use only IBM Part Number 33F8354 or an equivalent type battery recommended by the manufacturer If your system has a module con...

Page 114: ...nd the file module Note You must wait approximately 2 5 minutes after you connect the power cord of the file module to an electrical outlet before the power control button becomes active 9 Start the S...

Page 115: ...asp the top of the air baffle and lift the air baffle out of the file module Attention For proper cooling and airflow replace all air baffles before you turn on the file module Operating the file modu...

Page 116: ...n the pin on the bottom of the microprocessor air baffle with the hole on the system board retention bracket 4 Lower the microprocessor 2 air baffle into the file module Attention For proper cooling a...

Page 117: ...ork with some optional devices you must first remove the DIMM air baffle to access certain components or connectors on the system board To remove the DIMM air baffle complete the following steps 1 To...

Page 118: ...that are supplied to you Installing the DIMM air baffle The following procedure is for a Tier 1 customer replaceable unit CRU Replacement of Tier 1 CRUs is your responsibility If IBM installs a Tier...

Page 119: ...fan bracket The following procedure is for a Tier 1 customer replaceable unit CRU Replacement of Tier 1 CRUs is your responsibility If IBM installs a Tier 1 CRU at your request you will be charged fo...

Page 120: ...as described in Removing the DIMM air baffle on page 93 8 Press the fan bracket release latches toward each other and lift the fan bracket out of the server Installing the fan bracket The following p...

Page 121: ...nnect the external cables then reconnect the power cords and turn on the peripheral devices and the file module Note You must wait approximately 2 5 minutes after you connect the power cord of the fil...

Page 122: ...lacement of Tier 1 CRUs is your responsibility If IBM installs a Tier 1 CRU at your request you will be charged for the installation Service agreements can be purchased so that you can ask IBM to repl...

Page 123: ...quest you will be charged for the installation Service agreements can be purchased so that you can ask IBM to replace these units Note Before running a procedure refer to Removing a file module to per...

Page 124: ...V7000 Unified come with one PCI riser card assembly installed If you want to replace them with PCI X riser card assemblies you must order the PCI X riser card assembly option which includes the bracke...

Page 125: ...d by server components 4 Align the PCI riser card assembly with the selected PCI connector on the system board Note The chassis might sag after removing the riser assembly In this case lift up the bot...

Page 126: ...remove an adapter from a PCI expansion slot in a PCI riser card assembly To remove the ServeRAID SAS controller from the SAS riser card see Removing the ServeRAID SAS controller from the SAS riser car...

Page 127: ...ructed to return the adapter follow all packaging instructions and use any packaging materials for shipping that are supplied to you Installing a PCI adapter in a PCI riser card assembly The following...

Page 128: ...rver components 5 Align the PCI riser card assembly with the selected PCI connector on the system board v PCI riser connector 1 Carefully fit the two alignment slots on the side of the assembly onto t...

Page 129: ...n PCI slot 2 The following illustration shows the locations of the adapter expansion slots from the rear of the file module Refer to Installing a PCI adapter in a PCI riser card assembly on page 103 f...

Page 130: ...cedure 1 To help you work safely with Storwize V7000 Unified file modules read the safety information in Safety on page xi Safety statements on page xiii and Installation guidelines on page 54 2 Turn...

Page 131: ...ing a file module to perform a maintenance action on page 51 To install the two port Ethernet adapter complete the following procedure 1 To help you work safely with Storwize V7000 Unified file module...

Page 132: ...e metal clip into the port openings from outside the chassis See Figure 15 on page 109 Rubber stopper Ethernet adapter connector Rubber stopper Figure 13 Location of the rubber stopper on the chassis...

Page 133: ...board then tilt the adapter so that the port connectors on the adapter line up with the port openings on the chassis See Figure 16 11 Slide the port connectors on the adapter into the port openings o...

Page 134: ...for a Tier 1 customer replaceable unit CRU Replacement of Tier 1 CRUs is your responsibility If IBM installs a Tier 1 CRU at your request you will be charged for the installation Service agreements ca...

Page 135: ...placement of Tier 1 CRUs is your responsibility If IBM installs a Tier 1 CRU at your request you will be charged for the installation Service agreements can be purchased so that you can ask IBM to rep...

Page 136: ...riser card and controller assembly for a tape enabled server model complete the following steps Figure 20 shows the SAS riser card in the tape enabled server model a Press down on the assembly releas...

Page 137: ...rocedure 1 To help you work safely with Storwize V7000 Unified file modules read the safety information in Safety on page xi Safety statements on page xiii and Installation guidelines on page 54 2 To...

Page 138: ...the power supplies by pulling up the release tab 1 and sliding the bracket outward 2 See Figure 24 on page 115 sonas203 Figure 22 Controller retention brackets on 16 drive capable server model Tab SA...

Page 139: ...lace See Figure 25 4 Install the controller retention bracket from step i by sliding the bracket inward 1 and pressing down the release tab into place 2 See Figure 26 on page 116 1 2 sonas205 Figure 2...

Page 140: ...tly seated 3 To install the SAS riser card and controller assembly for a tape enabled server model complete the following steps Figure 27 shows the SAS riser card in the tape enabled server model a Al...

Page 141: ...ove the SAS controller the same way as any other PCI adapter Do not use the instructions in this topic Use the instructions in Installing a PCI adapter in a PCI riser card assembly on page 103 To remo...

Page 142: ...xpansion slot cover Adapter 1 To help you work safely with Storwize V7000 Unified file modules read the safety information in Safety on page xi Safety statements on page xiii and Installation guidelin...

Page 143: ...controller cache to write through mode after the battery is fully charged the controller firmware re enables write back mode 12 When you restart the file module import the existing RAID configuration...

Page 144: ...the drive stops 5 Push the tray handle to the closed locked position 6 Since both hard disk drives are part of a mirrored array the array will start to rebuild on the newly installed disk While the ar...

Page 145: ...allation guidelines on page 54 2 Turn off the server and peripheral devices and disconnect the power cords 3 Remove the cover as described in Removing the cover on page 83 4 Grasp the feature key and...

Page 146: ...file module to perform a maintenance action on page 51 1 To help you work safely with Storwize V7000 Unified file modules read the safety information in Safety on page xi Safety statements on page xii...

Page 147: ...ServeRAID M1015 adapter ServeRAID M1000 advanced feature key sonas221 Figure 30 ServeRAID M1000 advanced feature key and M1015 adapter Chapter 4 File module 123...

Page 148: ...form a maintenance action on page 51 To remove a serveRAID SAS controller battery from the remote battery tray complete the following procedure 1 To help you work safely with Storwize V7000 Unified fi...

Page 149: ...attery carrier cable from the battery d Squeeze the clip on the side of the battery and battery carrier to remove the battery from the battery carrier Note If your battery and battery carrier are atta...

Page 150: ...tenance action on page 51 To install the serveRAID SAS controller battery on the remote battery tray complete the following procedure 1 Place the replacement battery on the battery carrier from which...

Page 151: ...drive The following procedure is for a Tier 1 customer replaceable unit CRU Replacement of Tier 1 CRUs is your responsibility If IBM installs a Tier 1 CRU at your request you will be charged for the...

Page 152: ...D RW DVD drive Press the release tab down to release the drive then while pressing the tab pull the drive toward the front of the server 6 From the front of the server carefully pull the drive out of...

Page 153: ...ollow the instructions that come with the drive to set any jumpers or switches 3 If a drive filler panel is in place remove it 4 Attach the drive retention clip to the side of the drive The retention...

Page 154: ...er out of the rack enclosure until both slide rails lock 4 Remove the cover as described in Removing the cover on page 83 5 If riser card assembly 1 contains one or more adapters remove it see Removin...

Page 155: ...wize V7000 Unified System x3650 M2 server and Figure 38 on page 132 for DIMM locations for the Storwize V7000 Unified System x3650 M3 server Figure 37 DIMM locations for the Storwize V7000 Unified Sys...

Page 156: ...2 Remove riser card assembly 1 as described in Removing a PCI riser card assembly on page 99 DIMM 17 DIMM 16 DIMM 15 DIMM 14 DIMM 13 DIMM 12 DIMM 11 DIMM 10 DIMM 8 DIMM 7 DIMM 6 DIMM 5 DIMM 4 DIMM 3 D...

Page 157: ...placement DIMMs are installed 10 Replace the DIMM air baffle as described in Installing the DIMM air baffle on page 94 11 Replace the PCI riser card assemblies as described in Installing a PCI riser c...

Page 158: ...aging instructions and use any packaging materials for shipping that are supplied to you Installing a hot swap fan The following procedure is for a Tier 1 customer replaceable unit CRU Replacement of...

Page 159: ...the new or replacement fans are installed 6 Install the cover as described in Installing the cover on page 84 7 Slide the file module into the rack Removing a hot swap ac power supply The following p...

Page 160: ...can be purchased so that you can ask IBM to replace these units The file module supports a maximum of two hot swap ac power supplies The following notes describe the type of power supply that the ser...

Page 161: ...ng table shows the system status when you install 460 watt power supplies in the server Table 35 System status with 460 watt power supplies installed Total system power consumption in watts Number of...

Page 162: ...er supply filler installed for proper cooling To install an ac power supply complete the following steps 1 To help you work safely with Storwize V7000 Unified file modules read the safety information...

Page 163: ...BM to replace these units To remove the operator information panel assembly complete the following procedure 1 To help you work safely with Storwize V7000 Unified file modules read the safety informat...

Page 164: ...formation in Safety on page xi Safety statements on page xiii and Installation guidelines on page 54 2 Slide the operator information panel assembly into the server until it clicks into place 3 Inside...

Page 165: ...ng components if needed v Microprocessor 1 PCI riser card assembly 1 and DIMM air baffle as described in Removing a PCI riser card assembly on page 99 and Removing the DIMM air baffle on page 93 Note...

Page 166: ...r and heat sink The following procedure is for a field replaceable unit FRU FRUs must be installed only by trained service technicians Read the documentation that comes with the microprocessor to dete...

Page 167: ...ease on page 147 for instructions for replacing the thermal grease then continue with step step 2 of this procedure To install a new or replacement microprocessor complete the following steps The foll...

Page 168: ...with the triangle alignment mark on the microprocessor and then place the microprocessor on the underside of the tool so that the tool can grasp the microprocessor correctly 8 Twist the handle of the...

Page 169: ...et The pins on the socket are fragile Any damage to the pins may require replacing the system board 7 5 6 3 4 1 2 10 Close the microprocessor bracket frame 11 Carefully close the microprocessor releas...

Page 170: ...oving a microprocessor and heat sink v Microprocessor 1 DIMM air baffle and PCI riser card assembly 1 as described in Installing the DIMM air baffle on page 94 and Installing a PCI riser card assembly...

Page 171: ...cleaning pad after all of the thermal grease is removed Microprocessor 0 02 mL of thermal grease 5 Use the thermal grease syringe to place nine uniformly spaced dots of 0 02 mL each on the top of the...

Page 172: ...ser card assembly 2 and microprocessor 2 air baffle as described in Removing a PCI riser card assembly on page 99 and Removing the microprocessor 2 air baffle on page 90 In the following step keep eac...

Page 173: ...es then reconnect the power cords and turn on the peripheral devices and the file module Note You must wait approximately 2 5 minutes after you connect the power cord of the file module to an electric...

Page 174: ...r just enough to disengage them from the server 5 Press down on the left and right side latches and pull the server out of the rack enclosure until both slide rails lock 6 Remove the cover as describe...

Page 175: ...lies Using the lift handle pull the system board out of the server 20 If you are instructed to return the system board follow all packaging instructions and use any packaging materials for shipping th...

Page 176: ...tlet before the power control button becomes active 15 When you replace the system board update the server firmware as described in Restoring System x firmware BIOS settings on page 258 16 From the ac...

Page 177: ...s to take effect 18 Once the system has finished booting complete the following two steps a Discover the IMM IP setting by running the following command gethostip NODENAME imm For example if file modu...

Page 178: ...rials for shipping that are supplied to you Installing the 240 VA safety cover The following procedure is for a field replaceable unit FRU FRUs must be installed only by trained service technicians To...

Page 179: ...ersonnel from the command line interface CLI on the file module Use ASU to modify selected settings in the integrated management module IMM based Storwize V7000 Unified file modules You can use the AS...

Page 180: ...g resetsp a Wait for the IMM reboot to complete typically about 3 minutes If the reboot is successful the output of the previous command will be similar to the following IBM Advanced Settings Utility...

Page 181: ...the service IP address of a file module that hosts a management node role to perform a management failover from the file module that hosts the active management node role to the file module that host...

Page 182: ...he file module hosting the passive management node role Refer to Determining the service IP for the management node roles on page 157 if necessary 2 To initiate the management services on the passive...

Page 183: ...ystem responds with output for the lsnode command then the management services are already running If you still cannot access the GUI refer to If the GUI is accessible then the management services are...

Page 184: ...hardware problem that might have caused this issue Checking CTDB health Use this information for checking system health with the clustered trivial database CTDB CTDB checks the health status of the S...

Page 185: ...orm one or more of the following procedures v Review the health status for any potential network problems A network failure between a file module and the customer can result in an UNHEALTHY CTDB statu...

Page 186: ...the information presented in the previous topics perform the procedure in Recovering a GPFS file system on page 167 Identifying created and mounted file system mounts You can identify and resolve prob...

Page 187: ...dule then reboot each file module 6 Use the lsnode command to determine when the file modules are back up and when GPFS and CTDB are both active The file systems might take several minutes to get moun...

Page 188: ...u cannot resolve the issue contact your authentication server administrator to validate or reestablish your account With regard to server configuration issues refer to Managing authentication server i...

Page 189: ...IBM sofs scproot usr bin rssh EFSSG1000I The command completed successfully When the system is unable to authenticate against an external authentication server you need to make sure that it can obtai...

Page 190: ...mand returns the IP addresses such as 129 42 18 103 above that are configured on the DNS server for Storwize V7000 Unified Ideally these IP addresses should be the same as the addresses configured on...

Page 191: ...ion of IBM support Prerequisites v You are executing this procedure on a file module v You are logged into the file module which is the active management node as root See Accessing a file module as ro...

Page 192: ...ve some missing file system blocks If the only errors that are reported are missing blocks no further repair is needed However if the chkfs command reports more severe errors contact IBM support to as...

Page 193: ...work errors If you encounter issues when viewing the health system refer to the following information and examples If a network is not attached to an interface the health center monitors all ports It...

Page 194: ...v6k mgmt001st001 lsnwmgt Interface Service IP Node1 Service IP Node2 Management IP Network Gateway VLAN ID ethX1 EFSSG1000I The command completed successfully If any ethX1 port cable is unplugged the...

Page 195: ...the latest GPFS information Note The GPFS log is a complex raw log file for GPFS If you are unable to understand the conditions listed in the log contact IBM support Synchronizing time on the file mod...

Page 196: ...r IP and service ntpd start The following example shows the sequence root domain node service ntpd stop Shutting down ntpd OK root domain node ntpdate 9 19 0 220 14 Jan 12 06 46 ntpdate 25360 adjust t...

Page 197: ...service procedures from the service assistant Use the command line interface CLI to manage your system Service assistant interface The service assistant interface is a browser based GUI that is used t...

Page 198: ...ause the node canister to restart It is not possible to maintain the service assistant connection to the node canister when it restarts If the current node canister on which the tasks are performed is...

Page 199: ...age system CLI The storage system CLI is intended for use by advanced users who are confident at using a command line interface Nearly all of the flexibility that is offered by the CLI is available th...

Page 200: ...IP address for the node canister in the control enclosure and must set the address v When you have forgotten the superuser password and must reset the password Using a USB key Use any USB key that is...

Page 201: ...ol prompts you for the task that you want to perform and for the parameters that are relevant to that task It prompts you when to put it in the node canister on the control enclosure When the commands...

Page 202: ...is run on the upper canister the default value is 192 168 70 121 subnet mask 255 255 255 0 If the command is run on the lower canister the default value is 192 168 70 122 subnet mask 255 255 255 0 If...

Page 203: ...e system to disable resetting the superuser password If you disable that function this action fails This command calls the satask resetpassword command Snap command Use this command to collect diagnos...

Page 204: ...a storage system Note The reference to cluster is not the same as the file system cluster on the Storwize V7000 file modules Attention Run this command only when instructed by IBM support Running thi...

Page 205: ...t reporting Events that are detected are saved in an event log As soon as an entry is made in this event log the condition is analyzed If any service activity is required a notification is sent Event...

Page 206: ...the event log The event log has a limited size After it is full newer entries replace entries that are no longer required To avoid having a repeated event that fills the event log some records in the...

Page 207: ...e logged Event notifications Storwize V7000 Unified can use Simple Network Management Protocol SNMP traps syslog messages and Call Home email to notify you and the IBM Support Center when significant...

Page 208: ...he software is not loaded and the fault LED is illuminated To determine if there is a POST error on a file module or a node canister go to Procedure Understanding the system status using the LEDs on p...

Page 209: ...is enough charge in the batteries to support saving critical data from both canisters to a local drive twice In a system with a failed battery there is enough charge in the remaining battery to suppo...

Page 210: ...tical data once therefore both canisters are in active state and I O operations are permitted If one battery fails though the remaining battery has only two thirds of a charge and the total charge tha...

Page 211: ...then neither battery is considered when calculating whether there is sufficient charge to protect the system In these circumstances the system enters service state and does not permit I O operations t...

Page 212: ...If a table does fill up the migration or replication that was creating the bad block fails because it was not possible to create an exact image of the source volume The system creates alerts in the e...

Page 213: ...cing a control enclosure Start here Use the management GUI recommended actions The management GUI provides extensive facilities to help you troubleshoot and correct problems on your system You can con...

Page 214: ...itialization is completed An address for port 2 can be added later If you do not know the storage system management IP address it is part of the data that is shown in the service assistant home panel...

Page 215: ...ny data onto the system Do not delete the clustered system if you have created any volumes on your system because any data on those volumes will be lost In this case you must gain access to the manage...

Page 216: ...the node is in service state fix the reported node errors For more information go to Procedure Fixing node errors on page 204 After the node error is corrected attempt to create a clustered storage s...

Page 217: ...ork See Procedure Finding the status of the Ethernet connections on page 203 for details v Ping the management address to see if the Ethernet network permits the connection If the ping fails check the...

Page 218: ...ere it was previously used Moving a node canister might compromise its access to storage or access to volumes by a host application Do not move the canister from its original location unless directed...

Page 219: ...irements see Problem SAS cabling not valid on page 194 Problem Mirrored volume copies no longer identical The management GUI provides options to either check copies that are identical or to check that...

Page 220: ...set the superuser password if you have forgotten the password This command runs differently depending on whether you run it on a node canister that is active in a clustered system Note If a node canis...

Page 221: ...that relates to the volume Instead the alert relates to the MDisk Performing the fix procedures for the MDisk enables the volume to go online An overview of the status is displayed under Connectivity...

Page 222: ...in one of the USB ports of the node canister from which you want to collect data 3 The node canister fault LED flashes It continues to flash while the information is collected and written to the USB k...

Page 223: ...m If you are unsure which one is the control enclosure go to Procedure Identifying which enclosure or canister to service on page 196 1 Use the state of the ac power failure power supply OK fan failur...

Page 224: ...ce the power cable On Off Off Off Power supply is on and operational No actions Off Off On Off Fan failure Replace the power supply unit Off On On On Communication failure and power supply problem Rep...

Page 225: ...ode canister on page 206 Fast flashing 2 Hz The canister is running its power on self test POST Wait for the test to complete If the canister remains in this state for more than 10 minutes try reseati...

Page 226: ...in the enclosure is in active state it automatically adds this node canister into the clustered system A node canister in this state can be managed using the service assistant Flashing On Code is act...

Page 227: ...the node status Go to Procedure Getting node canister and system information using a USB key on page 198 The status speed and MAC address are returned for each port Information is returned that identi...

Page 228: ...installation Attention This procedure makes all the volume data that you have on your system inaccessible You cannot recover the data This procedure affects all volumes that are managed by your syste...

Page 229: ...rror are listed with the error code Procedure Changing the service IP address of a node canister This procedure identifies many methods that you can use to change the service IP address of a node cani...

Page 230: ...uration options to set the service IP to an address that is accessible on the network Perform the following steps to access a canister using a directly attached Ethernet cable 1 Connect one end of an...

Page 231: ...le starts to move 8 Finish inserting the canister by closing the handle until the locking catch clicks into place 9 Verify that the cables were not displaced 10 Verify that the LEDs are on Procedure P...

Page 232: ...stant Instruction is also given for which package content option is required v If you are collecting the package by using the management GUI select Settings Support Click Download Support Package Foll...

Page 233: ...to data Be careful when you are replacing the hardware components that are located in the back of the system that you do not inadvertently disturb or remove any cables that you are not instructed to...

Page 234: ...mity to each other The handle with the finger grip on the right removes the upper canister 1 The handle with the finger grip on the left removes the lower canister 2 6 Squeeze them together to release...

Page 235: ...ion canister unless directed to do so by a service procedure v If the power LED is flashing or off it is safe to remove an expansion canister However do not remove an expansion canister unless directe...

Page 236: ...ocking catch clicks into place 11 Reattach the SAS cables Replacing an SFP transceiver When a failure occurs on a single link the SFP transceiver might need to be replaced Even though many of these pr...

Page 237: ...eiver for example you must replace with another longwave SFP transceiver Removing the wrong SFP transceiver might result in loss of data access 2 Remove the optical cable by pressing the release tab a...

Page 238: ...according to the system rating plate v Connect any equipment that will be attached to this product to properly wired outlets v When possible use one hand only to connect or disconnect signal cables v...

Page 239: ...the charge in the backup battery might not be sufficient enough within the partner power supply unit to continue operations without causing a loss of access to the data Wait until the partner battery...

Page 240: ...osure with the handle pointing towards the center of the enclosure Insert the unit in the same orientation as the one that you removed svc00633 Figure 55 Directions for lifting the handle on the power...

Page 241: ...power switch to the power supply unit If required return the power supply Follow all packaging instructions and use any packaging materials for shipping that are supplied to you Replacing a power sup...

Page 242: ...ire water or structural damage v Disconnect the attached power cords telecommunications systems networks and modems before you open the device covers unless instructed otherwise in the installation an...

Page 243: ...les that you are not instructed to remove Ensure that you are aware of the procedures for handling static sensitive devices before you replace the power supply To replace the power supply unit in an e...

Page 244: ...osure with the handle pointing towards the center of the enclosure Insert the unit in the same orientation as the one that you removed svc00633 Figure 57 Directions for lifting the handle on the power...

Page 245: ...eattach the power cable and cable retention bracket 10 Turn on the power switch to the power supply unit If required return the power supply Follow all packaging instructions and use any packaging mat...

Page 246: ...ords telecommunications systems networks and modems before you open the device covers unless instructed otherwise in the installation and configuration procedures v Connect and disconnect cables as de...

Page 247: ...s that are located in the back of the system that you do not inadvertently disturb or remove any cables that you are not instructed to remove Each power supply unit in a control enclosure contains an...

Page 248: ...ry has protective end caps that must be removed prior to use a Remove the battery from the packaging b Remove the end caps c Attach the end caps to both ends of the battery that you removed and place...

Page 249: ...he power cord plug in To release a cable retention bracket perform these steps 1 Unlock the cable retention bracket that is around the end of the power cord 2 Pull the lever next to the black plastic...

Page 250: ...into the slot until the handle starts to move 6 Finish inserting the drive by closing the handle until the locking catch clicks into place svc00612 Figure 60 Unlocking the 3 5 drive svc00613 Figure 61...

Page 251: ...ts in loss of data or access to data Attention Do not leave a drive slot empty Do not remove a drive or drive assembly before you have a replacement available To replace the drive assembly or blank ca...

Page 252: ...pushing it on Replacing a SAS cable This topic describes how to replace a SAS cable Be careful when you are replacing the hardware components that are located in the back of the system that you do no...

Page 253: ...e a control enclosure chassis Note Ensure that you know the type of enclosure chassis that you are replacing The procedures for replacing a control enclosure chassis are different from those procedure...

Page 254: ...n possible use one hand only to connect or disconnect signal cables v Never turn on any equipment when there is evidence of fire water or structural damage v Disconnect the attached power cords teleco...

Page 255: ...from the enclosure column v If you are replacing the enclosure because neither node canister can start retrieve this information after you have completed the replacement a Start the service assistant...

Page 256: ...he cable retention brackets and the power cords from the power supply units 10 Disconnect the data cables for each canister 11 Remove the power supply units from the enclosure 12 Remove the canisters...

Page 257: ...still cannot access the system see Problem Cannot connect to the service assistant on page 193 b Use the Configure enclosure panel c Select the options to Update WWNN 1 Update WWNN 2 Update the machi...

Page 258: ...en removed You can ignore the messages about removing the hardware Verify that the original enclosure is no longer listed in the tree view 30 Add the new enclosure to the system a Select the enclosure...

Page 259: ...that will be attached to this product to properly wired outlets v When possible use one hand only to connect or disconnect signal cables v Never turn on any equipment when there is evidence of fire wa...

Page 260: ...osure which includes host access to GPFS file systems FlashCopy Metro Mirror and Global Mirror access 2 Turn off the power to the enclosure by using the switches on the power supply units 3 Record whi...

Page 261: ...r to the enclosure by using the switches on the power supply units The system records an error that indicates that an enclosure FRU replacement was detected Go to the management GUI to use the fix pro...

Page 262: ...res This section provides general information about hardware and Fibre Channel link issues SAN problem determination The procedures that are provided here help you solve problems on the Storwize V7000...

Page 263: ...placing the SFP transceiver at the switch 5 Contact IBM Support for assistance in replacing the node canister Ethernet iSCSI host link problems If you are having problems attaching to the Ethernet hos...

Page 264: ...e canisters were modified or replaced use the service assistant to verify the levels of software and where necessary to upgrade or downgrade the level of software The system recovery procedure is one...

Page 265: ...for example 01234A6 2 v Quorum drive identifiers in the format enclosure_serial drive slot ID drive 11S serial number 7 characters colon 1 or 2 numbers open square bracket 22 characters close square b...

Page 266: ...578 you must remove system data from those nodes This action acknowledges the data loss and puts the nodes into the required candidate state v Do not attempt to recover the system if you have been ab...

Page 267: ...against them 7 Resolve any hardware errors until the error condition for all nodes in the system is None 8 Ensure that all nodes in the system display a status of candidate When all nodes display a s...

Page 268: ...us If any nodes display error code 550 or 578 remove their system data to place them into candidate status see Procedure Removing system data from a node canister on page 204 4 Select Recover System f...

Page 269: ...relationships that use the offline volumes 2 Run the recovervdisk or recovervdiskbysystem command You can recover individual volumes by using the recovervdisk command You can recover all the volumes...

Page 270: ...application data by using the appropriate backup methods You can maintain your configuration data for the system by completing the following tasks v Backing up the configuration data v Restoring the...

Page 271: ...recovery must be complete The following hardware must be operational hosts Storwize V7000 Unified drives the Ethernet network and the SAN fabric Backing up the system configuration using the CLI You c...

Page 272: ...isting configuration backup and restore files that are on your configuration node canister in the tmp directory svcconfig clear all 5 Issue the following CLI command to back up your configuration svcc...

Page 273: ...er at the start or end of the file names so that you can easily identify these files when you are ready to restore your configuration Issue the following command to rename the backup files that are st...

Page 274: ...250 Storwize V7000 Unified Problem Determination Guide Version...

Page 275: ...select the Advanced group and do the following a Select the number of Heart Beat Interval Days which is used to send small package with general information about the system health The default is seven...

Page 276: ...attached to the local Storwize V7000 Unified file module and that a customer representative is physically present at the connection for the duration of the remote support session To establish the AOS...

Page 277: ...the file download The file is stored in the home root desktop directory 6 Customer When the executable file finishes downloading close the Firefox download window and close the browser The launch scr...

Page 278: ...254 Storwize V7000 Unified Problem Determination Guide Version...

Page 279: ...module user ID that has sufficient authority to run the chrootpwd command successfully If you do not have a user ID or if you have lost the password then use this procedure to recover You must have ph...

Page 280: ...fr now a Wait until the KVM goes past the Grub screen this time and log in when you are prompted to log in b Log in as root with the new password on the KVM 7 Go back to the management CLI to resume t...

Page 281: ...AS ppk The private key is left in the dumps directory 2 Use SCP to copy the private key file to the files directory on the file module which is currently the active management node scp P 1602 dumps NA...

Page 282: ...exportfs a to flush the NFS cache in each file module Verify that the state of each affected file module is healthy and that no new Stale NFS file handle CIMs appear in the alert log after resuming t...

Page 283: ...t Y 9 Press ESC twice to return to the System Configuration and Boot Management screen 10 Scroll down to click Boot Manager and then press Enter 11 Scroll down to click Add Boot Option and then press...

Page 284: ...o 10 enabled _ 8 0 0 2 sdi 8 128 active ready The following output shows that the storage devices are not active root kd27lf6 mgmt002st001 multipath ll mpathq 360050768029180b06000000000000007 dm 8 IB...

Page 285: ...mmand 2 When you complete the service action refer to Health status and recovery on page 22 Recovering from an sshd_data service error Use this procedure to recover from an sshd_data service error Thi...

Page 286: ...he Block tab 3 Run any Next recommended action 4 When all volumes are back online go to Filesystems in the management GUI 5 If any file systems are not online recover them by using the recover a GPFS...

Page 287: ...node name of the passive mode 2 Wait until both nodes show OK in the Connection status column of the output from the CLI command lsnode r 3 Reboot the file module that is the active management node us...

Page 288: ...cations are required until modifications to Site B can be suspended to perform a final replication to Site A to enable Site A to synch up Note Do not use the fullsync option for these incremental repl...

Page 289: ...2010 07 09 15 51 54 05 00 dsmc return code 12 If the file system is managed by Tivoli Storage Manager for Space Management break down the restore into smaller file patterns or subdirectories that con...

Page 290: ...system reboots reinsert the USB flash drive EFSSG4153 The required parameter was not specified Verify that the file actually exists where specified Also verify that the command is passing the correct...

Page 291: ...Table 46 and take the described course of action If the error you see is not listed in this table call the IBM Support Center Follow these guidelines 1 Follow the actions in the order presented 2 Afte...

Page 292: ...e from the system console then use the power button on the front of the server to power cycle the system c After the system reboots restart the upgrade process 01A1 Internal upgrade error Contact your...

Page 293: ...the file tab on the management GUI for an event that could have caused this error and follow the recommended action If there is no obvious event that could have caused this error refer to File module...

Page 294: ...0 kill 01C2 Failed while checking for current running asynchronous jobs Type lsrepl echo If the return code is 0 start the upgrade again If the return code is any other number contact your next level...

Page 295: ...lthy See Checking the GPFS file system mount on each file module on page 162 01DB Failed to stop performance center Contact your next level of support 01DC Failed to configure performance center Conta...

Page 297: ...I login panel You can use this option to navigate to all the panels without manually typing the web addresses v To go to the next frame press Ctrl Tab v To move to the previous frame press Shift Ctrl...

Page 298: ...r edit and press Enter to issue the change command Accessing the publications You can find the HTML version of the IBM Storwize V7000 Unified information at the following website publib boulder ibm co...

Page 299: ...quiries in writing to Intellectual Property Licensing Legal and Intellectual Property Law IBM Japan Ltd 1623 14 Shimotsuruma Yamato shi Kanagawa 242 8502 Japan The following paragraph does not apply t...

Page 300: ...f this document should verify the applicable data for their specific environment Information concerning non IBM products was obtained from the suppliers of those products their published announcements...

Page 301: ...ies in the United States and other countries Linux is a registered trademark of Linus Torvalds in the United States other countries or both Microsoft Windows Windows NT and the Windows logo are tradem...

Page 302: ...a Avis de conformité à la réglementation d'Industrie Canada Cet appareil numérique de la classe A est conforme à la norme NMB-003 du Canada Australia and New Zealand Class A Statement Attention This is a C...

Page 303: ...em Warnhinweis versehen werden Warnung Dieses ist eine Einrichtung der Klasse A Diese Einrichtung kann im Wohnbereich Funk-Störungen verursachen in diesem Fall kann vom Betreiber verlangt werden angem...

Page 304: ...f China Class A Electronic Emission Statement International Electrotechnical Commission IEC statement This product has been designed and built to comply with IEC Standard 950 Korean Communications Com...

Page 305: ...cal Regulations Pascalstr 100 Stuttgart Germany 70569 Tele 0049 0 711 785 1176 Fax 0049 0 711 785 1283 Email tjahn@de.ibm.com Taiwan Contact Information This topic contains the product service...

Page 307: ...browsers supported 194 C cable retention bracket releasing 225 call home support 251 Canadian electronic emission notice 278 canister expansion 211 identification 196 node 209 replacing 209 211 CD dr...

Page 308: ...54 file module reliability guidelines 55 file module to control enclosure connectivity troubleshooting 27 File module to file module 25 file node hardware indicators 34 file system mounts gpfs troubl...

Page 309: ...practices 3 iSCSI link problems 239 J Japanese electronic emission notice 280 K keyboard accessibility 273 Korean electronic emission statement 280 L LED hardware indicators 34 LEDs file node hardware...

Page 310: ...oller 117 SFP transceiver 212 system 204 system board 149 system data 204 removing and replacing file module components 54 removing and replacing parts 81 replacing 2 5 drive assembly 227 3 5 drive as...

Page 311: ...41 Taiwan contact information 281 electronic emission notice 281 thermal grease 147 thermal material heat sink 142 time synchronizing on file module 171 Tivoli Storage Manager server configuration 168...

Page 314: ...Printed in USA GA32 1057 04...
