
Table 7. Server events (continued)

| Event | Event Type | Severity | Message | Description | Cause | User Action |
|---|---|---|---|---|---|---|
| server_power_supply_ov_line_12V_ok | STATE_CHANGE | INFO | OV Line 12V of Power Supply {0} is ok. | The GUI checks the hardware state using xCAT. | The hardware part is ok. | None. |
| server_power_supply_ov_line_12V_failed | STATE_CHANGE | ERROR | OV Line 12V of Power Supply {0} failed. | The GUI checks the hardware state using xCAT. | The hardware part failed. | None. |
| server_power_supply_uv_line_12V_ok | STATE_CHANGE | INFO | UV Line 12V of Power Supply {0} is ok. | The GUI checks the hardware state using xCAT. | The hardware part is ok. | None. |
| server_power_supply_uv_line_12V_failed | STATE_CHANGE | ERROR | UV Line 12V of Power Supply {0} failed. | The GUI checks the hardware state using xCAT. | The hardware part failed. | None. |
| server_power_supply_aux_line_12V_ok | STATE_CHANGE | INFO | AUX Line 12V of Power Supply {0} is ok. | The GUI checks the hardware state using xCAT. | The hardware part is ok. | None. |
| server_power_supply_aux_line_12V_failed | STATE_CHANGE | ERROR | AUX Line 12V of Power Supply {0} failed. | The GUI checks the hardware state using xCAT. | The hardware part failed. | None. |
| server_power_supply_fan_ok | STATE_CHANGE | INFO | Fan of Power Supply {0} is ok. | The GUI checks the hardware state using xCAT. | The hardware part is ok. | None. |
| server_power_supply_fan_failed | STATE_CHANGE | ERROR | Fan of Power Supply {0} failed. | The GUI checks the hardware state using xCAT. | The hardware part failed. | None. |
| server_power_supply_voltage_ok | STATE_CHANGE | INFO | Voltage of Power Supply {0} is ok. | The GUI checks the hardware state using xCAT. | The hardware part is ok. | None. |
| server_power_supply_voltage_failed | STATE_CHANGE | ERROR | Voltage of Power Supply {0} is not ok. | The GUI checks the hardware state using xCAT. | The hardware part failed. | None. |
| server_power_supply_ok | STATE_CHANGE | INFO | Power Supply {0} is ok. | The GUI checks the hardware state using xCAT. | The hardware part is ok. | None. |
| server_power_supply_failed | STATE_CHANGE | ERROR | Power Supply {0} failed. | The GUI checks the hardware state using xCAT. | The hardware part failed. | None. |
| pci_riser_temp_ok | STATE_CHANGE | INFO | The temperature of PCI Riser {0} is ok. ({1}) | The GUI checks the hardware state using xCAT. | The hardware part is ok. | None. |
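The Message column uses numbered placeholders such as {0} and {1}. As a rough illustration only, the sketch below shows how such templates could be filled in with positional values. The `TEMPLATES` dictionary and the `render_event` helper are hypothetical names introduced for this example, and the sample argument values are invented; none of this is part of ESS, xCAT, or IBM Spectrum Scale. Only the event names, severities, and message texts are taken from the table.

```python
# Hypothetical sketch: event names, severities, and message templates are
# copied from Table 7; the lookup table and helper are assumptions made
# for illustration, not an IBM API.
TEMPLATES = {
    "server_power_supply_fan_failed": ("ERROR", "Fan of Power Supply {0} failed."),
    "server_power_supply_ok": ("INFO", "Power Supply {0} is ok."),
    "pci_riser_temp_ok": ("INFO", "The temperature of PCI Riser {0} is ok. ({1})"),
}


def render_event(name, *args):
    """Fill the numbered placeholders of an event message with positional values."""
    severity, template = TEMPLATES[name]
    # str.format maps {0}, {1}, ... to the positional arguments in order.
    return f"{severity}: {template.format(*args)}"


if __name__ == "__main__":
    # The argument values here are invented sample data.
    print(render_event("server_power_supply_fan_failed", 2))
    print(render_event("pci_riser_temp_ok", 1, "35 C"))
```

What the product actually substitutes for {0} and {1} is not stated on this page; the example assumes {0} identifies the component and {1} carries an additional reading.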

