IBM Elastic Storage System 3200

6.1.3

Service Guide

IBM

SC27-9862-00

Contents of Elastic Storage System 3200

Page 1: ...IBM Elastic Storage System 3200 6.1.3 Service Guide IBM SC27-9862-00 ...

Page 2: ...roduct number 5765-DME; IBM Spectrum Scale Data Access Edition for IBM ESS, product number 5765-DAE. IBM welcomes your comments; see the topic "How to submit your comments" on page xi. When you send information to IBM, you grant IBM a nonexclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you. Copyright International Business Machines ...

Page 3: ...ler 9; Removing and replacing a power module 12; Removing and replacing a bezel kit 17; Miscellaneous equipment specification (MES) instructions 21; ESS 3200 storage drives concurrent MES upgrade 21; Formatting drives (example) 27; Manually restarting GPFS on the Enterprise Storage Server 3200 canisters (example) 29; Chapter 2. Part Listings 31; CRU part number list 31; FRU part number list 31; Accessibility featu...

Page 5: ...ve carrier assembly 6; 10. Displaying NVMe drive replacement part 6; 11. Displaying physical disks state 7; 12. Unlocking the drive and release latch 8; 13. Removing NVMe drive 8; 14. Releasing catch 9; 15. Displaying drive slots 10; 16. Displaying drive carrier assembly 10; 17. Displaying drive filler replacement kit 10; 18. Unlocking the drive slot and release latch 11; 19. Removing drive filler 11; 20. Releasing cat...

Page 6: ...27. Showing enclosure front view with shipping screws 17; 28. Displaying left ear cap front view 18; 29. Replacing left ear cap 18; 30. Displaying right ear cap front view 19; 31. Replacing right ear cap 19; 32. Pressing release tab 20; 33. Securing the enclosure 20; 34. Displaying IBM brand label location 20 ...

Page 7: ...Tables: 1. Conventions x; 2. IBM Elastic Storage System 3200 power module status LEDs 14; 3. CRU Part Numbers 31; 4. FRU Part Numbers 31 ...

Page 9: ...ed Service Guide: This unit provides ESS 3200 information, including servicing and parts listings. Intended for system administrators and IBM support team. Problem Determination Guide: This unit provides ESS 3200 information, including events, replacing servers, issues, maintenance procedures, and troubleshooting. Intended for system administrators and IBM support team. Command Reference: This unit provides information about ESS comman...

Page 10: ...mation. UNIX file name conventions are used throughout this information. Table 1. Conventions (Convention: Usage). bold: Bold words or characters represent system elements that you must use literally, such as commands, flags, values, and selected menu options. Depending on the context, bold typeface sometimes represents path names, directories, or file names. bold underlined: bold underlined keywords are defaults. T...

Page 11: ... item. Ellipses indicate that you can repeat the preceding item one or more times. In synopsis statements, vertical lines separate a list of choices. In other words, a vertical line means "Or". In the left margin of the document, vertical lines indicate technical changes to the information. How to submit your comments: To contact the IBM Spectrum Scale development organization, send your comments to the follo...

Page 13: ...em health check by using the SSR utility. Note: If you encounter an error or an unexpected result, contact the next level support. Note: The commandless disk replacement automates the process of replacing a failed or bad drive with a new drive. To check whether the commandless disk replacement option is enabled, submit the following command: mmlsconfig | grep enableAutomaticDiskReplacement. The output appear...
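
The check for the commandless disk replacement option can also be scripted. The following is a minimal sketch in Python, not part of the IBM guide. It assumes that the `mmlsconfig` output was captured as text and that the setting is printed on one line as a name followed by its value; the exact output is truncated in this excerpt, so that layout is an assumption. The helper name `auto_replacement_enabled` is hypothetical.

```python
def auto_replacement_enabled(mmlsconfig_output: str) -> bool:
    """Return True if enableAutomaticDiskReplacement is reported as 'yes'.

    Assumption: each setting is printed as '<name> <value>' on its own line.
    """
    for line in mmlsconfig_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "enableAutomaticDiskReplacement":
            return fields[1].lower() == "yes"
    # Setting not listed at all: treat it as not enabled.
    return False


if __name__ == "__main__":
    # Illustrative text only, not captured from a real system.
    sample = "enableAutomaticDiskReplacement yes\n"
    print(auto_replacement_enabled(sample))
```

Treating a missing setting as "not enabled" is a conservative choice for this sketch; confirm the actual default for your release before relying on it.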

Page 14: ...y housing an NVMe drive. Figure 2. Displaying drive carrier assembly. Fully assembled drive carrier assembly with one of the following options, exactly matching the other drives that are installed in the same enclosure: Part number 01LL727: 3.84 TB 2.5-inch PCIe Gen4 NVMe Flash drive; Part number 01LL728: 7.68 TB 2.5-inch PCIe Gen4 NVMe Flash drive; Part number 01LL729: 15.36 TB 2.5-inch PCIe Gen4 NVMe Flash drive; Part num...

Page 15: ...ent details and proposed solution. Event name: gnr_pdisk_replaceable. Problem: The state of a physical disk is changed to replaceable. Solution: Replace the disk. The DMP automatically launches in the corresponding mode, depending on the situation. You can launch this DMP from the following pages in the GUI and follow the wizard to release one or more disks: Monitoring > Hardware page: Select Replace Broken Disks from the Actio...

Page 16: ... replacement drive into the slot. a. Press the release catch on the drive carrier in the direction of the arrow to open the handle, as identified in Figure 7 on page 4. The cam releases from the locked position. Figure 7. Releasing catch. b. Grab the frame beneath the touch point and gently push the drive carrier assembly into the drive bay until the carrier handle engages. c. Press the release handle downw...

Page 17: ...ommand as a root user. If IBM service personnel are running this task, it requires coordination with the customer for steps that require root user access as part of the procedure. No tools are needed to complete this task. Do not remove or loosen any screws. When you replace this part, you must follow recommended procedures for handling electrostatic discharge (ESD) sensitive devices. IBM Elastic Storage S...

Page 18: ...Gen3 NVMe Flash drive. Figure 10. Displaying NVMe drive replacement part. Refer to the following steps to remove and replace a faulty NVMe drive: 1. Confirm the failed drive location from the drive fault LED. You can identify the failed drive by the amber fault LED on the drive carrier. If the fault LED is lit on a drive, it is safe to replace the drive. Alternatively, locate unhealthy drives in the managem...

Page 19: ...xample, the BB01L recovery group needs service: mmvdisk recoverygroup list. Output columns: recovery group, active, current or master server, needs service, user vdisks, remarks. Row 1: BB01L, yes, server01.gpfs.net, yes, 3. Row 2: BB01R, yes, server02.gpfs.net, no, 3. This happens when the number of failed pdisks in one of the recovery group's declustered arrays reaches or exceeds the replacement threshold for the declustered array. The following ...
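
The recovery group listing can be filtered for groups that need service. The following Python sketch is illustrative only. It assumes the listing was captured as text and that each data row holds, in order, the recovery group name, the active flag, the server, the needs-service flag, and the user vdisk count. That column order is inferred from the flattened header in this excerpt, and the dotted host names in the sample are a reconstruction. The helper name `groups_needing_service` is hypothetical.

```python
def groups_needing_service(listing: str) -> list[str]:
    """Return the names of recovery groups whose 'needs service' column is 'yes'.

    Assumed row layout: name, active, server, needs service, user vdisks.
    Header and separator lines are skipped because their second and fourth
    fields are not 'yes' or 'no'.
    """
    flagged = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        if fields[1] not in ("yes", "no") or fields[3] not in ("yes", "no"):
            continue
        if fields[3] == "yes":
            flagged.append(fields[0])
    return flagged


# Sample rows rebuilt from the values shown in the guide excerpt.
SAMPLE_LISTING = """\
recovery group  active  current or master server  service  vdisks  remarks
BB01L           yes     server01.gpfs.net         yes      3
BB01R           yes     server02.gpfs.net         no       3
"""

if __name__ == "__main__":
    print(groups_needing_service(SAMPLE_LISTING))
```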

Page 20: ...roup BB01L --pdisk e2s11. The drive that is associated with the pdisk name in the previous command must have the amber LED ON to indicate that it is safe to remove this drive. 3. Remove the drive that has the amber LED ON. a. Press the blue touchpoint to unlock the latching handle, as shown in Figure 12 on page 8. Figure 12. Unlocking the drive and release latch. b. Lift the hand...

Page 21: ...sk: Pdisk e2s11 of RG BB01L successfully replaced. mmvdisk: Carrier resumed. Repeat steps 2 through 5 for each pdisk that needs to be replaced, as marked in the output of the mmvdisk pdisk list --recovery-group {all | RgName[,RgName...]} --replace command. 6. Run common system health check by using the SSR utility. Use the mmvdisk command to check and confirm that no failed disk is reported. Removing and replacing ...

Page 22: ...rrier assembly is composed of the drive or drive blank and a drive carrier, and is used to provide for controlled insertion into and extraction from the storage enclosure. Drive carrier assemblies are installed from the front of the enclosure, which simplifies service access. Closing the drive carrier handle ensures complete seating of the connectors. Figure 16. Displaying drive carrier assembly. Servic...

Page 23: ...e enclosure, as shown in Figure 19 on page 11. Figure 19. Removing drive filler. 2. Replace with the drive filler replacement part. a. Press the release catch on the drive carrier in the direction of the arrow to open the handle, as identified in Figure 20 on page 11. The cam releases from the locked position. Figure 20. Releasing catch. b. Hold the drive blank the correct way up, as shown in Figure 21 on page ...

Page 24: ... with the face of the enclosure. Figure 22. Inserting the drive filler. Removing and replacing a power module: Refer to the service procedure to remove and replace a power module of an IBM Elastic Storage System 3200 enclosure. The following steps are the high-level flow of the procedure: 1. Follow the appropriate steps to prepare for a service action to release the CMA arm. 2. Confirm the faulty power mod...

Page 25: ...e redundancy. The power module provides 12 V power and 3.3 V standby power to the system and includes an internal cooling fan. Refer to Figure 23 on page 13 to view the location of each power module on the rear side of the enclosure. When you remove or replace this part, you must follow recommended procedures for handling ESD-sensitive devices. CAUTION: The power module contains no power switch for Off ...

Page 26: ...ule LED. Each power module of an IBM Elastic Storage System 3200 enclosure has a bi-color LED, as shown in Figure 24 on page 14. The LED indicates the status of the power module, as described in Table 2 on page 14. Figure 24. Displaying fan LED. Table 2. IBM Elastic Storage System 3200 power module status LEDs (Green Color, Amber Color: Behavior). ON, OFF: Power module ON and OK. OFF, OFF: No AC power to all power ...

Page 27: ...to the faulty power module. Locations of power jacks where the power cables are plugged in are highlighted in the following figure. Figure 25. Displaying power module jacks. Note: Unplug only the power cable that is connected to the specific power module (left or right position) that you intend to replace on the rear of the enclosure. 4. Remove the faulty power module. a. Pull the finger handle until it is at the...

Page 28: ... power module cam locks into place. d. Test to ensure that the power module is properly installed. Pull the finger handle slightly outward to verify that the power module is locked in place. If the power module is easily removed from the enclosure without pushing the release tab, remove the power module and repeat steps 5.a on page 16 through 5.c on page 16, and then retest the installation. 6. Reconnec...

Page 29: ... serviceable position. Take precautions to avoid damage from static electricity. When you remove or replace parts throughout this process, you must follow recommended procedures for handling ESD-sensitive devices. To replace the bezel kit for IBM Elastic Storage System 3200, you require the following tools: #2 Phillips driver to loosen the shipping screws and remove bezel ear caps. The following parts are...

Page 30: ...the left ear cap. a. First, remove the left ear cap from the back side of the left flange by removing the two screws, as highlighted in Figure 28 on page 18, by using a Phillips driver. Figure 28. Displaying left ear cap front view. b. Next, install the left ear cap (part 01LL677), as shown in Figure 29 on page 18. Figure 29. Replacing left ear cap. 4. Remove and replace the right ear cap. a. Remove the right ear ca...

Page 31: ...1. Replacing right ear cap. 5. Return the enclosure to the rack position. With the enclosure in the serviceable position, press the release tab on each inner chassis member rail (see Figure 32 on page 20) to release the rail from the serviceable position. Simultaneously, push the enclosure into the rack. ...

Page 32: ... the enclosure, as demonstrated in Figure 33 on page 20. Figure 33. Securing the enclosure. 7. Place the IBM brand label on the right ear cap. Place the IBM brand label onto the right ear cap within the square area, as highlighted in Figure 34 on page 20. Figure 34. Displaying IBM brand label location. If an MTM/SN label needs to be placed on the left ear cap, contact IBM Quality Hotline for more informa...

Page 33: ...n. If you do not want to lose the quorum, move the quorum to different servers during this procedure. It is suggested to wear an ESD wrist band when you work on the hardware, for example, when inserting NVMe drives. MES upgrade overview: GPFS preferentially uses the new network shared disks (NSDs) to store data of a new file system. GPFS has four new NSDs that are the same as the four original NSDs and the workl...

Page 34: ...ars empty but should have an SSD. To rectify the error, reseat the drive in that slot and rerun the command: mmvdisk server list --disk-topology --node-class ess3200_x86_64_mmvdisk_78E400K. Output columns: node number, server, needs attention, matching metric, disk topology. Row 1: 6, ess3200rw3a-hs.gpfs.ess, no, 100/100, ESS 3200 FN1 24 NVMe. Row 2: 7, ess3200rw3b-hs.gpfs.ess, no, 100/100, ESS 3200 FN1 24 NVMe. Both canisters must show a 24 NVMe di...
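
The requirement that both canisters report a 24 NVMe disk topology can be checked from captured output. The Python sketch below is an illustration under stated assumptions: each data row starts with a numeric node number, followed by the server name, the needs-attention flag, the matching metric, and then the disk topology as free text. The host name punctuation and the `100/100` metric format in the sample are reconstructed, because the excerpt lost its punctuation. The helper name `servers_with_unexpected_topology` is hypothetical.

```python
def servers_with_unexpected_topology(listing: str, expected: str = "24 NVMe") -> list[str]:
    """Return servers whose disk topology does not end with the expected text.

    Assumed row layout: node number, server, needs attention, matching metric,
    disk topology (the rest of the line).
    """
    mismatched = []
    for line in listing.splitlines():
        fields = line.split(None, 4)  # keep the topology column as one string
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # header, separator, or blank line
        server, topology = fields[1], fields[4].strip()
        if not topology.endswith(expected):
            mismatched.append(server)
    return mismatched


# Rows rebuilt from the values shown in the guide excerpt.
SAMPLE_TOPOLOGY = """\
6  ess3200rw3a-hs.gpfs.ess  no  100/100  ESS 3200 FN1 24 NVMe
7  ess3200rw3b-hs.gpfs.ess  no  100/100  ESS 3200 FN1 24 NVMe
"""

if __name__ == "__main__":
    print(servers_with_unexpected_topology(SAMPLE_TOPOLOGY))
```

An empty result means that every listed canister reports the expected topology.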

Page 35: ... id serial number level firmware location drive a8241014065c 78E400K SN5ASN5A SN5ASN5A ess3200rw3a hs Rack Bodhi Rack 1520 U25 26 Enclosure 5141 FN1 78E400K Drive 10 drive a8241014065c 78E400K SN5ASN5A SN5ASN5A ess3200rw3a hs Rack Bodhi Rack 1520 U25 26 Enclosure 5141 FN1 78E400K Drive 11 drive a8241014065c 78E400K SN5ASN5A SN5ASN5A ess3200rw3a hs Rack Bodhi Rack 1520 U25 26 Enclosure 5141 FN1 78E...

Page 36: ...r 1 Flash, 1, 3.84 TB / 3.84 TB, 4 KiB + 8 B, SN5ASN5A; /dev/nvme2n1, S54KNE0NA00269, 3.84TB NVMe Tier 1 Flash, 1, 3.84 TB / 3.84 TB, 4 KiB + 0 B, SN5ASN5A; /dev/nvme3n1, S54KNE0NA00259, 3.84TB NVMe Tier 1 Flash, 1, 3.84 TB / 3.84 TB, 4 KiB + 0 B, SN5ASN5A; /dev/nvme4n1, S54KNE0NA00485, 3.84TB NVMe Tier 1 Flash, 1, 3.84 TB / 3.84 TB, 4 KiB + 0 B, SN5ASN5A; /dev/nvme5n1, S54KNE0NA00255, 3.84TB NVMe Tier 1 Flash, 1, 3.84 TB / 3.84 TB, 4 KiB + 0 B, SN5ASN...

Page 37: ...E400K, e1s07, DA1, 2, 3576 GiB, 2270 GiB, 3.84TB NVMe Tie, ok; ess3200_78E400K, e1s08, DA1, 2, 3576 GiB, 2270 GiB, 3.84TB NVMe Tie, ok; ess3200_78E400K, e1s09, DA1, 2, 3576 GiB, 2272 GiB, 3.84TB NVMe Tie, ok; ess3200_78E400K, e1s10, DA1, 2, 3576 GiB, 2270 GiB, 3.84TB NVMe Tie, ok; ess3200_78E400K, e1s11, DA1, 2, 3576 GiB, 2268 GiB, 3.84TB NVMe Tie, ok; ess3200_78E400K, e1s12, DA1, 2, 3576 GiB, 2270 GiB, 3.84TB NVMe Tie, ok; ess3200_78E400K, e1s1...
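
A pdisk listing like the one above can be scanned for any pdisk whose state is not `ok`. The Python sketch below is illustrative only. It assumes that the second column holds a pdisk name of the form `e<enclosure>s<slot>` (as in `e1s07`) and that the state is the last field on the row; both are inferred from the rows in this excerpt. The helper name `pdisks_not_ok` is hypothetical.

```python
import re

# Pdisk names in the excerpt look like e1s07 or e2s11 (assumed pattern).
PDISK_NAME = re.compile(r"^e\d+s\d+$")


def pdisks_not_ok(listing: str) -> list[tuple[str, str]]:
    """Return (pdisk name, state) pairs for every pdisk whose state is not 'ok'."""
    problems = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 3 or not PDISK_NAME.match(fields[1]):
            continue  # header, separator, or blank line
        state = fields[-1]
        if state != "ok":
            problems.append((fields[1], state))
    return problems


# Rows rebuilt from the values shown in the guide excerpt.
SAMPLE_PDISKS = """\
ess3200_78E400K  e1s07  DA1  2  3576 GiB  2270 GiB  3.84TB NVMe Tie  ok
ess3200_78E400K  e1s08  DA1  2  3576 GiB  2270 GiB  3.84TB NVMe Tie  ok
"""

if __name__ == "__main__":
    print(pdisks_not_ok(SAMPLE_PDISKS))
```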

Page 38: ...s will be created in vdisk set vs_fs3200_2. mmvdisk: (mmcrvdisk) [I] Processing vdisk RG003LG001VS009. mmvdisk: (mmcrvdisk) [I] Processing vdisk RG003LG002VS009. mmvdisk: (mmcrvdisk) [I] Processing vdisk RG003LG003VS009. mmvdisk: (mmcrvdisk) [I] Processing vdisk RG003LG004VS009. mmvdisk: Created all vdisks in vdisk set vs_fs3200_2. mmvdisk: (mmcrnsd) Processing disk RG003LG001VS009. mmvdisk: (mmcrnsd) Processing disk RG003LG002VS0...

Page 39: ... group. Ensure that the drives are not part of an existing recovery group. Formatting a drive that is part of an existing recovery group leads to loss of data. 1. Create a list of drives, the DONOTFORMAT list, that are part of an existing recovery group. 2. Create a list of drives, the FORMAT list, that are NOT part of an existing recovery group and need to be formatted. 3. Ensure that drives in the F...
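
The safety check on the two lists amounts to confirming that they do not overlap. The following Python sketch is illustrative and is not the procedure from the guide. It assumes both lists are available as collections of device paths; the function name and the sample paths are hypothetical.

```python
def drives_unsafe_to_format(format_list, donotformat_list):
    """Return drives that appear in both lists, sorted for stable output.

    Any drive returned here belongs to an existing recovery group and must
    not be formatted.
    """
    return sorted(set(format_list) & set(donotformat_list))


if __name__ == "__main__":
    # Hypothetical device paths, for illustration only.
    donotformat = ["/dev/nvme0n1", "/dev/nvme1n1"]
    to_format = ["/dev/nvme14n1", "/dev/nvme15n1"]
    overlap = drives_unsafe_to_format(to_format, donotformat)
    if overlap:
        print("Do not proceed, these drives are in a recovery group:", overlap)
    else:
        print("No overlap between the FORMAT and DONOTFORMAT lists")
```

An empty result is a necessary condition only. It does not prove that the DONOTFORMAT list itself is complete.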

Page 40: ...lash, 1, 3.84 TB / 3.84 TB, 4 KiB + 0 B, SN1OSN1O; /dev/nvme14n1, S43RNX0M903001, 3.84TB NVMe G3 Tier 1 Flash, 1, 3.84 TB / 3.84 TB, 4 KiB + 0 B, SN1OSN1O; /dev/nvme15n1, S43RNX0M903025, 3.84TB NVMe G3 Tier 1 Flash, 1, 3.84 TB / 3.84 TB, 4 KiB + 0 B, SN1OSN1O; /dev/nvme16n1, S43RNX0M903027, 3.84TB NVMe G3 Tier 1 Flash, 1, 3.84 TB / 3.84 TB, 4 KiB + 0 B, SN1OSN1O; /dev/nvme17n1, S43RNX0M903215, 3.84TB NVMe G3 Tier 1 Flash, 1, 3.84 TB / 3.84 TB, 4 KiB...

Page 41: ...luster: 10. Number of remote nodes joined in this cluster: 0. Number of quorum nodes defined in the cluster: 6. Number of quorum nodes active in the cluster: 6. Quorum = 4, Quorum achieved. 2. Shut down GPFS on a canister by issuing the following command: mmshutdown -N node name. 3. Start GPFS on a canister by issuing the following command: mmstartup -N node name. 4. Check the state of GPFS on both canisters by issuin...
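
The quorum figures in the sample output (6 quorum nodes defined, quorum of 4) follow the usual node quorum rule of one more than half of the defined quorum nodes. The Python sketch below works through that arithmetic. It assumes plain node quorum (no tiebreaker disks), and the function names are hypothetical.

```python
def quorum_threshold(defined_quorum_nodes: int) -> int:
    """Minimum number of active quorum nodes: one more than half of those defined."""
    return defined_quorum_nodes // 2 + 1


def quorum_achieved(active_quorum_nodes: int, defined_quorum_nodes: int) -> bool:
    """True if enough quorum nodes are active to keep the cluster running."""
    return active_quorum_nodes >= quorum_threshold(defined_quorum_nodes)


if __name__ == "__main__":
    # Values from the sample output: 6 defined, 6 active.
    print(quorum_threshold(6))
    print(quorum_achieved(6, 6))
```

With 6 quorum nodes defined, shutting down GPFS on one canister at a time leaves 5 active, which still meets the threshold of 4.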

Page 43: ... PCIe Gen3 NVMe Flash Drive: 01LL831; Drive Filler: 01LL731. FRU part number list: The FRU part numbers are listed in the table FRU Part Number. Table 4. FRU Part Numbers (Description: Part Number). Complete rail kit: 01LL740; Cable Management Assembly (CMA): 01LL743; Fan module: 01LL738; Canister top lid FRU kit: 01LL753; Server module canister FRU kit: 01LL660; Trusted Platform Module (TPM) FRU kit: 01LL739; 64GB memory D...

Page 44: ...Number. Canister Screw Set Kit: 01LL788. Note: This part is required when servicing the PCIe riser, PCIe adapter, DIMM, coin battery, TPM, and boot drives. The part is already included in all canister part FRU kits. ...

Page 45: ...only used by screen readers; keys that are discernible by touch but do not activate just by touching them; industry-standard devices for ports and connectors; the attachment of alternative input and output devices. IBM Documentation and its related publications are accessibility-enabled. Keyboard navigation: This product uses standard Microsoft Windows navigation keys. IBM and accessibility: See the IBM H...

Page 47: ...S MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions; therefore, this statement may not apply to you. This information could incl...

Page 48: ...s in any form without payment to IBM, for the purposes of developing, using, marketing, or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these pr...

Page 49: ...ks of these publications, or reproduce, distribute, or display these publications or any portion thereof outside your enterprise, without the express consent of IBM. Rights: Except as expressly granted in this permission, no other permissions, licenses, or rights are granted, either express or implied, to the Publications or any information, data, software, or other intellectual property contained therein. IBM r...

Page 51: ...ee central processor complex (CPC). central processor complex (CPC): A physical collection of hardware that consists of channels, timers, main storage, and one or more central processors. cluster: A loosely coupled collection of independent systems, or nodes, organized into a network for the purpose of sharing resources and communicating with each other. See also GPFS cluster. cluster manager: The node that monit...

Page 52: ...a SAS expander that attaches to the storage enclosure drives. In the case of multiple drawers in a storage enclosure, the ESM attaches to drawer control modules. ESM: See environmental service module (ESM). F. failback: Cluster recovery from failover following repair. See also failover. failover: (1) The assumption of file system duties by another node when a node fails. (2) The process of transferring all control...

Page 53: ... qualified domain name (FQDN): The complete domain name for a specific computer, or host, on the Internet. The FQDN consists of two parts: the hostname and the domain name. G. GPFS cluster: A cluster of nodes defined as being available for use by GPFS file systems. GPFS portability layer: The interface module that each installation must build for its specific hardware platform and Linux distribution. GPFS Stor...

Page 54: ...ting system that contains programs for such tasks as input/output, management and control of hardware, and the scheduling of user tasks. L. LACP: See Link Aggregation Control Protocol (LACP). Link Aggregation Control Protocol (LACP): Provides a way to control the bundling of several physical ports together to form a single logical channel. logical partition (LPAR): A subset of a server's hardware resources virtu...

Page 55: ... An individual operating system image within a cluster. Depending on the way in which the computer system is partitioned, it can contain one or more nodes. In a Power Systems environment, synonymous with logical partition. node descriptor: A definition that indicates how ESS uses a node. Possible functions include: manager node, client node, quorum node, and non-quorum node. node number: A number that is gener...

Page 56: ...ther without involving either one's operating system. This permits high-throughput, low-latency networking, which is especially useful in massively parallel computer clusters. RGD: See recovery group data (RGD). remote key management server (RKM server): A server that is used to store master encryption keys. RG: See recovery group (RG). recovery group data (RGD): Data that is associated with a recovery group. RKM ser...

Page 57: ...T. TCP: See Transmission Control Protocol (TCP). Transmission Control Protocol (TCP): A core protocol of the Internet Protocol Suite that provides reliable, ordered, and error-checked delivery of a stream of octets between applications running on hosts communicating over an IP network. V. VCD: See vdisk configuration data (VCD). vdisk: A virtual disk. vdisk configuration data (VCD): Configuration data that is associat...

Page 59: ...ts xi. D: documentation on web ix. I: information overview ix. L: license inquiries 35. N: notices 35. O: overview of information ix. P: patent information 35; preface ix. R: resources on web ix. S: submitting xi. T: trademarks 36. W: web documentation ix; web resources ix. ...

Page 64: ...IBM Product Number: 5765-DME, 5765-DAE. SC27-9862-00 ...
