
ClusterStor™ MMU/AMMU Addition

(3.3)

H-6167

Summary of Contents for ClusterStor H-6167

Page 1: ...ClusterStor MMU/AMMU Addition (3.3) H-6167 ...

Page 2: ...lete the MMU/AMMU Addition Procedure 21; Sample Output for configure_mds 27; Tips and Tricks 32; Failed Recovery or MMU/AMMU Added Out of Sequence 32; Node Stuck in Discovery Prompt 33; Reconnect a Screen Session 34; Check Puppet Certificate 35; SSH Connection Refused Puppet Certificates 35; SSH Connection Refused MDS/OSS Node Not Fully Booted 35; No Free Arrays Wrong md Assignment 36; Check ARP Local Hosts...

Page 3: ...n Publication Title, Date, Updates: ClusterStor MMU/AMMU Addition (3.3) H-6167, December 2019, release of 3.3 version; ClusterStor MMU/AMMU Addition (3.2) H-6167, August 2019, release of 3.2 version; ClusterStor MMU/AMMU Addition (3.1) H-6167, February 2019, release of 3.1 version. Scope and Audience: The procedure in this publication is intended to be performed only by qualified Cray personnel. Typographic Conventio...

Page 4: ...d throughout this publication are shown with a generic filesystem name of cls12345. Trademarks: © 2019 Cray Inc. All rights reserved. All trademarks used in this document are the property of their respective owners. About ClusterStor MMU and AMMU Addition H-6167 ...

Page 5: ... MDTs operating through multiple MDS nodes to be configured and to operate as part of a single file system. This feature allows the number of metadata operations per second within a cluster to scale beyond the capabilities of a standard storage system's single MDS. Achieving this capability requires that the file system namespace be configured manually so that file system operations are evenly dist...

Page 6: ... Cooling Modules (PCMs). An AMMU is a 4U enclosure assembly of servers and storage that consists of the following: two (2) discrete 1U servers, each server with dual power supplies and a local disk for HA state; one (1) 2U24 EBOD consisting of 22 10K RPM HDDs (ten for each MDT and two global hot spares); two (2) PCMs. MMUs and AMMUs support 900GB or 1.8TB HDDs. Lustre MDS Functionality: Each MMU/AMMU presents as tw...

Page 7: ... Command Line Interface; CSCLI: ClusterStor Command Line Interface; DHCP: Dynamic Host Configuration Protocol; DNE: Distributed Namespace; EBOD: Expanded Bunch of Disks; FGR: Fine-grained Routing; GUI: Graphical User Interface; HA: High Availability; HPC: High Performance Computing; IBS0/IBS1, OPA0/OPA1, 40G0/40G1, or 100G0/100G1: Notation for the LDN switches (InfiniBand, Omni-Path, or high-speed Ethernet); IPMI: Intellig...

Page 8: ...DN. Power on the MMU/AMMU nodes and run the discovery procedure, which provisions MMU/AMMU network interfaces and storage volumes. Register MDS/MDT components with the Lustre Management Server (MGS) and configure them as part of the file system. To add more than one (1) MMU at a time, it is recommended that the entire addition procedure be performed and completed sequentially for each new MMU. Do not procee...

Page 9: ...red to complete the Customer Wizard installation prior to performing MMU/AMMU Addition. The storage system must be in Daily Mode before the MMU/AMMU Addition procedure is performed. This procedure requires the use of a single Lustre client system to perform end-to-end verification of some procedures. Lustre client systems are not provided as part of the storage system. Usually, a compute node in an HPC...

Page 10: ...ge rack, the MMU enclosure should occupy rack positions 37 and 38. An AMMU enclosure should occupy rack positions 35 through 38. See the following diagram. Figure 2: Storage Racks with MMU and AMMU Enclosures Added. Add an MMU or AMMU to a ClusterStor L300 and L300N System H-6167 ...

Page 11: ...to 70 minutes. The timings above do not take into consideration additional time that is needed when adding an MMU/AMMU to a running system. In such a case, before the new MDS nodes can be used on the system, Lustre must be stopped and beSystemNetConfig should be re-run to ensure that proper NIDs and parameters are applied to the newly added MDS nodes. The time required for this is approximately 15 to 2...

Page 12: ...U EAC Module MMU (cabling diagram port labels 1 through 36) Figure 4: AMMU Server Cabling. For recommended port assignments for both the MMU and AMMU, contact Cray Support to obtain internal cabling documentation. ...

Page 13: ...MDT0000 exports 172.18.1.188@o2ib uuid 2412630a-db8d-806a-09eb-0690c8e1e86b. It is recommended that every effort be made to identify and stop Lustre clients before performing the MMU/AMMU Addition procedure. In the sample output above, 172.18.1.188@o2ib (shown underscored) represents a client that still has the Lustre file system mounted and needs to be addressed. See Step 3 on page 14 if it is necessar...

Page 14: ...SS nodes and OSTs is disabling. The displayed values change to disabled once the cache is completely flushed. c. Wait until the value displayed in the Caching State field is disabled for all OSS nodes and OSTs before proceeding to the next step. admin cls12345n000 cscli nxd list Host Cache Caching Total Cache Cache Cache Bypass Group State Cache Size Block Window IO Size In Use Size Size Size cls12345...

Page 15: ...2345n004 oss 0 2 cls12345n005 dev md0 dev md2 cls12345n005 oss 0 2 cls12345n004 dev md1 dev md3 8. Following addition of the MMU/AMMU, it is possible for the pool of IP addresses that are available for LDN use to become exhausted. To prevent this issue, check the range of permitted IP addresses and, if it is too small, increase it. First, determine the number of nodes on the system after MMU/AMMU addition...
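The address pool check described in Step 8 comes down to simple arithmetic: count the addresses in the permitted range and compare that count with the number of nodes expected after the addition. The following Python sketch illustrates the comparison only. The function names and the sample addresses are hypothetical, and the sketch does not query a ClusterStor system or replace the cscli commands in the procedure.

```python
import ipaddress


def addresses_in_range(first: str, last: str) -> int:
    """Count the IPv4 addresses from first to last, inclusive."""
    lo = int(ipaddress.IPv4Address(first))
    hi = int(ipaddress.IPv4Address(last))
    if hi < lo:
        raise ValueError("range end precedes range start")
    return hi - lo + 1


def range_is_sufficient(first: str, last: str, node_count: int) -> bool:
    """Return True if the range holds at least one address per node."""
    return addresses_in_range(first, last) >= node_count


# Hypothetical values for illustration; not taken from a real system.
print(addresses_in_range("172.31.173.1", "172.31.173.254"))
print(range_is_sufficient("172.31.173.1", "172.31.173.10", 12))
```

If the second call returns False, the range would need to be extended before the new nodes are added.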

Page 16: ...tend_range i 1 t 172.31.173.254 11. Verify the new range settings: admin cls12345n000 cscli lustre_network find_gaps The range maximum value should match the value from Step 8 on page 15. 12. Validate the new address range to ensure it is compatible with the netmask and does not overlap with other networks: admin cls12345n000 sudo su root cls12345n000 opt xyratex bin beSystemNetConfig sh V nodeattr VU ...

Page 17: ...w the list of new nodes, that is, the nodes booted into discovery mode: MGMT0 cscli show_new_nodes For example: admin cls12345n000 cscli show_new_nodes Hostname MAC IPMI Free arrays Assigned arrays Pass/Fail 00:50:CC:79:0A:66 172.16.0.111 Passed 00:50:CC:79:20:E0 172.16.0.112 Pending New nodes listed with a Pass/Fail status of Passed have booted into discovery mode and are ready. New nodes with a Pass ...
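When many nodes are discovered at once, it can help to pick out the rows that are not yet Passed. The Python sketch below parses text shaped like the show_new_nodes example. It assumes each data row contains a colon-separated MAC address and ends with the status word, which is a guess at the table layout because this excerpt has lost the original column alignment. The function names are hypothetical.

```python
import re

# A MAC address written as six colon-separated hex pairs.
MAC_RE = re.compile(r"^(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$")


def parse_new_nodes(output: str) -> dict:
    """Map each MAC address to the status word at the end of its row."""
    nodes = {}
    for line in output.splitlines():
        fields = line.split()
        # Assumption: a data row holds exactly one MAC and ends with the status.
        macs = [field for field in fields if MAC_RE.match(field)]
        if macs:
            nodes[macs[0]] = fields[-1]
    return nodes


def not_ready(nodes: dict) -> list:
    """Return the MAC addresses whose status is anything other than Passed."""
    return sorted(mac for mac, status in nodes.items() if status != "Passed")


sample = """Hostname MAC IPMI Free arrays Assigned arrays Pass/Fail
00:50:CC:79:0A:66 172.16.0.111 Passed
00:50:CC:79:20:E0 172.16.0.112 Pending"""

print(not_ready(parse_new_nodes(sample)))
```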

Page 18: ...nformation configure_hosts Waiting until node reboots configure_hosts done The cscli configure_hosts command allows any host name to be set, as long as it is unique and not already in use. The host names do not have to be contiguous numbers. For each new MDS node pair, it is important that the host name assignment and subsequent reboot occur within a short period of time of each other. Chaining the hos...

Page 19: ...sh commands in this bullet will not work on a system that uses Intel Omni-Path switches. sudo ibhosts NOTE: The sudo ibhosts command in this bullet will not work on a system that uses Intel Omni-Path switches. pdsh -w new_nodes ethtool eth20 | grep Speed or pdsh -w new_nodes ibstat | grep Link Notes: Run all commands from the primary MGMT node. For example: MGMT0 cat /etc/hosts The IP addresses for BMC, IB, and ...

Page 20: ... ib0 Example output for command pdsh -a date | sort: admin cls12345n000 pdsh -a date | sort cls12345n000: Wed Sep 4 12:11:32 PDT 2013 cls12345n001: Wed Sep 4 12:11:32 PDT 2013 cls12345n002: Wed Sep 4 12:11:32 PDT 2013 cls12345n003: Wed Sep 4 12:11:32 PDT 2013 cls12345n004: Wed Sep 4 12:11:32 PDT 2013 cls12345n005: Wed Sep 4 12:11:32 PDT 2013 cls12345n006: Wed Sep 4 12:11:32 PDT 2013 cls12345n007: Wed Sep 4 12:11...
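The date comparison in this example can be done by eye on a small system. On a larger one, a short script can report the largest difference between node clocks. The Python sketch below assumes each line has the usual pdsh form of a host name, a colon, and a date string such as Wed Sep 4 12:11:32 PDT 2013. It ignores the time zone field, so it also assumes all nodes report the same zone. The function names and the second sample timestamp are hypothetical.

```python
from datetime import datetime


def parse_pdsh_dates(output: str) -> dict:
    """Map each host name to the timestamp it reported."""
    times = {}
    for line in output.splitlines():
        host, sep, rest = line.partition(": ")
        parts = rest.split()
        # Expected fields: weekday, month, day, HH:MM:SS, zone, year.
        if not sep or len(parts) != 6:
            continue
        _weekday, month, day, clock, _zone, year = parts
        times[host] = datetime.strptime(
            f"{month} {day} {clock} {year}", "%b %d %H:%M:%S %Y"
        )
    return times


def max_skew_seconds(times: dict) -> float:
    """Return the gap in seconds between the earliest and latest timestamps."""
    if not times:
        return 0.0
    values = list(times.values())
    return (max(values) - min(values)).total_seconds()


# The second timestamp is altered for illustration.
sample = """cls12345n000: Wed Sep 4 12:11:32 PDT 2013
cls12345n001: Wed Sep 4 12:11:35 PDT 2013"""

print(max_skew_seconds(parse_pdsh_dates(sample)))
```

Because pdsh collects the output over the network, a difference of a second or so does not by itself indicate a clock problem.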

Page 21: ...Band or Omni-Path connectivity, run: admin cls12345n000 pdsh w cls12345n006 cls12345n007 ibstat grep Link cls12345n006 Link layer InfiniBand cls12345n007 Link layer InfiniBand Complete the MMU/AMMU Addition Procedure. About this task: Once the MMU/AMMU has been powered up and MDS node discovery has been successful, perform these remaining steps to complete the MMU/AMMU Addition procedure. Procedure: 1. Op...

Page 22: ...inutes for the two new MDS nodes in an MMU/AMMU. Once configuration is complete, the new nodes automatically default to rack777 at location 1U; they must be moved to the current rack location or a new one before power cycling the nodes. 4. Verify the current rack list: admin cls12345n000 cscli rack list Rack R1C1 5 enclosure(s) Rack R1C2 2 enclosure(s) Rack rack777 2 enclosure(s) 5. Verify the next availab...
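A quick way to confirm whether anything is still parked in the default rack is to parse the rack list and look for rack777. The Python sketch below assumes each line reads "Rack NAME N enclosure(s)", with an optional colon after the rack name, since the exact punctuation is not recoverable from this excerpt. The function name is hypothetical.

```python
import re

# Matches "Rack R1C1 5 enclosure(s)" and "Rack R1C1: 5 enclosure(s)".
RACK_RE = re.compile(r"^Rack\s+(\S+?):?\s+(\d+)\s+enclosure")


def parse_rack_list(output: str) -> dict:
    """Map each rack name to its enclosure count."""
    racks = {}
    for line in output.splitlines():
        match = RACK_RE.match(line.strip())
        if match:
            racks[match.group(1)] = int(match.group(2))
    return racks


sample = """Rack R1C1 5 enclosure(s)
Rack R1C2 2 enclosure(s)
Rack rack777 2 enclosure(s)"""

racks = parse_rack_list(sample)
# A nonzero count means enclosures still need to be moved out of rack777.
print(racks.get("rack777", 0))
```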

Page 23: ...ollow the remaining steps to add the new MMU or AMMU to the new rack. 6. Move the new nodes from rack777 to their new location. This example shows moving the new nodes to R1C2: MGMT0 cscli rack move location NEW_LOCATION serial_no SERIAL_NO This example moves the new nodes (enclosure serial number SHM1007184RC3P9) to rack R1C2, position 37U: admin cls12345n000 cscli rack move serial_no SHM1007184RC3P9 loc...

Page 24: ...on. Wait until the nodes are booted into the regular appliance image before continuing to the next step. Use the command conman hostname to monitor the node boot process. 10. Check to see if nodes are up: MGMT0 pdsh -a uptime For example: admin cls12345n000 pdsh -a uptime cls12345n006: ssh: connect to host cls12345n006 port 22: Connection refused pdsh@cls12345n000: cls12345n006: ssh exited with exit code 255 c...
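Nodes that are still booting show up in the pdsh output as refused SSH connections, as in the example for Step 10. The Python sketch below lists those nodes so they can be rechecked later. It assumes the standard OpenSSH and pdsh message forms with their colons intact. The function name and the sample uptime line are hypothetical.

```python
import re

# Standard OpenSSH refusal message, prefixed by pdsh with the target host name.
REFUSED_RE = re.compile(
    r"^(\S+): ssh: connect to host \S+ port 22: Connection refused"
)


def refused_nodes(output: str) -> list:
    """Return the sorted host names whose SSH connection was refused."""
    hosts = set()
    for line in output.splitlines():
        match = REFUSED_RE.match(line.strip())
        if match:
            hosts.add(match.group(1))
    return sorted(hosts)


# The uptime line for cls12345n000 is made up for illustration.
sample = """cls12345n006: ssh: connect to host cls12345n006 port 22: Connection refused
pdsh@cls12345n000: cls12345n006: ssh exited with exit code 255
cls12345n000: 12:11:32 up 10 days, 1 user, load average: 0.00, 0.01, 0.05"""

print(refused_nodes(sample))
```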

Page 25: ...l nodes, potentially reconfiguring FGR on those nodes: admin cls12345n000 sudo su root cls12345n000 opt xyratex bin beSystemNetConfig sh D c path to lnet conf i path to ip2nets r path to routes nodeattr VU cluster root cls12345n000 exit admin cls12345n000 13. Start Lustre on the system: MGMT0 cscli mount -f filesystem_name For example: admin cls12345n000 cscli mount -f cls12345 mount MGS is starting moun...

Page 26: ...equested by admin In progress 30 Total 2 bundles For more information about working with support bundles, see the CSM online help. 16. Verify and update firmware, if required. a. Update GEM GOBI firmware on systems running releases later than 2.1.0 SU 004. For additional information about updating GEM GOBI firmware, please contact Cray Support. b. Update GEM USM firmware for systems running releases prior t...

Page 27: ...345n006 maxencs 256 cls12345n006 nossd_ok 0 cls12345n006 disklim 0 cls12345n006 stayactive 0 cls12345n006 mkfs no cls12345n006 cmu_no_external 1 cls12345n006 r15k_rots_needed 0 cls12345n006 prod_id cls12345n006 external_per_encl 0 cls12345n006 t10format no cls12345n006 t10format_arg cls12345n006 t10dif cls12345n006 op_mode cls12345n006 no_fw_check 0 cls12345n006 sentinel 0 cls12345n006 NXD true cl...

Page 28: ...wn 0x5000c5008dfdb153 dev disk by id wwn 0x5000c5008dfe876f dev disk by id wwn 0x5000c5008dfdee1b dev disk by id wwn 0x5000c5008dfe155f dev disk by id wwn 0x5000c5008dfe070f dev disk by id wwn 0x5000c5008dfdfb9f cls12345n006 created raid10 array dev md cls12345n006 md1 cls12345n006 Now see if the raid10 arrays showed up cls12345n006 Success mdadm create force name cls12345n006 md1 size 849609344 r...

Page 29: ...e 4096 i 4096 I 1024 q O dirdata uninit_bg extents mmp dir_nlink quota huge_file flex_bg E lazy_journal_init lazy_itable_init 0 F dev md1 1062011680 local Writing CONFIGS mountdata local local Permanent disk data local Target testfs MDT0002 local Index 2 local Lustre FS testfs local Mount type ldiskfs local Flags 0x61 local MDT first_time update local Persistent mount opts errors remount ro user_x...

Page 30: ... var lib pacemaker cib cib 6 raw sig cls12345n007 var lib pacemaker cib cib 7 raw cls12345n007 var lib pacemaker cib cib 7 raw sig cls12345n007 var lib pacemaker cib cib 8 raw cls12345n007 var lib pacemaker cib cib 8 raw sig cls12345n007 var lib pacemaker cib cib 9 raw cls12345n007 var lib pacemaker cib cib 9 raw sig cls12345n007 var lib pacemaker cib cib last cls12345n007 var lib pacemaker cib ci...

Page 31: ...tal file list cls12345n006 var lib mdraidscripts cls12345n006 var lib mdraidscripts dummy cls12345n006 var lib mdraidscripts initial crm cls12345n006 var lib mdraidscripts mdadm conf cls12345n006 var lib mdraidscripts mdadm conf 1532036847 cls12345n006 var lib mdraidscripts mdrc tar xz cls12345n007 cls12345n007 sent 2133 bytes received 56 bytes 4378 00 bytes sec cls12345n007 total size is 1918 spe...

Page 32: ...ure to use when node does not come online and SSH fails. No Free Arrays Wrong md Assignment on page 36: Command to clear assignment if RAID arrays show as assigned. Check ARP Local Hosts and DHCP on page 37: Commands to check during node discovery. Post Upgrade Check on page 37: Alternative command to verify nodes were added. Test Mount Lustre File System on Node 001 on page 37: Test on secondary MGMT nod...

Page 33: ...k Run the following procedure when a node is stuck in the discovery prompt. Procedure: 1. Log in to the node that is stuck in the discovery prompt: admin customer_admin_password 2. Start the remote console via serial cable connected to the node, or remote console via ipmitool: Node ipmitool e I lanplus H IPMI_IP_Address U admin P admin sol activate where IPMI_IP_address is the BMC IP address of the troub...

Page 34: ...ion in the event of a disconnect event. If there is a disconnect problem, follow these steps to reconnect by using the terminal screen interface. Procedure: 1. Log in to the primary MGMT node via SSH: Client ssh -l admin primary_MGMT_node 2. Run the following command: admin cls12345n000 screen -x If multiple screen sessions are running on the primary MGMT node, the output will be similar to the following: adm...

Page 35: ... D3 D2 BE 44 77 cls12345n006 86 37 84 1B AB 1D 15 66 18 4F 6C E9 1F 8F 30 BC cls12345n007 D8 FB 1B CE 17 CE 0D 53 6A E2 AC 7F D1 DB 92 83 cls12345n008 EA 9F AD FD 4C 1E B1 F9 30 02 B8 57 BA 61 5D CF cls12345n009 93 AE 25 D7 D9 E6 75 97 75 ED AA C2 CF 16 AE EA cls12345n010 F6 38 AE BB FA E6 3E 27 5A 90 E7 60 FF 5E CF E6 cls12345n011 93 16 69 8C E4 ED BC 9A 4A 55 72 94 43 48 86 17 SSH Connection Ref...

Page 36: ... 6.2 (Carbon) Kernel 2.6.32-220.7.1.el6.lustre.4119.x86_64 on an x86_64 cls12345n006 login: username Password: userpassword NOTE: The preferred method for this procedure is to use the conman command to access the MDS or OSS node. If that fails to connect to the node, connect a serial cable to the actual node serial port, log in to the node, run the commands in Steps 4 and 5, and then proceed to Ste...

Page 37: ...DT0001_UUID ACTIVE 2: lustre-MDT0002_UUID ACTIVE 3: lustre-MDT0003_UUID ACTIVE Test Mount Lustre File System on Node 001. About this task: It is recommended that a Lustre client system be used to perform a test mount of the Lustre file system. The following instructions and examples illustrate a test mount from the secondary MGMT node. Procedure: 1. Obtain the Lustre Network Identifier (NID) of the MGS node...
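The MDT status listing at the start of this excerpt can be checked mechanically once the new MDTs are expected to be online. The Python sketch below assumes each row has the form "index: name_UUID STATE", which is how the Lustre lfs utility normally prints MDT status. The function name and the INACTIVE row in the sample are hypothetical.

```python
import re

# One status row: index, target name with a _UUID suffix, then the state.
MDT_RE = re.compile(r"^(\d+):\s+(\S+)_UUID\s+(\S+)$")


def inactive_mdts(output: str) -> list:
    """Return the names of MDTs whose state is anything other than ACTIVE."""
    inactive = []
    for line in output.splitlines():
        match = MDT_RE.match(line.strip())
        if match and match.group(3) != "ACTIVE":
            inactive.append(match.group(2))
    return inactive


# The INACTIVE row is made up to show a failing case.
sample = """0: lustre-MDT0000_UUID ACTIVE
1: lustre-MDT0001_UUID ACTIVE
2: lustre-MDT0002_UUID INACTIVE
3: lustre-MDT0003_UUID ACTIVE"""

print(inactive_mdts(sample))
```

An empty list means every MDT in the listing reported ACTIVE.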

Page 38: ...ount 4. Verify the success of the mount command: admin cls12345n001 lfs df -h The example output in Step 5 shows a successfully mounted Lustre file system. 5. Unmount the Lustre file system from the temporary mount point: admin cls12345n001 umount temp_mount_point For example: admin cls12345n001 umount root testmount The following sample output incorporates all of the commands in the preceding steps 2 t...

Page 39: ...ordered proc bus usb on proc bus usb type usbfs rw relatime dev mapper vg0 lv_bkup on bkup type ext3 rw relatime errors continue user_xattr acl barrier 1 data ordered dev md127p1 on boot type ext3 rw relatime errors continue user_xattr acl barrier 1 data ordered none on dev shm type tmpfs rw relatime none on proc sys fs binfmt_misc type binfmt_misc rw relatime sunrpc on var lib nfs rpc_pipefs type...
