Maintain

ONTAP Systems

NetApp
November 23, 2021

This PDF was generated from https://docs.netapp.com/us-en/ontap-systems/fas8200/bootmedia-replace-overview.html on November 23, 2021. Always check docs.netapp.com for the latest.

Summary of Contents for FAS8200 Series

Page 2: ...caching module - FAS8200: 25; Chassis: 36; Controller: 50; Replace a DIMM - FAS8200: 77; Swap out a fan - FAS8200: 89; Replace the NVMEM battery - FAS8200: 91; Replace a PCIe card - FAS8200: 102; Swap out a power supply - FAS8200: 111; Replace the real-time clock battery - FAS8200: 113 ...

Page 3: ...ese steps on the correct node: The impaired node is the node on which you are performing maintenance. The healthy node is the HA partner of the impaired node. Check onboard encryption keys as needed - FAS8200. Prior to shutting down the impaired node and checking the status of the onboard encryption keys, you must check the status of the impaired node, disable automatic giveback, and check what version of ...

Page 4: ...m has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration. Steps: 1. Connect the console cable to the impaired node. 2. Check whether NVE is configured for any volumes in the cluster: volume show -is-encrypted true. If any volumes are listed in the output, NVE is configured and you need to verify the NVE configuration. If no volumes are lis...

Page 5: ...l need it in disaster scenarios where you might need to manually recover OKM. Return to admin mode: set -priv admin. Shut down the impaired node. b. If the Restored column displays anything other than yes: Run the key-manager setup wizard: security key-manager setup -node target/impaired node name. Enter the customer's onboard key management passphrase at the prompt. If the passphrase cannot be provided, cont...

Page 6: ... manager key show -detail a. If the Restored column displays yes, manually back up the onboard key management information: Go to advanced privilege mode and enter y when prompted to continue: set -priv advanced. Enter the command to display the OKM backup information: security key-manager backup show. Copy the contents of the backup information to a separate file or your log file. You'll need it in disaster ...

Page 7: ...ation 1. Display the key IDs of the authentication keys that are stored on the key management servers: security key-manager query. If the Key Manager type displays external and the Restored column displays yes, it's safe to shut down the impaired node. If the Key Manager type displays onboard and the Restored column displays yes, you need to complete some additional steps. If the Key Manager type display...
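
The two fully stated cases above (external or onboard key manager, with the Restored column showing yes) can be sketched as a small decision helper. This is an illustrative sketch only: the function name and return strings are hypothetical, and any case the snippet truncates is deliberately left unhandled rather than guessed.

```python
def shutdown_guidance(key_manager_type: str, restored: str) -> str:
    """Map the two cases stated in the text to a next action.

    Both arguments are values read manually from the key-manager query
    output; the names and return strings here are hypothetical.
    """
    kind = key_manager_type.strip().lower()
    is_restored = restored.strip().lower() == "yes"

    if kind == "external" and is_restored:
        return "safe to shut down the impaired node"
    if kind == "onboard" and is_restored:
        return "complete the additional onboard key manager steps first"
    # Every other combination is outside what this sketch covers.
    return "not covered by this sketch; follow the full procedure"


if __name__ == "__main__":
    print(shutdown_guidance("external", "yes"))
    print(shutdown_guidance("onboard", "yes"))
```

Returning a fallback string for every other combination keeps the helper from implying an answer the excerpt does not give.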

Page 8: ...gement backup information: security key-manager onboard show-backup f. Copy the contents of the backup information to a separate file or your log file. You'll need it in disaster scenarios where you might need to manually recover OKM. g. Return to admin mode: set -priv admin h. You can safely shut down the node. Verify NSE configuration 1. Display the key IDs of the authentication keys that are stored on the...

Page 9: ...yes: a. Enter the onboard security key-manager sync command: security key-manager onboard sync. Enter the customer's onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support (mysupport.netapp.com). b. Verify the Restored column shows yes for all authentication keys: security key-manager key query c. Verify that the Key Manager type shows onboard, manually ba...

Page 10: ...de and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage. If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy node shows false for eligibility and health, you must correct the issue before shutting down the impaired node; see the Administration overview with the CLI. If you have a...

Page 11: ... NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI. You must leave the power supplies turned on at the end of this procedure to provide power to the healthy node. Steps 1. Check the MetroCluster status to determine whether the impaired node has automatically switched over to the healt...

Page 12: ...me: 7/25/2016 18:45:55 End Time: 7/25/2016 18:45:56 Errors: - 5. Check the state of the aggregates by using the storage aggregate show command. controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols Nodes RAID Status aggr_b2 227.1GB 227.1GB 0% online 0 mcc1-a2 raid_dp, mirrored, normal 6. Heal the root aggregates by using the metrocluster heal -phase root-aggregates command. mcc1A::> metro...

Page 13: ...omponents inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module. 1. If you are not already grounded, properly ground yourself. 2. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the system cables and SFPs (if needed) from the controller module, keeping track of where the cables w...

Page 14: ... bottom of the controller module as you slide it out of the chassis. Step 2: Replace the boot media. You must locate the boot media in the controller and follow the directions to replace it. 1. If you are not already grounded, properly ground yourself. 2. Locate the boot media using the following illustration or the FRU map on the controller module: ...

Page 15: ... gently push it into the socket. 5. Check the boot media to make sure that it is seated squarely and completely in the socket. If necessary, remove the boot media and reseat it into the socket. 6. Push the boot media down to engage the locking button on the boot media housing. 7. Close the controller module cover. Step 3: Transfer the boot image to the boot media. You can install the system image to the repl...

Page 16: ...s and not in the USB console port. 4. Push the controller module all the way into the system, making sure that the cam handle clears the USB flash drive, firmly push the cam handle to finish seating the controller module, push the cam handle to the closed position, and then tighten the thumbscrew. The node begins to boot as soon as it is completely installed into the chassis. 5. Interrupt the boot process ...

Page 17: ... manual connections: ifconfig e0a -addr=filer_addr -mask=netmask -gw=gateway -dns=dns_addr -domain=dns_domain. filer_addr is the IP address of the storage system. netmask is the network mask of the management network that is connected to the HA partner. gateway is the gateway for the network. dns_addr is the IP address of a name server on your network. dns_domain is the Domain Name System (DNS) domain name. If ...
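
As a rough illustration of how the five parameters described above fit together, the sketch below assembles the LOADER ifconfig line from validated inputs. It assumes the `-addr=`, `-mask=`, `-gw=`, `-dns=`, and `-domain=` option spelling, which is not fully recoverable from this excerpt; the function name and the sample addresses are hypothetical.

```python
import ipaddress


def loader_ifconfig(filer_addr: str, netmask: str, gateway: str,
                    dns_addr: str, dns_domain: str, port: str = "e0a") -> str:
    """Build the ifconfig line for a manual LOADER network configuration.

    The "-name=value" option syntax is an assumption; check it against
    the current NetApp documentation before use.
    """
    # Reject malformed dotted-quad values early; raises ValueError.
    for value in (filer_addr, netmask, gateway, dns_addr):
        ipaddress.IPv4Address(value)
    return (
        f"ifconfig {port} -addr={filer_addr} -mask={netmask} "
        f"-gw={gateway} -dns={dns_addr} -domain={dns_domain}"
    )


if __name__ == "__main__":
    # Documentation-range addresses, used purely as placeholders.
    print(loader_ifconfig("192.0.2.10", "255.255.255.0", "192.0.2.1",
                          "192.0.2.53", "example.com"))
```

Validating the addresses before building the string catches transposed or incomplete values before they are typed at the console.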

Page 18: ... prompted to restore the backup configuration. b. Set the healthy node to advanced privilege level: set -privilege advanced c. Run the restore backup command: system node restore-backup -node local -target-address impaired_node_IP_address d. Return the node to admin level: set -privilege admin e. Press y when prompted to use the restored configuration. f. Press y when prompted to reboot the node. No network conn...

Page 19: ...alse, revert those interfaces back to their home port using the net int revert command. 10. Move the console cable to the repaired node and run the version -v command to check the ONTAP versions. 11. Restore automatic giveback, if you disabled it, by using the storage failover modify -node local -auto-giveback true command. Option 2: Controller is in a two-node MetroCluster. You must boot the ONTAP image from ...

Page 20: ...r switchback operation. This returns the configuration to its normal operating state, with the sync-source storage virtual machines (SVMs) on the formerly impaired site now active and serving data from the local disk pools. This task only applies to two-node MetroCluster configurations. Steps 1. Verify that all nodes are in the enabled state: metrocluster node show. cluster_B::> metrocluster node show DR Conf...

Page 21: ...SE and NVE as needed - FAS8200. Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE), or NetApp Volume Encryption (NVE) enabled. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager, you must restore settings you captured at ...

Page 22: ... show backup command Example of backup data BEGIN BACKUP TmV0QXBwIEtleSBCbG9iAAEAAAAEAAAAcAEAAAAAAADuD byAAAAACEAAAAAAAAA QAAAAAAAAABvOlH0AAAAAMh7qDLRyH1DBz12piVdy9ATSFMT0C0TlYFss4PDjTaV dzRYkLd1PhQLxAWJwOIyqSr8qY1SEBgm1IWgE5DLRqkiAAAAAAAAACgAAAAAAAAA 3WTh7gAAAAAAAAAAAAAAAAIAAAAAAAgAZJEIWvdeHr5RCAvHGclo wAAAAAAAAAA IgAAAAAAAAAoAAAAAAAAAEOTcR0AAAAAAAAAAAAAAAACAAAAAAAJAGr3tJA LRzUQRHwv 1aWvAAAAAAAAA...

Page 23: ...oard key management when prompted. b. Enter the key-manager key show -detail command to see a detailed view of all keys stored in the onboard key manager, and verify that the Restored column shows yes for all authentication keys. If the Restored column shows anything other than yes, contact Customer Support. c. Wait 10 minutes for the key to synchronize across the cluster. 14. If you are running ONTAP 9.6 or later, run ...

Page 24: ...and. If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received. If the command fails because of open CIFS sessions, check with the customer how to close out CIFS sessions. Terminating CIFS can cause loss of data. If the command fails because the partner is "not ready", wait 5 minutes for the NVMEMs to synchronize. If the c...

Page 25: ...to see a detailed view of all keys stored in the onboard key manager. b. Use the security key-manager key show -detail command and verify that the Restored column shows yes for all authentication keys. If the Restored column shows anything other than yes, use the security key-manager setup -node Repaired(Target)node command to restore the Onboard Key Management settings. Rerun the security key-manager key show -detai...

Page 26: ...he target node and run the version -v command to check the ONTAP versions. 8. Restore automatic giveback, if you disabled it, by using the storage failover modify -node local -auto-giveback true command. 9. Use the storage encryption disk show command at the clustershell prompt to review the output. 10. Use the security key-manager key query command to display the key IDs of the authentication keys that are stored o...

Page 27: ...ut should display the caching module status as erased. You must replace the failed component with a replacement FRU component you received from your provider. Step 1: Shut down the impaired controller. You can shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration. Option 1: Most configurations. To shut down the impaired node, you m...

Page 28: ... cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy node shows false for eligibility and health, you must correct the issue before shutting down the impaired node; see the Administration overview with the CLI. If you have a MetroCluster configuration, you must have confirmed that the MetroCluster Configuration State is configured and that the nodes are i...

Page 29: ...on of Administration overview with the CLI. You must leave the power supplies turned on at the end of this procedure to provide power to the healthy node. Steps 1. Check the MetroCluster status to determine whether the impaired node has automatically switched over to the healthy node: metrocluster show 2. Depending on whether an automatic switchover has occurred, proceed according to the following table...

Page 30: ...eck the state of the aggregates by using the storage aggregate show command. controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols Nodes RAID Status aggr_b2 227.1GB 227.1GB 0% online 0 mcc1-a2 raid_dp, mirrored, normal 6. Heal the root aggregates by using the metrocluster heal -phase root-aggregates command. mcc1A::> metrocluster heal -phase root-aggregates [Job 137] Job succeeded: Heal...

Page 31: ...roller module. 1. If you are not already grounded, properly ground yourself. 2. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the system cables and SFPs (if needed) from the controller module, keeping track of where the cables were connected. Leave the cables in the cable management device so that when you reinstall the cable management device, the cables a...

Page 32: ... follow the specific sequence of steps. Your storage system must meet certain criteria depending on your situation: It must have the appropriate operating system for the caching module you are installing. It must support the caching capacity. All other components in the storage system must be functioning properly; if not, you must contact technical support. 1. Locate the caching module at the rear of the ...

Page 33: ...to the socket. 5. Reseat and push the heatsink down to engage the locking button on the caching module housing. 6. Repeat the steps if you have a second caching module. Close the controller module cover, as needed. Step 4: Reinstall the controller. After you replace a component within the controller module, you must reinstall the controller module in the system chassis and boot it to a state where you can r...

Page 34: ... the boot process when you see the message Press Ctrl-C for Boot Menu. f. Select the option to boot to Maintenance mode from the displayed menu. Step 5: Run system-level diagnostics. After installing a new caching module, you should run diagnostics. Your system must be at the LOADER prompt to start System Level Diagnostics. All commands in the diagnostic procedures are issued from the node where the compo...

Page 35: ...isplayed: SLDIAG: No log messages are present. c. Exit Maintenance mode: halt. The node displays the LOADER prompt. d. Boot the node from the LOADER prompt: bye e. Return the node to normal operation. Table (If your node is in / Then): An HA pair: Perform a giveback: storage failover giveback -ofnode replacement_node_name. If you disabled automatic giveback, re-enable it with the storage failover modify command. A two node ...

Page 36: ... by pressing Ctrl-C when prompted to get to the Boot menu: If you have two controller modules in the chassis, fully seat the controller module you are servicing in the chassis. The controller module boots up when fully seated. If you have one controller module in the chassis, connect the power supplies, and then turn them on. e. Select Boot to maintenance mode from the menu. f. Exit Maintenance mode by ente...

Page 37: ...chback command from any node in the surviving cluster. 5. Verify that the switchback operation has completed: metrocluster show. The switchback operation is still running when a cluster is in the waiting-for-switchback state: cluster_B::> metrocluster show Cluster Configuration State Mode Local: cluster_B configured switchover Remote: cluster_A configured waiting-for-switchback. The switchback operation is c...
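
The "still running" check described above can be sketched as a text scan over captured `metrocluster show` output. This is a minimal sketch: the function name is hypothetical, the sample output layout is reconstructed from the flattened example in this excerpt, and only the waiting-for-switchback condition stated in the text is tested.

```python
def switchback_in_progress(metrocluster_show_output: str) -> bool:
    """Return True if any cluster line reports waiting-for-switchback."""
    for line in metrocluster_show_output.splitlines():
        # Normalize hyphens so "waiting-for-switchback" and
        # "waiting for switchback" are treated the same way.
        normalized = line.replace("-", " ").lower()
        if "waiting for switchback" in normalized:
            return True
    return False


# Sample text laid out after the example in the excerpt (assumed format).
SAMPLE_OUTPUT = """\
Cluster              Configuration State    Mode
Local: cluster_B     configured             switchover
Remote: cluster_A    configured             waiting-for-switchback
"""

if __name__ == "__main__":
    print(switchback_in_progress(SAMPLE_OUTPUT))
```

A False result only means that no line reports waiting-for-switchback; it does not by itself confirm that the switchback completed.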

Page 38: ...n that you are moving the controller module or modules to the new chassis, and that the chassis is a new component from NetApp. This procedure is disruptive. For a two-node cluster, you will have a complete service outage, and a partial outage in a multi-node cluster. Shut down the controllers - FAS8200. Option 1: Most configurations. You must shut down the node or nodes in the chassis prior to moving them t...

Page 39: ...this procedure. If repeated attempts to cleanly shut down the node fail, be aware that you might lose any data that was not saved to disk. 3. Where applicable, halt the second node to avoid a possible quorum error message in an HA pair configuration: system node halt -node second_node_name -ignore-quorum-warnings true. Option 2: Controller is in a two-node MetroCluster configuration. To shut down the impaire...

Page 40: ...Job succeeded: Heal Aggregates is successful. If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation. 4. Verify that the operation has been completed by using the metrocluster operation show command: controller_A_1::> metrocluster op...

Page 41: ...rors: - 8. On the impaired controller module, disconnect the power supplies. Move and replace hardware - FAS8200. Step 1: Move a power supply. Moving out a power supply when replacing a chassis involves turning off, disconnecting, and removing the power supply from the old chassis and installing and connecting it on the replacement chassis. 1. If you are not already grounded, properly ground yourself. 2. Turn off t...

Page 42: ...[Figure callouts: Power supply; Cam handle release latch; Power and Fault LEDs; Cam handle] ...

Page 43: ...at it all the way into the chassis, and then push the cam handle to the closed position, making sure that the cam handle release latch clicks into its locked position. 8. Reconnect the power cable and secure it to the power supply using the power cable locking mechanism. Only connect the power cable to the power supply. Do not connect the power cable to a power source at this time. Step 2: Move a fan. Movi...

Page 44: ...ing fan modules. 6. Insert the fan module into the replacement chassis by aligning it with the opening, and then sliding it into the chassis. 7. Push firmly on the fan module cam handle so that it is seated all the way into the chassis. The cam handle raises slightly when the fan module is completely seated. 8. Swing the cam handle up to its closed position, making sure that the cam handle release latch cl...

Page 45: ...o that when you reinstall the cable management device, the cables are organized. 2. Remove and set aside the cable management devices from the left and right sides of the controller module. 3. Loosen the thumbscrew on the cam handle on the controller module. [Figure callouts: Thumbscrew; Cam handle] 4. Pull the cam handle downward and begin to slide the controller module out of the chassis. Make sure that you support the bot...

Page 46: ...re the front of the chassis to the equipment rack or system cabinet, using the screws you removed from the old chassis. 7. If you have not already done so, install the bezel. Step 4: Install the controller. After you install the controller module and any other components into the new chassis, boot it to a state where you can run the interconnect diagnostic test. For HA pairs with two controller modules in ...

Page 47: ...is fully seated, and then close the cam handle to the locked position. Tighten the thumbscrew on the cam handle on the back of the controller module. Do not use excessive force when sliding the controller module into the chassis, to avoid damaging the connectors. b. If you have not already done so, reinstall the cable management device. c. Bind the cables to the cable management device with the hook and loop s...

Page 48: ...Go to Completing the replacement process. An HA pair with a second controller module: Exit Maintenance mode: halt. The LOADER prompt appears. Step 2: Run system-level diagnostics. After installing a new chassis, you should run interconnect diagnostics. Your system must be at the LOADER prompt to start System Level Diagnostics. All commands in the diagnostic procedures are issued from the node where the comp...

Page 49: ...g device status -dev interconnect -long -state failed. System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of failures resulting from testing the component. 7. Proceed based on the result of the preceding step. If the system-level diagnostics tests were completed without any failures, then: a. Clear the status logs: sldiag device clearstatus b. Verify that...

Page 50: ...disconnect the power supplies. c. Verify that you have observed all of the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system. d. Reconnect the power supplies, and then power on the storage system. e. Rerun the system-level diagnostics test. Step 3: Switch back aggregates in a two-node M...

Page 51: ...surviving cluster. 5. Verify that the switchback operation has completed: metrocluster show. The switchback operation is still running when a cluster is in the waiting-for-switchback state: cluster_B::> metrocluster show Cluster Configuration State Mode Local: cluster_B configured switchover Remote: cluster_A configured waiting-for-switchback. The switchback operation is complete when the clusters are in the...

Page 52: ...luster configuration is the same as that in an HA pair. No MetroCluster-specific steps are required because the failure is restricted to an HA pair and storage failover commands can be used to provide nondisruptive operation during the replacement. This procedure includes steps for automatically or manually reassigning drives to the replacement node, depending on your system's configuration. You shoul...

Page 53: ...dministration overview with the CLI. Steps 1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=_number_of_hours_down_h. The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h 2. If the impaired node is...
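
To show how the maintenance-window value maps onto the command, here is a small sketch that formats the AutoSupport line for a given number of hours. The function name is hypothetical, and the `MAINT=<hours>h` spelling with `-node *` is assumed from the two-hour example in this excerpt, whose punctuation was lost in extraction.

```python
def maint_autosupport_command(hours_down: int) -> str:
    """Format the AutoSupport command that suppresses case creation.

    Assumes the "MAINT=<hours>h" message format; verify the exact
    syntax against the current ONTAP documentation.
    """
    if not isinstance(hours_down, int) or hours_down <= 0:
        raise ValueError("hours_down must be a positive whole number")
    return (
        "system node autosupport invoke -node * -type all "
        f"-message MAINT={hours_down}h"
    )


if __name__ == "__main__":
    # Two hours, matching the example given in the procedure.
    print(maint_autosupport_command(2))
```

Rejecting zero or negative values avoids generating a message whose meaning the excerpt does not define.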

Page 54: ...l -message MAINT=2h 2. Disable automatic giveback from the console of the healthy node: storage failover modify -node local -auto-giveback false 3. Take the impaired node to the LOADER prompt. Table (If the impaired node is displaying / Then): The LOADER prompt: Go to the next step. Waiting for giveback: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or h...

Page 55: ...again. If you are unable to resolve the issue, contact technical support. 3. Resynchronize the data aggregates by running the metrocluster heal -phase aggregates command from the surviving cluster. controller_A_1::> metrocluster heal -phase aggregates [Job 130] Job succeeded: Heal Aggregates is successful. If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the override ...

Page 56: ...ster. mcc1A::> metrocluster operation show Operation: heal-root-aggregates State: successful Start Time: 7/29/2016 20:54:41 End Time: 7/29/2016 20:54:42 Errors: - 8. On the impaired controller module, disconnect the power supplies. Move the controller module hardware - FAS8200. To replace the controller module hardware, you must remove the impaired node, move FRU components to the replacement controller module, insta...
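
The verification step above amounts to reading the Operation and State fields from the command output. The sketch below parses captured text into a dictionary and checks for a successful heal. It assumes a "Field: value" layout with one field per line, which is a reconstruction of the flattened example; the function names are hypothetical.

```python
from typing import Dict


def parse_operation_show(output: str) -> Dict[str, str]:
    """Split "Field: value" lines into a dictionary (assumed layout)."""
    fields: Dict[str, str] = {}
    for line in output.splitlines():
        # Split on the first colon only, so times like 20:54:41 survive.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def heal_succeeded(output: str, phase: str = "heal-root-aggregates") -> bool:
    """True when the named heal phase is reported with State successful."""
    fields = parse_operation_show(output)
    return fields.get("Operation") == phase and fields.get("State") == "successful"


# Sample text reconstructed from the excerpt's example values.
SAMPLE = """\
  Operation: heal-root-aggregates
      State: successful
 Start Time: 7/29/2016 20:54:41
   End Time: 7/29/2016 20:54:42
"""

if __name__ == "__main__":
    print(heal_succeeded(SAMPLE))
```

Checking the Operation name as well as the State guards against reading the result of an earlier, different operation.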

Page 57: ...dule. 5. Loosen the thumbscrew on the cam handle on the controller module. [Figure callouts: Thumbscrew; Cam handle] 6. Pull the cam handle downward and begin to slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. Step 2: Move the boot device. You must locate the boot media and follow the directions to remove it from the old contro...

Page 58: ...the edges of the boot media with the socket housing, and then gently push it into the socket. 4. Check the boot media to make sure that it is seated squarely and completely in the socket. If necessary, remove the boot media and reseat it into the socket. 5. Push the boot media down to engage the locking button on the boot media housing. Step 3: Move the NVMEM battery. To move the NVMEM battery from the old ...

Page 59: ...ging contents to the flash memory when you halt the system. After the destage is complete, the LED turns off. If power is lost without a clean shutdown, the NVMEM LED flashes until the destage is complete, and then the LED turns off. If the LED is on and power is on, unwritten data is stored on NVMEM. This typically occurs during an uncontrolled shutdown after ONTAP has successfully booted. 2. Open the CPU ...

Page 60: ... steps. 1. Locate the DIMMs on your controller module. 2. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement controller module in the proper orientation. 3. Eject the DIMM from its slot by slowly pushing apart the two DIMM ejector tabs on either side of the DIMM, and then slide the DIMM out of the slot. Carefully hold the DIMM by the edges to avoid pressure o...

Page 61: ...teps for the remaining DIMMs. 7. Move the NVMEM battery to the replacement controller module. 8. Align the tab or tabs on the battery holder with the notches in the controller module side, and then gently push down on the battery housing until the battery housing clicks into place. Step 5: Move a PCIe card. To move PCIe cards, locate and move them from the old controller into the replacement controller and...

Page 62: ...peat the preceding step for the remaining PCIe cards in the old controller module. 5. Open the new controller module side panel, if necessary, slide off the PCIe card filler plate, as needed, and carefully install the PCIe card. Be sure that you properly align the card in the slot and exert even pressure on the card when seating it in the socket. The card must be fully and evenly seated in the slot. 6. Repe...

Page 63: ...the heatsink. The storage system comes with two slots available for the caching module, and only one slot is occupied by default. 2. Move the caching module to the new controller module, and then align the edges of the caching module with the socket housing and gently push it into the socket. 3. Verify that the caching module is seated squarely and completely in the socket. If necessary, remove the caching...

Page 64: ...pt the boot process, which you can typically do at any time after prompted to do so. However, if the system updates the system firmware when it boots, you must wait until after the update is complete before interrupting the boot process. 1. If you are not already grounded, properly ground yourself. 2. If you have not already done so, close the CPU air duct. 3. Align the end of the controller module with the o...

Page 65: ...ce when sliding the controller module into the chassis, to avoid damaging the connectors. The controller begins to boot as soon as it is seated in the chassis. b. If you have not already done so, reinstall the cable management device. c. Bind the cables to the cable management device with the hook and loop strap. d. When you see the message Press Ctrl-C for Boot Menu, press Ctrl-C to interrupt the boot proc...

Page 66: ...to Maintenance mode. e. From the boot menu, select the option for Maintenance mode. Important: During the boot process, you might see the following prompts: A prompt warning of a system ID mismatch and asking to override the system ID. A prompt warning that when entering Maintenance mode in an HA configuration you must ensure that the healthy node remains down. You can safely respond y to these prompts. Res...

Page 67: ... set the HA state of the controller module. You must verify the HA state of the controller module and, if necessary, update the state to match your system configuration. 1. In Maintenance mode from the new controller module, verify that all components display the same HA state: ha-config show. The HA state should be the same for all components. 2. If the displayed system state of the controller module does ...
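
The rule stated above, that every component must report the same HA state and that the state must match the system configuration, can be expressed as two small checks. This is an illustrative sketch: the function names, the component names, and the state strings in the example are placeholders rather than values taken from this excerpt.

```python
from typing import List, Mapping


def ha_states_consistent(states: Mapping[str, str]) -> bool:
    """True when every component reports the same HA state."""
    return len(set(states.values())) <= 1


def components_not_matching(states: Mapping[str, str], expected: str) -> List[str]:
    """List the components whose state differs from the expected one."""
    return sorted(name for name, state in states.items() if state != expected)


if __name__ == "__main__":
    # Placeholder values, as an operator might transcribe them by hand.
    observed = {"Chassis": "ha", "Controller": "non-ha"}
    print(ha_states_consistent(observed))
    print(components_not_matching(observed, "ha"))
```

Reporting which components differ, rather than only a pass or fail result, shows where the state needs to be updated.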

Page 68: ... prompts until the Maintenance mode prompt appears. 3. Display and note the available devices on the controller module: sldiag device show -dev mb. The controller module devices and ports displayed can be any one or more of the following: bootmedia is the system booting device. cna is a Converged Network Adapter or interface not connected to a network or storage device. fcal is a Fibre Channel Arbitrated ...

Page 69: ...ests that you want to run: sldiag device modify -dev dev_name -selection only. -selection only disables all other tests that you do not want to run for the device. d. Run the selected tests: sldiag device run -dev dev_name. After the test is complete, the following message is displayed: <SLDIAG:_ALL_TESTS_COMPLETED> e. Verify that no tests failed: sldiag device status -dev dev_name -long -state failed. System-level d...

Page 70: ...want to run for the device. d. Verify that the tests were modified: sldiag device show e. Repeat these substeps for each device that you want to run concurrently. f. Run diagnostics on all of the devices: sldiag device run. Do not add to or modify your entries after you start running diagnostics. After the test is complete, the following message is displayed: <SLDIAG:_ALL_TESTS_COMPLETED> g. Verify that there a...

Page 71: ...all of the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system. d. Reconnect the power supplies, and then power on the storage system. e. Rerun the system-level diagnostics test. Recable the system and reassign disks - FAS8200. Continue the replacement procedure by recabling the storage a...

Page 72: ...d go to the LOADER prompt: halt 2. From the LOADER prompt on the replacement node, boot the node, entering y if you are prompted to override the system ID due to a system ID mismatch: boot_ontap 3. Wait until the Waiting for giveback message is displayed on the replacement node console and then, from the healthy node, verify that the new partner system ID has been automatically assigned: storage failover s...

Page 73: ...guration Guide for your version of ONTAP 9. b. After the giveback has been completed, confirm that the HA pair is healthy and that takeover is possible: storage failover show. The output from the storage failover show command should not include the System ID changed on partner message. 6. Verify that the disks were assigned correctly: storage disk show -ownership. The disks belonging to the replacement node...

Page 74: ...must enter Y when prompted to override the system ID due to a system ID mismatch. 2. View the old system IDs from the healthy node: metrocluster node show -fields node-systemid,dr-partner-systemid. In this example, the Node_B_1 is the old node, with the old system ID of 118073209: dr-group-id cluster node node-systemid dr-partner-systemid 1 Cluster_A Node_A_1 536872914 118073209 1 Cluster_B Node_B_1 11807...

Page 75: ...hen prompted to continue into advanced mode. The advanced mode prompt appears. b. Verify that the coredumps are saved: system node run -node local-node-name partner savecore. If the command output indicates that savecore is in progress, wait for savecore to complete before issuing the giveback. You can monitor the progress of the savecore using the system node run -node local-node-name partner savecore -s c...

Page 76: ... any issues discovered. 12. Simulate a switchover operation: a. From any node's prompt, change to the advanced privilege level: set -privilege advanced. You need to respond with y when prompted to continue into advanced mode and see the advanced mode prompt. b. Perform the switchover operation with the -simulate parameter: metrocluster switchover -simulate c. Return to the admin privilege level: set -privilege ad...

Page 77: ...enses: license clean-up -unused. Step 2: Restore Storage and Volume Encryption functionality. After replacing the controller module or NVRAM module for a storage system that you previously configured to use Storage or Volume Encryption, you must perform additional steps to provide uninterrupted Encryption functionality. You can skip this task on storage systems that do not have Storage or Volume Encrypti...

Page 78: ...al disk pools. This task only applies to two-node MetroCluster configurations. Steps 1. Verify that all nodes are in the enabled state: metrocluster node show. cluster_B::> metrocluster node show DR Configuration DR Group Cluster Node State Mirroring Mode 1 cluster_A controller_A_1 configured enabled heal roots completed cluster_B controller_B_1 configured enabled waiting for switchback recovery 2 entries...

Page 79: ...l support at NetApp Support, 888 463 8277 (North America), 00 800 44 638277 (Europe), or 800 800 80 800 (Asia Pacific), if you need the RMA number or additional help with the replacement procedure. Replace a DIMM FAS8200: You must replace a DIMM in the controller module when your system registers an increasing number of correctable error correction codes (ECC); failure to do so causes a system panic. All other co...

Page 80: ...trl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node. For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback, press Ctrl-C, and then respond y. Option 2: Controller is in a MetroCluster. Do not use this procedure if y...

Page 81: ...ailover takeover ofnode impaired_node_name When the impaired node shows Waiting for giveback press Ctrl C and then respond y Option 3 Controller is in a two node MetroCluster To shut down the impaired node you must determine the status of the node and if necessary switch over the node so that the healthy node continues to serve data from the impaired node storage About this task If you are using N...

Page 82: ... is successful. If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation. 4. Verify that the operation has been completed by using the metrocluster operation show command: controller_A_1::> metrocluster operation show Operation: heal-a...

Page 83: ...ired controller module disconnect the power supplies Step 2 Open the controller module To access components inside the controller you must first remove the controller module from the system and then remove the cover on the controller module 1 If you are not already grounded properly ground yourself 2 Loosen the hook and loop strap binding the cables to the cable management device and then unplug t...

Page 84: ...ED is located on the back of the controller module Look for the following icon 2 If the NVMEM LED is not flashing there is no content in the NVMEM you can skip the following steps and proceed to the next task in this procedure 3 Unplug the battery The NVMEM LED blinks while destaging contents to the flash memory when you halt the system After the destage is complete the LED turns off If power is l...

Page 85: ...the NVMEM LED on the controller module. 5. Locate the DIMMs on your controller module. Each system memory DIMM has an LED located on the board next to its DIMM slot. The LED for the faulty DIMM blinks every two seconds. 6. Note the orientation of the DIMM in the socket so that you can insert the replacement DIMM in the proper orientation. 7. Eject the DIMM from its slot by slowly pushing apart the two DIMM ej...

Page 86: ... The notch among the pins on the DIMM should line up with the tab in the socket 9 Make sure that the DIMM ejector tabs on the connector are in the open position and then insert the DIMM squarely into the slot The DIMM fits tightly in the slot but should go in easily If not realign the DIMM with the slot and reinsert it Visually inspect the DIMM to verify that it is evenly aligned and fully inserte...

Page 87: ...s fully seated in the chassis Be prepared to interrupt the boot process a With the cam handle in the open position firmly push the controller module in until it meets the midplane and is fully seated and then close the cam handle to the locked position Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors b Tighten the thumbscrew on the cam...

Page 88: ...rdware problems resulted from the replacement of the DIMMs: sldiag device status -dev mem -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of failures resulting from testing the component. 5. Proceed based on the result of the preceding step. If the system-level diagnostics tests... Then... Were completed without any failures: a. Clear ...

Page 89: ...ntified for running system level diagnostics that cables are securely connected and that hardware components are properly installed in the storage system d Boot the controller module you are servicing interrupting the boot by pressing Ctrl C when prompted to get to the Boot menu If you have two controller modules in the chassis fully seat the controller module you are servicing in the chassis The ...

Page 90: ...nchronization is complete on all SVMs metrocluster vserver show 3 Verify that any automatic LIF migrations being performed by the healing operations were completed successfully metrocluster check lif show 4 Perform the switchback by using the metrocluster switchback command from any node in the surviving cluster 5 Verify that the switchback operation has completed metrocluster show The switchback ...

Page 91: ...with the replacement procedure Swap out a fan FAS8200 To swap out a fan module without interrupting service you must perform a specific sequence of tasks You must replace the fan module within two minutes of removing it from the chassis System airflow is disrupted and the controller module or modules shut down after two minutes to avoid overheating 1 If you are not already grounded properly ground...

Page 92: ...odule aside 7 Insert the replacement fan module into the chassis by aligning it with the opening and then sliding it into the chassis 8 Push firmly on the fan module cam handle so that it is seated all the way into the chassis The cam handle raises slightly when the fan module is completely seated 9 Swing the cam handle up to its closed position making sure that the cam handle release latch clicks...

Page 93: ...t determine the status of the node and if necessary take over the node so that the healthy node continues to serve data from the impaired node storage About this task If you have a cluster with more than two nodes it must be in quorum If the cluster is not in quorum or a healthy node shows false for eligibility and health you must correct the issue before shutting down the impaired node see the Ad...

Page 94: ...ligibility and health you must correct the issue before shutting down the impaired node see the Administration overview with the CLI If you have a MetroCluster configuration you must have confirmed that the MetroCluster Configuration State is configured and that the nodes are in an enabled and normal state metrocluster node show Steps 1 If AutoSupport is enabled suppress automatic case creation by...

Page 95: ... the CLI You must leave the power supplies turned on at the end of this procedure to provide power to the healthy node Steps 1 Check the MetroCluster status to determine whether the impaired node has automatically switched over to the healthy node metrocluster show 2 Depending on whether an automatic switchover has occurred proceed according to the following table If the impaired node Then Has aut...

Page 96: ...eck the state of the aggregates by using the storage aggregate show command: controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols Nodes RAID Status; aggr_b2 227.1GB 227.1GB 0% online 0 mcc1-a2 raid_dp, mirrored, normal 6. Heal the root aggregates by using the metrocluster heal -phase root-aggregates command: mcc1A::> metrocluster heal -phase root-aggregates [Job 137] Job succeeded: Heal...

Page 97: ...roller module 1 If you are not already grounded properly ground yourself 2 Loosen the hook and loop strap binding the cables to the cable management device and then unplug the system cables and SFPs if needed from the controller module keeping track of where the cables were connected Leave the cables in the cable management device so that when you reinstall the cable management device the cables a...

Page 98: ...ration go to the next step If your system is in a stand alone configuration cleanly shut down the controller module and then check the NVRAM LED identified by the NV icon The NVRAM LED blinks while destaging contents to the flash memory when you halt the system After the destage is complete the LED turns off If power is lost without a clean shutdown the NVMEM LED flashes until the destage is compl...

Page 99: ...abs on the battery holder with the notches in the controller module side and then gently push down on the battery housing until the battery housing clicks into place 6 Close the CPU air duct Make sure that the plug locks down to the socket Step 4 Reinstall the controller After you replace a component within the controller module you must reinstall the controller module in the system chassis and bo...

Page 100: ...ne so, reinstall the cable management device. d. Bind the cables to the cable management device with the hook and loop strap. e. As each node starts booting, press Ctrl-C to interrupt the boot process when you see the message Press Ctrl-C for Boot Menu. f. Select the option to boot to Maintenance mode from the displayed menu. Step 5: Run system-level diagnostics. After installing a new NVMEM battery you ...

Page 101: ... failures: a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c. Exit Maintenance mode: halt The node displays the LOADER prompt. d. Boot the node from the LOADER prompt: bye e. Return the node to normal operation. If your node is in... Then... An HA pair: Perform a give back: stora...
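
The "verify that the log was cleared" step can be sketched in Python for anyone who captures console output to a file. This is a minimal sketch under stated assumptions, not a NetApp utility: the function name sldiag_log_is_clear is hypothetical, and the only thing it relies on is the default response quoted in the step above. It ignores punctuation and case, so it matches the message with or without the colon and trailing period.

```python
import re


def sldiag_log_is_clear(status_output: str) -> bool:
    """Return True if captured 'sldiag device status' output contains the
    default 'no log messages' response (hypothetical helper)."""
    # Replace anything that is not a letter, digit, or space with a space,
    # then collapse whitespace so formatting differences do not matter.
    words = re.sub(r"[^a-z0-9 ]+", " ", status_output.lower()).split()
    return "sldiag no log messages are present" in " ".join(words)


# Example with the response text quoted in the procedure.
print(sldiag_log_is_clear("SLDIAG: No log messages are present."))
```

Any other output means the log still holds entries and the clearstatus step should be repeated before leaving Maintenance mode.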

Page 102: ... by pressing Ctrl C when prompted to get to the Boot menu If you have two controller modules in the chassis fully seat the controller module you are servicing in the chassis The controller module boots up when fully seated If you have one controller module in the chassis connect the power supplies and then turn them on e Select Boot to maintenance mode from the menu f Exit Maintenance mode by ente...

Page 103: ...chback command from any node in the surviving cluster 5 Verify that the switchback operation has completed metrocluster show The switchback operation is still running when a cluster is in the waiting for switchback state cluster_B metrocluster show Cluster Configuration State Mode Local cluster_B configured switchover Remote cluster_A configured waiting for switchback The switchback operation is c...
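
The switchback check described above can be sketched in Python for anyone scripting the verification. This is a minimal sketch, not a NetApp tool: the function name switchback_in_progress and the sample text are hypothetical, and the sample assumes that the Mode column prints the state as waiting-for-switchback. The function reports only the in-progress condition that the text states.

```python
def switchback_in_progress(metrocluster_show_output: str) -> bool:
    """Return True while any cluster row reports the waiting-for-switchback
    mode (hypothetical helper; assumes plain-text 'metrocluster show' output)."""
    # Treat hyphens as spaces so both "waiting-for-switchback" and
    # "waiting for switchback" are recognized.
    normalized = metrocluster_show_output.lower().replace("-", " ")
    return "waiting for switchback" in normalized


# Sample modeled on the example in the text; column layout is an assumption.
sample = """\
Cluster              Configuration State    Mode
Local: cluster_B     configured             switchover
Remote: cluster_A    configured             waiting-for-switchback
"""
print(switchback_in_progress(sample))
```

A script would rerun metrocluster show and repeat this check until it returns False, then confirm the final state by reading the output directly.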

Page 104: ... To shut down the impaired node you must determine the status of the node and if necessary take over the node so that the healthy node continues to serve data from the impaired node storage About this task If you have a cluster with more than two nodes it must be in quorum If the cluster is not in quorum or a healthy node shows false for eligibility and health you must correct the issue before shu...

Page 105: ... a healthy node shows false for eligibility and health you must correct the issue before shutting down the impaired node see the Administration overview with the CLI If you have a MetroCluster configuration you must have confirmed that the MetroCluster Configuration State is configured and that the nodes are in an enabled and normal state metrocluster node show Steps 1 If AutoSupport is enabled su...

Page 106: ... the CLI You must leave the power supplies turned on at the end of this procedure to provide power to the healthy node Steps 1 Check the MetroCluster status to determine whether the impaired node has automatically switched over to the healthy node metrocluster show 2 Depending on whether an automatic switchover has occurred proceed according to the following table If the impaired node Then Has aut...

Page 107: ...eck the state of the aggregates by using the storage aggregate show command: controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols Nodes RAID Status; aggr_b2 227.1GB 227.1GB 0% online 0 mcc1-a2 raid_dp, mirrored, normal 6. Heal the root aggregates by using the metrocluster heal -phase root-aggregates command: mcc1A::> metrocluster heal -phase root-aggregates [Job 137] Job succeeded: Heal...

Page 108: ...oller module 1 If you are not already grounded properly ground yourself 2 Loosen the hook and loop strap binding the cables to the cable management device and then unplug the system cables and SFPs if needed from the controller module keeping track of where the cables were connected Leave the cables in the cable management device so that when you reinstall the cable management device the cables ar...

Page 109: ...upport the bottom of the controller module as you slide it out of the chassis. Step 3: Replace a PCIe card. To replace a PCIe card, locate it within the controller and follow the specific sequence of steps. 1. Loosen the thumbscrew on the controller module side panel. 2. Swing the side panel off the controller module. [Figure callouts: side panel, PCIe card] ...

Page 110: ...o the system Do not completely insert the controller module in the chassis until instructed to do so 2 Recable the system as needed If you removed the media converters QSFPs or SFPs remember to reinstall them if you are using fiber optic cables 3 Complete the reinstallation of the controller module The controller module begins to boot as soon as it is fully seated in the chassis If your system is ...

Page 111: ...rts convert these ports to 10 GbE connections by using the nicadmin convert command from Maintenance mode Be sure to exit Maintenance mode after completing the conversion 5 Return the node to normal operation If your system is in Issue this command from the partner s console An HA pair storage failover giveback ofnode impaired_node_name A two node MetroCluster configuration Proceed to the next ste...

Page 112: ...surviving cluster 5 Verify that the switchback operation has completed metrocluster show The switchback operation is still running when a cluster is in the waiting for switchback state cluster_B metrocluster show Cluster Configuration State Mode Local cluster_B configured switchover Remote cluster_A configured waiting for switchback The switchback operation is complete when the clusters are in the...

Page 113: ...acing one power supply at a time It is a best practice to replace the power supply within two minutes of removing it from the chassis The system continues to function but ONTAP sends messages to the console about the degraded power supply until the power supply is replaced The number of power supplies in the system depends on the model Power supplies are auto ranging 1 Identify the power supply yo...

Page 114: ...[Figure callouts: power supply, cam handle release latch, Power and Fault LEDs, cam handle] ...

Page 115: ... cable retainer Once power is restored to the power supply the status LED should be green 1 Turn on the power to the new power supply and then verify the operation of the power supply activity LEDs The power supply LEDs are lit when the power supply comes online 2 After you replace the part you can return the failed part to NetApp as described in the RMA instructions shipped with the kit Contact t...

Page 116: ...failover modify -node local -auto-giveback false 3. Take the impaired node to the LOADER prompt. If the impaired node is displaying... Then... The LOADER prompt: Go to the next step. Waiting for giveback: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node. For an HA pair, take over the impaired node from the healthy node: storag...
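
The "If the impaired node is displaying / Then" table above can be sketched as a small Python lookup, which may help when turning the procedure into a runbook. This is an illustrative sketch only: the names NEXT_ACTION and next_action are hypothetical, and the entries simply restate the three rows of the table.

```python
# Console state shown by the impaired node -> action from the table above.
NEXT_ACTION = {
    "loader prompt": "Go to the next step.",
    "waiting for giveback": "Press Ctrl-C, and then respond y when prompted.",
    "system prompt or password prompt": "Take over or halt the impaired node.",
}


def next_action(displayed: str) -> str:
    """Return the table's action for the given console state (hypothetical helper)."""
    # Normalize case and whitespace before the lookup.
    key = " ".join(displayed.lower().split())
    if key not in NEXT_ACTION:
        raise ValueError(f"unrecognized console state: {displayed!r}")
    return NEXT_ACTION[key]


print(next_action("Waiting for giveback"))
```

Raising an error for an unrecognized state, rather than guessing, keeps the operator from proceeding on a console message the table does not cover.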

Page 117: ...ord prompt enter system password Take over or halt the impaired node For an HA pair take over the impaired node from the healthy node storage failover takeover ofnode impaired_node_name When the impaired node shows Waiting for giveback press Ctrl C and then respond y Option 3 Controller is in a two node MetroCluster To shut down the impaired node you must determine the status of the node and if ne...

Page 118: ...ob succeeded Heal Aggregates is successful If the healing is vetoed you have the option of reissuing the metrocluster heal command with the override vetoes parameter If you use this optional parameter the system overrides any soft vetoes that prevent the healing operation 4 Verify that the operation has been completed by using the metrocluster operation show command controller_A_1 metrocluster ope...

Page 119: ...red controller module disconnect the power supplies Step 2 Open the controller module To access components inside the controller you must first remove the controller module from the system and then remove the cover on the controller module 1 If you are not already grounded properly ground yourself 2 Loosen the hook and loop strap binding the cables to the cable management device and then unplug th...

Page 120: ...ke sure that you support the bottom of the controller module as you slide it out of the chassis. Step 3: Replace the RTC battery. To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps. 1. If you are not already grounded, properly ground yourself. 2. Locate the RTC battery. ...

Page 121: ...atic shipping bag. 5. Locate the empty battery holder in the controller module. 6. Note the polarity of the RTC battery, and then insert it into the holder by tilting the battery at an angle and pushing down. 7. Visually inspect the battery to make sure that it is completely installed into the holder and that the polarity is correct. Step 4: Reinstall the controller module and set the time and date after RTC b...

Page 122: ... reinstall the cable management device c Bind the cables to the cable management device with the hook and loop strap d Reconnect the power cables to the power supplies and to the power sources and then turn on the power to start the boot process e Halt the controller at the LOADER prompt 6 Reset the time and date on the controller a Check the date and time on the healthy node with the show date co...

Page 123: ... 2 entries were displayed 2 Verify that resynchronization is complete on all SVMs metrocluster vserver show 3 Verify that any automatic LIF migrations being performed by the healing operations were completed successfully metrocluster check lif show 4 Perform the switchback by using the metrocluster switchback command from any node in the surviving cluster 5 Verify that the switchback operation has...

Page 124: ...eplication resync status show command. 6. Reestablish any SnapMirror or SnapVault configurations. Step 6: Return the failed part to NetApp. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888 463 8277 (North America), 00 800 44 638277 (Europe), or 800 800 80 800 (Asia Pacific), if you need t...

Page 125: ...Y WHETHER IN CONTRACT STRICT LIABILITY OR TORT INCLUDING NEGLIGENCE OR OTHERWISE ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE NetApp reserves the right to change any products described herein at any time and without notice NetApp assumes no responsibility or liability arising from the use of products described herein except as expressly agree...
