NVIDIA DGX-1 | DU-08033-001 _v13.1 | www.nvidia.com
4.1.2. Viewing System Information ..... 29
4.1.3. Submitting BMC Log Files ..... 29
4.1.4. Determining Total Power Consumption ..... 29
4.1.5. Accessing the DGX-1 Console ..... 30
4.1.6. Powering Off / Power Cycling the System Remotely ..... 30
4.1.6.1. From the DGX-1 Console Window ..... 30
4.1.6.2. From the BMC UI ..... 30
4.2. Configuring a Static IP Address for the BMC ..... 31
4.2.1. Configuring a BMC Static IP Address Using ipmitool ..... 31
4.2.2. Configuring a BMC Static IP Address Using the System BIOS ..... 32
4.2.3. Configuring a BMC Static IP Address Using the BMC Dashboard ..... 36
4.3. Configuring Static IP Addresses for the Network Ports ..... 37
4.4. Obtaining MAC Addresses ..... 38
Chapter 5. Maintaining and Servicing the NVIDIA DGX-1 ..... 42
5.1. Problem Resolution and Customer Care ..... 42
5.2. Restoring the DGX-1 Software Image ..... 42
5.2.1. Obtaining the DGX-1 Software ISO Image and Checksum File ..... 43
5.2.2. Re-Imaging the System Remotely ..... 43
5.2.3. Creating a Bootable Installation Medium ..... 46
5.2.3.1. Creating a Bootable USB Flash Drive by Using the dd Command ..... 46
5.2.3.2. Creating a Bootable USB Flash Drive by Using Akeo Rufus ..... 47
5.2.4. Re-Imaging the System From a USB Flash Drive ..... 49
5.2.5. Retaining the RAID Partition While Installing the OS ..... 49
5.3. Updating the System BIOS ..... 50
5.4. Updating the BMC ..... 53
5.5. Replacing the System and Components ..... 55
5.5.1. Replacing the System ..... 56
5.5.2. Replacing an SSD ..... 56
5.5.3. Recreating the Virtual Drives ..... 57
5.5.3.1. Access the BIOS Setup Utility ..... 57
5.5.3.2. Clear the Drive Group Configuration ..... 60
5.5.3.3. Recreate the OS Virtual Drive ..... 64
5.5.3.4. Recreate the RAID0 Virtual Drive ..... 72
5.5.4. Recreating the RAID 0 Array ..... 84
5.5.5. Replacing the Power Supplies ..... 85
5.5.6. Replacing the Fan Module ..... 86
5.5.7. Replacing the DIMMs ..... 86
5.5.8. Replacing the InfiniBand Cards ..... 91
5.5.9. Setting Up the InfiniBand Cards ..... 95
Chapter 6. Installing Software on Air-Gapped NVIDIA DGX-1 Systems ..... 99
6.1. Installing NVIDIA DGX-1 Software ..... 99
6.1.1. Re-Imaging the System ..... 99
6.1.2. Creating a Local Mirror of the NVIDIA and Canonical Repositories ..... 100