1. Verify that the GPU is seated in the PCIe slot correctly by reseating the GPU. Then power cycle
the system.
2. Verify that the power connectors to the GPU are connected firmly. Then power cycle the system.
3. Run nvidia-smi -q. In some cases, this will report a poorly connected power cable.
4. Rerun the diagnostics, using the same GPU, on a system that is known to be working. A variety of
system issues can cause diagnostic failure.
5. If the problem remains, contact your IBM technical-support representative.
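The nvidia-smi check in step 3 can be scripted when several systems must be checked. The following Python code is a minimal sketch, not part of DSA: the function names query_gpu_report and power_lines are hypothetical, and the sketch assumes that any power-cable warning appears on a report line that contains the word "power". Review the full report if nothing is found.

```python
import shutil
import subprocess
from typing import List, Optional


def query_gpu_report(timeout_s: int = 30) -> Optional[str]:
    """Return the text printed by `nvidia-smi -q`, or None if it cannot run."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # the utility is not installed or not on PATH
    try:
        result = subprocess.run(
            [exe, "-q"],
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=False,
        )
    except subprocess.TimeoutExpired:
        return None  # the query did not finish in time
    # Treat a non-zero exit status as "no usable report".
    return result.stdout if result.returncode == 0 else None


def power_lines(report: str) -> List[str]:
    """Keep only the report lines that mention power (assumed filter)."""
    return [ln.strip() for ln in report.splitlines() if "power" in ln.lower()]


if __name__ == "__main__":
    report = query_gpu_report()
    if report is None:
        print("nvidia-smi is unavailable or returned an error")
    else:
        for line in power_lines(report):
            print(line)
```

If the script reports that nvidia-smi is unavailable, the GPU or its driver might not be detected at all, which is itself worth noting before you continue with step 4.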
Related links
– Lenovo Support website
– Latest level of DSA
• 409-905-000 : Nvidia::DiagnosticServiceProvider::Matrix Test Failed
Nvidia GPU Matrix Test Failed.
Recoverable
No
Severity
Error
Serviceable
Yes
Automatically notify support
No
User Response
Complete the following steps:
1. Verify that the GPU is seated in the PCIe slot correctly by reseating the GPU. Then power cycle
the system.
2. Verify that the power connectors to the GPU are connected firmly. Then power cycle the system.
3. Run nvidia-smi -q. In some cases, this will report a poorly connected power cable.
4. Rerun the diagnostics, using the same GPU, on a system that is known to be working. A variety of
system issues can cause diagnostic failure.
5. If the problem remains, contact your IBM technical-support representative.
Related links
– Lenovo Support website
– Latest level of DSA
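When many test results are collected from logs, it can help to split a result heading such as the one for 409-905-000 into its parts. The following Python code is a minimal sketch under stated assumptions: the DsaEvent type, the parse_heading function, and the field names code, provider, and message are hypothetical, and reading the text after the last "::" as the message is an interpretation of the headings shown here, not a documented format.

```python
import re
from typing import NamedTuple, Optional


class DsaEvent(NamedTuple):
    code: str      # for example "409-905-000"
    provider: str  # for example "Nvidia::DiagnosticServiceProvider"
    message: str   # for example "Matrix Test Failed"


# Three groups of three digits, a colon, then text split at the last "::".
_HEADING = re.compile(r"^(\d{3}-\d{3}-\d{3})\s*:\s*(.+)::(.+)$")


def parse_heading(line: str) -> Optional[DsaEvent]:
    """Return the parsed heading, or None if the line is not a heading."""
    match = _HEADING.match(line.strip())
    if match is None:
        return None
    code, provider, message = match.groups()
    return DsaEvent(code, provider.strip(), message.strip())


if __name__ == "__main__":
    heading = "409-905-000 : Nvidia::DiagnosticServiceProvider::Matrix Test Failed"
    print(parse_heading(heading))
```

Lines that do not start with a result code, such as "Related links", return None, so the function can be applied to every line of a saved report.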
• 409-906-000 : Nvidia::DiagnosticServiceProvider::Binomial Test Failed
Nvidia GPU Binomial Test Failed.
Recoverable
No
Severity
Error
Serviceable
Yes
Automatically notify support
No
User Response
Complete the following steps:
Appendix C. DSA diagnostic test results
Part Number: SP47A31725
Printed in China