Chapter 4. Continuous availability and manageability
Figure 4-3 shows this option using the ASMI.
Figure 4-3 ASMI Auto Power Restart setting panel
Partition availability priority
Partitions can also be assigned availability priorities. If an alternate
processor recovery event requires spare processor resources to protect a workload, and
no other means of obtaining the spare resources is available, the system determines
which partition has the lowest priority and attempts to claim the needed resource from it. On a
properly configured server, this allows that capacity to be obtained first
from, for example, a test partition instead of a financial accounting system.
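The selection rule described above can be illustrated with a minimal Python sketch. It is not the firmware's actual algorithm: the Partition class, the numeric priority scale (a larger number is assumed to mean a higher availability priority), the spare_cores field, and the choice to continue with the next-lowest partition when one partition cannot supply enough capacity are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    priority: int      # assumed scale: larger value = higher availability priority
    spare_cores: int   # hypothetical: cores this partition could give up


def claim_capacity(partitions, needed):
    """Claim `needed` cores, starting with the lowest-priority partition.

    Returns (claims, shortfall), where claims is a list of
    (partition name, cores taken) and shortfall is what could not be found.
    """
    claims = []
    remaining = needed
    for p in sorted(partitions, key=lambda p: p.priority):
        if remaining <= 0:
            break
        take = min(p.spare_cores, remaining)
        if take > 0:
            claims.append((p.name, take))
            remaining -= take
    return claims, remaining


partitions = [
    Partition("finance", priority=200, spare_cores=4),
    Partition("test", priority=10, spare_cores=2),
    Partition("dev", priority=50, spare_cores=2),
]

# One core is needed after a processor recovery event: the test partition pays.
print(claim_capacity(partitions, 1))
```

With these sample values, the financial partition is touched only after the test and development partitions have been exhausted.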
Cache availability
The L2 and L3 caches in the processor are protected with double-bit detect,
single-bit correct error correction code (ECC). In addition, the caches maintain a cache line
delete capability. A threshold of correctable errors detected on a cache line can result in
the data in the cache line being purged and the cache line removed from further operation
without requiring a reboot. An ECC uncorrectable error detected in the cache can also
trigger a purge and delete of the cache line. This results in no loss of operation if the cache
line contained data unmodified from what was stored in system memory. Modified data
would be handled through Special Uncorrectable Error handling. L1 data and instruction
caches also have a retry capability for intermittent errors and a cache set delete
mechanism for handling solid failures. The processors can also
dynamically substitute a faulty bit-line in an L3 cache dedicated to a
processor with a spare bit-line.
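The cache line delete policy described above can be sketched in Python as a simple per-line error counter. This is an illustrative model only, not the hardware implementation: the CacheLineMonitor class, its method names, the returned action strings, and the threshold value of 3 are assumptions, since the text does not state the actual threshold.

```python
from collections import Counter

CE_THRESHOLD = 3  # hypothetical value; the real threshold is not given in the text


class CacheLineMonitor:
    """Tracks ECC events per cache line and decides when a line is deleted."""

    def __init__(self, threshold=CE_THRESHOLD):
        self.threshold = threshold
        self.ce_counts = Counter()   # correctable-error count per line
        self.deleted = set()         # lines removed from further operation

    def report_correctable(self, line):
        """A single-bit error on `line` was corrected by ECC."""
        if line in self.deleted:
            return "ignored"
        self.ce_counts[line] += 1
        if self.ce_counts[line] >= self.threshold:
            # Threshold reached: purge the data and retire the line, no reboot.
            self.deleted.add(line)
            return "purge+delete"
        return "corrected"

    def report_uncorrectable(self, line, modified):
        """An uncorrectable error was detected on `line`."""
        if line in self.deleted:
            return "ignored"
        self.deleted.add(line)
        if modified:
            # The only valid copy was in the cache: hand off to
            # Special Uncorrectable Error handling.
            return "purge+delete, special-ue-handling"
        # An unmodified line can be refetched from system memory.
        return "purge+delete, refetch"


monitor = CacheLineMonitor()
for _ in range(CE_THRESHOLD):
    action = monitor.report_correctable(line=7)
print(action)  # the action taken once the threshold is reached
```

The two uncorrectable-error branches mirror the distinction in the text: an unmodified line causes no loss of operation, while a modified line is passed to Special Uncorrectable Error handling.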
Fault monitoring
Built-in self-test (BIST) checks processor, cache, memory, and associated hardware that
is required for proper booting of the operating system, when the system is powered on at
the initial installation or after a hardware configuration change (for example, an upgrade).
If a non-critical error is detected or if the error occurs in a resource that can be removed
from the system configuration, the booting process is designed to proceed to completion.
The errors are logged in the system nonvolatile random access memory (NVRAM). When
the operating system completes booting, the information is passed from the NVRAM to the