IBM Power 770 and 780 (9117-MMD, 9179-MHD) Technical Overview and Introduction
Finally, if an uncorrectable error in memory is discovered, the logical memory block that is
associated with the address with the uncorrectable error is marked for deallocation by the
POWER Hypervisor. This deallocation takes effect on a partition reboot if the logical memory
block is assigned to an active partition at the time of the fault.
In addition, the system deallocates the entire memory group that is associated with the error
on all subsequent system reboots until the memory is repaired. This behavior is intended to
guard against future uncorrectable errors while waiting for parts replacement.
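The deferred deallocation described above can be pictured with a small model. This is only an illustrative sketch, not POWER Hypervisor code: the names `LogicalMemoryBlock`, `report_uncorrectable_error`, and `reboot_partition` are hypothetical, and the sketch assumes that a block not owned by an active partition can be deallocated immediately, which the text implies but does not state.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass
class LogicalMemoryBlock:
    """Hypothetical stand-in for a logical memory block (LMB)."""
    block_id: int
    owner: Optional[str] = None      # partition name, or None if unassigned
    pending_dealloc: bool = False    # marked, waiting for a partition reboot
    deallocated: bool = False        # no longer usable


def report_uncorrectable_error(block: LogicalMemoryBlock,
                               active_partitions: Set[str]) -> None:
    """Mark the block for deallocation after an uncorrectable error."""
    if block.owner in active_partitions:
        # Block is in use: defer the deallocation to the partition reboot.
        block.pending_dealloc = True
    else:
        # Assumption: an unused block can be removed right away.
        block.deallocated = True


def reboot_partition(name: str, blocks: Iterable[LogicalMemoryBlock]) -> None:
    """Apply any deallocations that were deferred for this partition."""
    for block in blocks:
        if block.owner == name and block.pending_dealloc:
            block.pending_dealloc = False
            block.deallocated = True
            block.owner = None


if __name__ == "__main__":
    lmb = LogicalMemoryBlock(block_id=7, owner="lpar1")
    report_uncorrectable_error(lmb, active_partitions={"lpar1"})
    print(lmb)   # still owned, deallocation pending
    reboot_partition("lpar1", [lmb])
    print(lmb)   # deallocated after the partition reboot
```

The model keeps "marked" and "deallocated" as separate states because the text distinguishes the moment the fault is recorded from the moment the deallocation takes effect.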
Memory persistent deallocation
Defective memory that is discovered at boot time is automatically switched off. If the service
processor detects a memory fault at boot time, it marks the affected memory as bad so that it
is not used on subsequent reboots.
If the service processor identifies faulty memory in a server that includes CoD memory, the
POWER Hypervisor attempts to replace the faulty memory with available CoD memory. Faulty
resources are marked as deallocated, and working resources are included in the active
memory space. Because these activities reduce the amount of CoD memory available for
future use, schedule repair of the faulty memory as soon as convenient.
Upon reboot, if not enough memory is available to meet minimum partition requirements, the
POWER Hypervisor reduces the capacity of one or more partitions.
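The following is a rough arithmetic sketch of the two paragraphs above, not a description of the actual firmware. The function `plan_boot_memory` and its parameters are hypothetical, and the order in which partitions are reduced (alphabetical by name) is purely an assumption, because the text does not say how the POWER Hypervisor chooses which partitions to shrink.

```python
from typing import Dict, Tuple


def plan_boot_memory(installed_gb: int,
                     faulty_gb: int,
                     cod_spare_gb: int,
                     partition_minimums: Dict[str, int]
                     ) -> Tuple[int, int, Dict[str, int]]:
    """Return (usable_gb, cod_left_gb, memory granted per partition)."""
    # Faulty memory is deallocated; inactive CoD memory replaces as much
    # of it as is available.
    replaced = min(faulty_gb, cod_spare_gb)
    usable = installed_gb - faulty_gb + replaced
    cod_left = cod_spare_gb - replaced

    # If the partition minimums no longer fit, some capacity must be cut.
    shortfall = max(0, sum(partition_minimums.values()) - usable)

    # Assumed policy (not from the source): trim partitions in name order.
    grants = dict(partition_minimums)
    for name in sorted(grants):
        if shortfall == 0:
            break
        cut = min(grants[name], shortfall)
        grants[name] -= cut
        shortfall -= cut
    return usable, cod_left, grants


if __name__ == "__main__":
    # 16 GB faulty, only 8 GB of spare CoD memory: one partition is reduced.
    print(plan_boot_memory(64, 16, 8, {"lpar_a": 32, "lpar_b": 32}))
```

In the example call, the spare CoD memory covers only half of the faulty memory, so the CoD pool is exhausted and 8 GB is still missing. This is why the text recommends scheduling the repair promptly.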
Depending on the configuration of the system, the IBM HMC Service Focal Point™, the OS
Service Focal Point, or the service processor receives a notification of the failed component
and triggers a service call.
4.2.4 Active Memory Mirroring for Hypervisor
Active Memory Mirroring (AMM) for Hypervisor is a hardware and firmware function of
Power 770 and Power 780 systems that enables the POWER7+ chip to create two copies of data
in memory. Having two copies prevents a system-wide outage that would otherwise result from
an uncorrectable failure of a single DIMM in the main memory used by the hypervisor (also
called system firmware). This capability is standard and enabled by default on the
Power 780 server. On the Power 770, it is an optional, chargeable feature.
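As a conceptual analogy only, the idea of keeping two copies can be sketched in software. AMM itself is implemented in hardware and firmware. The class `MirroredRegion` and its methods are hypothetical and do not correspond to any IBM interface.

```python
class MirroredRegion:
    """Toy model of a memory region that is kept in two copies."""

    def __init__(self, size: int) -> None:
        self.copies = [bytearray(size), bytearray(size)]
        self.failed = [False, False]

    def write(self, offset: int, data: bytes) -> None:
        # Every write goes to each copy that is still healthy.
        for index, copy in enumerate(self.copies):
            if not self.failed[index]:
                copy[offset:offset + len(data)] = data

    def fail_copy(self, index: int) -> None:
        # Simulate an uncorrectable failure in the memory behind one copy.
        self.failed[index] = True

    def read(self, offset: int, length: int) -> bytes:
        # Serve the read from the first healthy copy.
        for index, copy in enumerate(self.copies):
            if not self.failed[index]:
                return bytes(copy[offset:offset + length])
        raise RuntimeError("both copies have failed")


if __name__ == "__main__":
    region = MirroredRegion(16)
    region.write(0, b"hypervisor")
    region.fail_copy(0)            # lose the primary copy
    print(region.read(0, 10))      # data is still readable from the mirror
```

The point of the sketch is that a single failure leaves one intact copy, so reads continue instead of causing an outage. Only the loss of both copies is fatal.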
Handling failures: Memory page deallocation handles single cell failures, but because of
the size of data in a data bit line, it might be inadequate for handling more catastrophic
failures.