
more than 1 bit to each ECC word. Therefore, the only way to get a multiple-bit memory error
from SDRAMs is if more than one SDRAM fails at the same time (a rare event). The system is also
resilient to any cosmic ray or alpha particle strike, because these failure modes can affect only
multiple bits within a single SDRAM. If a memory location is "bad", the physical page is dynamically
deallocated and replaced with a new page, without any OS or application interruption. In addition,
a combination of hardware and software scrubbing is used for memory. The software scrubber
periodically reads and writes all memory locations; however, it does not have access to
"locked down" pages, so a hardware memory scrubber is provided for full coverage. Finally, data is
protected by address/control parity protection.
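The scrubbing scheme can be pictured with a short sketch. The C fragment below is not HP firmware;
it is a minimal illustration, using a hypothetical page_is_locked_down() helper, of how a software
scrubber might periodically walk memory, read each word (letting the ECC logic correct any
single-bit error) and write the corrected value back, while skipping the locked-down pages that the
hardware scrubber covers.

    /*
     * Software-scrubber sketch (illustrative only, not HP code).
     * It walks a region page by page, reads every word so the ECC logic
     * can detect and correct single-bit errors, and writes the corrected
     * value back so the error does not persist in the SDRAM cell.
     */
    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>

    #define PAGE_SIZE 4096u

    /* Hypothetical: returns nonzero for pages the OS has locked down. */
    static int page_is_locked_down(const void *page)
    {
        (void)page;
        return 0;
    }

    static void scrub_region(volatile uint64_t *base, size_t bytes)
    {
        size_t words_per_page = PAGE_SIZE / sizeof(uint64_t);
        size_t total_words = bytes / sizeof(uint64_t);

        for (size_t w = 0; w < total_words; w += words_per_page) {
            volatile uint64_t *page = base + w;
            if (page_is_locked_down((const void *)page))
                continue;               /* hardware scrubber covers these */
            for (size_t i = 0; i < words_per_page && w + i < total_words; i++) {
                uint64_t v = page[i];   /* read: ECC corrects single-bit flips */
                page[i] = v;            /* write back the corrected value */
            }
        }
    }

    int main(void)
    {
        size_t bytes = 64 * PAGE_SIZE;
        uint64_t *buf = calloc(1, bytes);
        if (buf == NULL)
            return 1;
        scrub_region(buf, bytes);       /* a real scrubber repeats this periodically */
        free(buf);
        return 0;
    }

Writing the corrected value back matters because a single-bit error left in place could later be
joined by a second flip in the same ECC word, turning a correctable error into an uncorrectable one.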
Memory DRAM fault tolerance, i.e. recovery of a single SDRAM failure
DIMM address/control parity protection
Dynamic memory resilience, i.e. deallocation of bad memory pages during operation.
NOTE: Dynamic memory resilience is not supported when running Windows Server 2003,
SUSE SLES 9 or Red Hat RHEL AS 3 in the partition.
Hardware and software memory scrubbing
Redundant DC conversion
Cell ICAP.
NOTE: Cell ICAP is not supported when Windows Server 2003, SUSE SLES 9, or Red Hat RHEL AS 3
is running in the partition.
I/O: Partitions configured with dual-path I/O can be set up with no shared components between the
two paths, preventing a fault on one I/O card from affecting the other I/O path. I/O cards in
hardware partitions (nPars) are fully isolated from I/O cards in other nPars, so it is not possible
for an I/O failure to propagate across hard partitions. I/O cards can be dynamically repaired and
added to a running partition.
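As a rough illustration of what dual-path I/O provides at the software level, the sketch below is
not HP driver code; the path handles and the read_via_path() call are hypothetical. It simply shows
a request being retried on an independent secondary path when the primary path reports a fault,
which is safe precisely because the two paths share no components.

    /*
     * Dual-path I/O failover sketch (illustrative only, not HP driver code).
     * Two fully independent paths reach the same device; a fault on the
     * primary path is contained to that path, so the request can simply be
     * reissued on the secondary path.
     */
    #include <stdio.h>

    struct io_path { const char *name; int healthy; };

    /* Hypothetical transport call: returns 0 on success, -1 on a path fault. */
    static int read_via_path(struct io_path *p, void *buf, unsigned len)
    {
        (void)buf; (void)len;
        return p->healthy ? 0 : -1;
    }

    static int dual_path_read(struct io_path *primary, struct io_path *secondary,
                              void *buf, unsigned len)
    {
        if (read_via_path(primary, buf, len) == 0)
            return 0;
        /* Primary path faulted; the fault cannot reach the secondary path
         * because the two paths share no components. */
        return read_via_path(secondary, buf, len);
    }

    int main(void)
    {
        struct io_path a = { "path A", 0 };   /* simulate a failed primary */
        struct io_path b = { "path B", 1 };
        char buf[512];

        if (dual_path_read(&a, &b, buf, sizeof buf) == 0)
            printf("request completed on the surviving path\n");
        return 0;
    }

Because the retry goes through entirely separate hardware, a fault on the primary path cannot also
take out the path used for recovery.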
Full single-wire error detection and correction on I/O links
I/O cards fully isolated from each other
Hardware prevention of silent corruption of data going to I/O
On-line addition/replacement (OLAR) for individual I/O cards, some external peripherals, and SUB/HUB.
NOTE: Online addition/replacement (OLAR) is not supported when running Red Hat RHEL AS 3 or
SUSE SLES 9 in the partition.
Parity protected I/O paths
Dual path I/O
Crossbar and Cabinet Infrastructure:
Recovery of a single crossbar wire failure
Localization of crossbar failures to the partitions using the link
Automatic de-allocation of bad crossbar link upon boot
Redundant and hot-swap DC converters for the crossbar backplane
ASIC full burn-in and "high quality" production process
Full "test to failure" and accelerated life testing on all critical assemblies
Strong emphasis on quality for multiple-nPartition single points of failure (SPOFs)
System resilience to Management Processor (MP) failures
Isolation of nPartition failure
Protection of nPartitions against spurious interrupts or memory corruption
Hot-swap redundant fans (main and I/O) and power supplies (main and backplane power bricks)
Dual power source
Phone-Home capability
"HA Cluster-In-A-Box" Configuration
"HA Cluster-In-A-Box" Configuration
"HA Cluster-In-A-Box" Configuration
"HA Cluster-In-A-Box" Configuration: The "HA Cluster-In-A-Box" allows for failover of users'
applications between hardware partitions (nPars) on a single Superdome system. All providers of
mission critical solutions agree that failover between clustered systems provides the safest
availability-no single points of failures (SPOFs) and no ability to propagate failures between
systems. However, HP supports the configuration of HA cluster software in a single system to allow