
Inventory Management
HSS keeps track of hardware inventory to identify which hardware components are up and running. It uses the
xthwinv command to retrieve hardware component information.
Hardware Discovery
HSS plays an integral role in system hardware discovery. HSS components that play a role in this process include
the HSS database and the xtdiscover command. For more information, see the xtdiscover(8) man page.
Hardware Supervisory System (HSS) Ethernet/Management Network
The HSS network provides interconnectivity between the System Management Workstation (SMW), Rack
Controllers (RCs), and dual Aries Network Cards (dANCs) in a hierarchical fashion.
3.4.2 The xtdiscover Command
The xtdiscover command automatically discovers the hardware components on a Cray system and creates
entries in the system database to reflect the current hardware configuration. The xtdiscover command
identifies missing or non-responsive cabinets and empty or non-functioning Dual Aries Network Cards (dANCs).
The xtdiscover command and the state manager ensure that the system status represents the real state of the
hardware. When xtdiscover has finished, a system administrator can use the xtcli command to display the
current configuration. No previous configuration of the system is required; the hardware is discovered and made
available. Modifications can be made to components after xtdiscover has finished creating entries in the
system database.
The xtdiscover interface steps a system administrator through the discovery process.
Before performing component discovery, the xtdiscover command must first ensure that the Hardware
Supervisory System (HSS) networking is set up properly, using a user-provided block of IP address space. This
information is used to create the /etc/hosts file and DHCP entries for the HSS network. This setup typically
needs to be done only once, unless the address block is moved or a new rack is added.
TIP: Adding a rack within an existing address block does not affect the address
assignments for the existing racks. If additional racks are planned for the future, it is better to
configure networking for all of them at once. The xtdiscover command will automatically detect
whether each rack is presently in the system and will set the system state accordingly.
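
The property described in the TIP can be illustrated with a minimal Python sketch. It assumes a hypothetical allocation scheme in which each rack receives a fixed-size subnet chosen by its rack number; the function name, the example address block, and the /24 per-rack size are illustrative assumptions, not the actual xtdiscover behavior.

```python
import ipaddress


def assign_rack_subnets(block, rack_count, rack_prefixlen=24):
    """Map rack numbers 0..rack_count-1 to fixed-size subnets of `block`.

    Hypothetical scheme for illustration only: the subnet for a rack depends
    solely on the rack number, so adding racks later never renumbers the
    racks that are already assigned.
    """
    network = ipaddress.ip_network(block)
    subnets = network.subnets(new_prefix=rack_prefixlen)
    assignments = {}
    for rack in range(rack_count):
        try:
            assignments[rack] = next(subnets)
        except StopIteration:
            raise ValueError("address block too small for requested racks") from None
    return assignments


# Example address block (an assumption, not a documented default).
two_racks = assign_rack_subnets("10.1.0.0/16", 2)
three_racks = assign_rack_subnets("10.1.0.0/16", 3)

# The existing racks keep their subnets after a third rack is added.
unchanged = all(three_racks[rack] == two_racks[rack] for rack in two_racks)
print(unchanged)
```

Under such a scheme, reserving address space for every planned rack up front costs nothing, which is consistent with the advice to configure networking for all racks at once.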
If there are changes to the system hardware, such as populating a previously empty dANC, or adding an
additional rack, then xtdiscover must be executed again, and it will perform an incremental discovery of the
hardware changes. A full-system xtdiscover is not intended to be run while the High Speed Network (HSN) is
actively routing traffic. When new blades are added during system operation with xtwarmswap, however, a
mini-xtdiscover is automatically run to make the required updates to the database.
For more information, see the xtdiscover(8) man page.
3.4.3 Hardware Supervisory System (HSS) Component Location Discovery
Each Urika®-GX system rack is numbered starting at 0. Each sub-rack within a rack has a DIP switch that sets
the rack and sub-rack numbers. The Intelligent Subrack Control Board (iSCB) conveys the rack and sub-rack
numbers to the Aries Network Cards (ANCs) via an I2C bus. The Dual Aries Network Card (dANC) blade has a
slot-sense bit that tells it which dANC number it is within the sub-rack (0 or 1). The dANC uses the rack,
sub-rack, and dANC number to construct its hostname. The Rack Controller (RC) determines its rack number from
the location of the iSCB, encoded in a DHCP request sent by the iSCB and seen by the RC.
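
The way a dANC could combine its three location values into a hostname can be sketched in Python as follows. The r<rack>s<sub-rack>c<dANC> pattern and the function name are assumptions made for illustration; this section does not state the actual hostname format.

```python
def danc_hostname(rack, subrack, danc):
    """Build a dANC hostname from its location values.

    The "r{rack}s{subrack}c{danc}" pattern is a hypothetical format used
    only to illustrate the idea; it is not taken from this section.
    """
    if rack < 0 or subrack < 0:
        raise ValueError("rack and sub-rack numbers start at 0")
    if danc not in (0, 1):
        raise ValueError("the slot-sense bit allows only dANC 0 or 1")
    return f"r{rack}s{subrack}c{danc}"


# Rack 0, sub-rack 1, dANC 1 (values chosen for the example).
print(danc_hostname(0, 1, 1))
```

The range check on the dANC number mirrors the slot-sense bit described above, which can report only 0 or 1.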