This soft copy for use by IBM employees only.
node with the next highest IP address (and so on), until the ring is completed by the node with the lowest IP address pinging the Control Workstation.
If there is no response to several retries of the ping packet within the timeout period, that heartbeat daemon notifies the group leader that connectivity has been lost over en0, and the group leader then takes that node out of the ring. The group leader notifies all the nodes of this action so that the ring can be re-established without the node that is not responding over en0. The heartbeat daemon on the Control Workstation passes the information about this node up to the host_responds daemon on the Control Workstation, which in turn passes it up to the SDR daemon to update the host_responds object class of the SDR.
The spmon command uses this object class to find out the host_responds status of all the nodes. It may take a few seconds before the loss of connectivity is reflected by the node changing from green to red on the spmon GUI, because of the timeout period mentioned before and the time spmon takes to read the object class again and refresh the host_responds display in the GUI.
The heartbeat daemon on the Control Workstation sends out regular ping packets (known as proclaim packets) to any nodes that have been marked as missing from the ring in the host_responds object class, to test whether connectivity has returned, and reintegrates them into the ring if that is the case.
If a node goes red on the spmon GUI, run the following command:
# SDRGetObjects host_responds
Here is an example of the output:
node_number  host_responds
1            1
3            1
5            1
6            1
7            1
8            1
9            1
10           1
11           1
12           0
In this example, all the nodes have en0 connectivity except node 12, which has an entry of 0 (signifying a loss of connectivity).
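When there are many nodes, it can help to filter the listing down to the nodes that are not responding. The following is a minimal sketch, not part of the original guide. It assumes the command prints one header line followed by one row per node, with the node number and the host_responds value in two whitespace-separated columns. The function name down_nodes is hypothetical.

```shell
# Sketch: print the node numbers whose host_responds value is 0.
# Assumes a single header line, then rows of the form
#   <node_number> <host_responds>
# separated by whitespace. The function name is hypothetical.
down_nodes() {
    awk 'NR > 1 && NF >= 2 && $2 == 0 { print $1 }'
}

# Intended use on the Control Workstation (not run here):
#   SDRGetObjects host_responds | down_nodes
```

Reading from standard input keeps the filter independent of how the listing is obtained, so it can also be run against a saved copy of the output.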
In such a case, first test the connectivity manually by pinging the en0 interface. If the ping is successful, wait a few seconds to see if spmon will refresh. If it does not, then check that the heartbeat, SDR, and host_responds daemons are running on the Control Workstation, and that the heartbeat daemon is running on the problematic node.
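A daemon check of this kind can be scripted. The following is a minimal sketch, not part of the original guide. It assumes only that the daemon can be recognized by a substring of its process name in a process listing. The function name proc_running is hypothetical, and the pattern to search for is left to the caller.

```shell
# Sketch: succeed (exit status 0) if the process listing on standard
# input contains the pattern given as the first argument.
# Lines that belong to grep itself are dropped first, so the search
# command is never counted as a match. The function name is hypothetical.
proc_running() {
    grep -v grep | grep -- "$1" >/dev/null
}

# Intended use (not run here), with a pattern chosen by the caller:
#   ps -ef | proc_running hr && echo "a matching process is running"
```

A short pattern can also match unrelated processes whose names happen to contain it, so a match is a hint to look at the full listing rather than proof that the daemon is healthy.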
Examples have already been given for the heartbeat and SDR daemons; the following is an example for the host_responds daemon. It shows how to find out whether it is running, what its inittab entries are, and how to start it if it has died:
# ps -ef | grep hr
Following is an example of the process name output from this command:

Printed in U.S.A. SG24-4778-00