This soft copy for use by IBM employees only.
because port 2 corresponds to bit 2. If a DO was found on port 5, the value
will be 20 (hexadecimal), because port 5 corresponds to bit 5. If you are
unfamiliar with this notation, you may find it useful to translate the
hexadecimal values to binary first. Then you can see which bit is on and find
the port that matches it: simply count from right to left, starting at 0.
• It is not important to recognize whether the error was a TO, a TE, a DO, and
so on. This information is left in for historical reasons and may be useful to
development, but not for making a FRU call.
• For adapters, indicated as “VDC, id X”, the VDC_FLT and MSTAT are important
pieces of information.
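
The right-to-left bit counting described above can be sketched in a few lines of Python. This is only an illustration: the function name ports_from_mask is hypothetical, and it assumes the reported value is a plain hexadecimal bit mask in which bit n stands for port n.

```python
def ports_from_mask(hex_value: str) -> list[int]:
    """Return the port numbers whose bits are set in a hexadecimal mask.

    Assumption: bit n of the value corresponds to port n.
    """
    mask = int(hex_value, 16)
    # Bit 0 is the rightmost bit, so count from right to left starting at 0.
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


print(format(int("20", 16), "08b"))  # 00100000 (bit 5 is on)
print(ports_from_mask("20"))         # [5]
print(ports_from_mask("04"))         # [2]
```

Printing the value in binary first, as the text suggests, makes it easy to confirm by eye which bit is on.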
Master/Slave Miscompare (Only HiPS)
Knowing whether a master/slave miscompare has occurred between a switch
chip pair is critical in making a FRU call on faults reported by switch chips.
Recall that each switch chip function is actually performed by two physical chips.
Both chips are taking input from the rest of the system. The master chip is
providing the outputs to the rest of the system. The slave chip is using the
inputs from the rest of the system and checking the outputs from the master to
see if they make sense. When they do not make sense, an error is flagged.
Although this is an expensive way of checking for errors, it is quite thorough,
because in order to let an error escape, both chips would have to break in the
same way at the same time.
In the flt file, the master is noted as “chip 0” and the slave is “chip 1.” For
example,
    Switch, id 10036, chip 0 time = 006ef91f5a65
is a report from master chip 10036, while
    Switch, id 10036, chip 0 & 1 time=006ef91fa65
indicates that both the master and slave chip saw the same thing.
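
As a rough aid for scanning such entries, the sketch below pulls the switch id, the reporting chips, and the time stamp out of a line. The pattern is inferred from the two sample lines above only, so real flt files may need adjustments; the names ENTRY and parse_switch_entry are hypothetical.

```python
import re

# Pattern inferred from the two sample lines only; not an official format.
ENTRY = re.compile(
    r"Switch,\s*id\s+(?P<id>\d+),\s*chip\s+(?P<chips>[01](?:\s*&\s*[01])?)"
    r"\s+time\s*=\s*(?P<time>[0-9a-fA-F]+)"
)


def parse_switch_entry(line: str):
    """Return id, reporting roles, and time stamp, or None if no match."""
    m = ENTRY.search(line)
    if m is None:
        return None
    chips = [int(c) for c in re.findall(r"[01]", m.group("chips"))]
    # Per the text: chip 0 is the master, chip 1 is the slave.
    roles = ["master" if c == 0 else "slave" for c in chips]
    return {"id": m.group("id"), "roles": roles, "time": m.group("time")}


print(parse_switch_entry("Switch, id 10036, chip 0 time = 006ef91f5a65"))
print(parse_switch_entry("Switch, id 10036, chip 0 & 1 time=006ef91fa65"))
```

An entry that lists both roles means the master and slave saw the same thing; an entry with a single role is a report from that chip alone.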
If the master and slave report different errors, this is known as a
master/slave miscompare. It indicates that they each saw something different.
If a particular port is reported in error, by either the master or the slave,
there is a good chance that the problem is with either the board or the cable.
The cable may be the problem because it may be attenuating the signal so that
it reaches a voltage in no-man's land: one chip may decide it is a zero and the
other may decide it is a one, and there is your master/slave miscompare.
If the chips cannot narrow things down to a port, the odds increase that the
problem is with either the master or slave switch chip.
If the master and slave report the same error, then it is more likely that the
problem is on the other side of the link/cable, or it is the cable.
If the port called is not on board (port 0 through 3), you should always check the
cable and connectors for bent pins.
If an onboard port (port 4 through 7) is called, the FRU is the switch assembly
that houses the chip that is reporting the fault.
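
The rules of thumb above can be collected into a small decision helper. This is a minimal sketch, not a diagnostic tool: the function name triage and its arguments are hypothetical, and it assumes each chip's report has already been reduced to an error label (or None if the chip reported nothing) plus an optional port number.

```python
def triage(master_error, slave_error, port=None):
    """Return hints that follow the rules of thumb in the text.

    Assumptions: errors are plain labels (None if nothing was reported),
    and port is an integer 0 through 7, or None if no port was isolated.
    """
    hints = []
    if master_error != slave_error:
        # Master/slave miscompare: the two chips saw something different.
        if port is None:
            hints.append("miscompare, no port isolated: "
                         "suspect the master or slave switch chip")
        else:
            hints.append("miscompare on a port: suspect the board or the cable")
    else:
        hints.append("same error on both chips: "
                     "suspect the other side of the link, or the cable")
    if port is not None:
        if 0 <= port <= 3:
            hints.append("port not on board: "
                         "check the cable and connectors for bent pins")
        elif 4 <= port <= 7:
            hints.append("onboard port: FRU is the switch assembly "
                         "that houses the reporting chip")
    return hints


for hint in triage("TO", "TE", port=2):
    print(hint)
```

The hints are deliberately returned as a list, because the text treats the miscompare rule and the port rule as independent clues that are weighed together rather than as a single verdict.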
SP PD Guide
Printed in U.S.A. SG24-4778-00