SIERRA VIDEO SYSTEMS
If the standby processor has determined that it is “sick” (it is flashing its LEDd), it will refuse to
take on the job of master processor, even if the other processor is not present. A processor can
only be master processor when it considers itself healthy.
Therefore, the master processor will never flash LEDd. If the master processor thinks it is sick, it
will relinquish control to the standby processor. The standby processor will normally never flash
LEDf, but will instead take over as master processor if it detects that the master processor is
having problems. However, if the standby processor also thinks that it is sick, it will flash both
LEDd and LEDf, and will not take over as master processor. If both processors are sick, they can
both remain in a standby state, in which case the router will be inoperative.
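The rules above can be summarized in a short sketch. This is only an illustration of the behavior described in this section, not the router firmware; the names Processor, elect_master, and standby_leds are hypothetical, and the treatment of LEDf (flashed only when a sick standby also sees a sick master) is an assumption drawn from the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    """Hypothetical model of one redundant processor."""
    name: str
    present: bool = True
    sick: bool = False  # True when the processor has detected a fault in itself


def elect_master(master, standby):
    """Return the processor that should be in control, or None if the router is inoperative."""
    # A processor may only be master while it considers itself healthy.
    if master.present and not master.sick:
        return master
    # The standby takes over only if it is itself healthy.
    if standby.present and not standby.sick:
        return standby
    # Both are sick (or absent): both stay in standby and the router is inoperative.
    return None


def standby_leds(master, standby):
    """Return the set of LEDs the standby processor flashes (assumed interpretation)."""
    leds = set()
    if standby.sick:
        leds.add("LEDd")          # the standby itself is sick
        if master.present and master.sick:
            leds.add("LEDf")      # the master is also sick, but the standby cannot take over
    return leds
```

For example, with a sick master and a healthy standby, elect_master returns the standby; with both sick it returns None and standby_leds reports both LEDd and LEDf.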
When the standby processor takes over, all control outputs (in particular, crosspoint control and
serial port transmit data) are switched over to the new processor. The crosspoint data is then
sent to the crosspoint matrices to make sure that the actual crosspoint state matches the new
master processor state.
Normally, this will cause no change in the crosspoint state. However, if the new master had not
yet completed synchronization with the old master before it took over control, it is possible that
some crosspoints will change state. The new master processor also sends the state of all
crosspoints to all control panels, but not to ports running host protocol. Instead, every port that
is running host protocol will receive a “G REDUNDANT_PROCESSOR” command, with the <new_master>
argument set to 1. Host application software, upon receiving such a command, should assume
that any data it has cached from the previous master may be invalid. This includes crosspoint
data, panel configuration data, and any other data. The host application should then request the
data again from the new master processor.
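A host application might handle this notification along the following lines. This is a minimal sketch under stated assumptions: the wire format of the host protocol is not shown in this section, so the command is assumed to have already been parsed into a command letter, a keyword, and a dictionary of arguments. The HostCache class and handle_command function are hypothetical names, not part of any Sierra Video Systems software.

```python
class HostCache:
    """Hypothetical cache of router data held by a host application."""

    def __init__(self):
        self.crosspoints = {}      # cached crosspoint state
        self.panel_config = {}     # cached panel configuration data
        self.other = {}            # any other cached router data
        self.needs_refresh = False

    def invalidate(self):
        """Discard everything cached from the previous master."""
        self.crosspoints.clear()
        self.panel_config.clear()
        self.other.clear()
        self.needs_refresh = True  # caller should re-request data from the new master


def handle_command(cache, letter, keyword, args):
    """Invalidate the cache on a takeover notification.

    Assumes the command was already parsed, for example into
    ("G", "REDUNDANT_PROCESSOR", {"new_master": 1}).
    Returns True if a takeover was recognized.
    """
    if letter == "G" and keyword == "REDUNDANT_PROCESSOR" and args.get("new_master") == 1:
        cache.invalidate()
        return True
    return False
```

After handle_command returns True, the application would re-request crosspoint and configuration data using whatever query commands it normally uses.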
At takeover, if synchronization had not been completed, the new master also resets all control
panels, because that is the only way it can find out what types of control panels are present.
Finally, the new master begins normal operation.
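The takeover sequence described above can be laid out as an ordered list of steps. The sketch below only restates the text in this section; takeover_steps is a hypothetical helper, and the step descriptions are summaries rather than actual firmware operations.

```python
from typing import List


def takeover_steps(sync_complete: bool) -> List[str]:
    """Return the takeover actions in order, as described in this section."""
    steps = [
        "switch control outputs (crosspoint control, serial transmit data) to the new master",
        "send crosspoint data to the crosspoint matrices",
        "send the state of all crosspoints to all control panels",
        "send G REDUNDANT_PROCESSOR with new_master=1 to ports running host protocol",
    ]
    if not sync_complete:
        # Without completed synchronization, a reset is the only way for the
        # new master to learn which types of control panels are present.
        steps.append("reset all control panels")
    steps.append("begin normal operation")
    return steps
```

Calling takeover_steps(False) includes the control panel reset; calling takeover_steps(True) omits it.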
With this activity taking place at takeover, users may notice a period of a few seconds during
which control panels and host and terminal commands have a delayed response.
At takeover, the master processor’s terminal protocol displays a message to the user to indicate
that a new master processor is in control, and, when […] is pressed, the router configuration
screen is displayed. This serves as a clear notification to the user that a takeover has occurred.
Periodic Testing of the Standby Processor
Although the LEDs indicate whether or not a processor has detected a problem with itself, a
processor can still be unhealthy without detecting it. For example, a processor might have a
burned-out control line for controlling the crosspoints and have no way to detect this
situation.
Unless the customer takes steps to periodically test the standby processor, there is no assurance
that it is fully functional and ready to take over if the master processor goes down.
The only way to reliably test the standby processor is to force it to become the master processor.
This can be done with a host protocol command, or simply by using the terminal interface and the
“T” screen described above. As mentioned earlier, a switch from one processor to the other can
cause a temporary (several seconds) slowdown in router response. Therefore, switching should
be done at a time when it will not inconvenience users. It is not necessary to switch back again
after making sure the router continues to function properly with the other processor.
Instead, you can simply switch processors at a regular interval, for example once a month. That
way, the preferred master is in control half the time and the preferred standby processor is in
control the other half.
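A site that automates this routine might track when the last switch took place and flag when the next one is due. The sketch below is one possible approach, not a feature of the router; switch_due is a hypothetical helper and the 30-day default simply approximates the "once a month" suggestion. The switch itself would still be issued through the host protocol or the terminal interface at a time that does not inconvenience users.

```python
from datetime import date, timedelta


def switch_due(last_switch: date, today: date, interval_days: int = 30) -> bool:
    """Return True when at least interval_days have passed since the last processor switch."""
    return today - last_switch >= timedelta(days=interval_days)
```

For example, switch_due(date(2024, 1, 1), date(2024, 1, 31)) is True, because exactly 30 days have elapsed.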