Relion 1900e/2900e Manual
Revision 1.0
7.2.3.2 Fault Resilient Booting (FRB)
Fault resilient booting (FRB) is a set of BIOS and BMC algorithms and hardware support that allows a
multiprocessor system to boot even if the bootstrap processor (BSP) fails. Only FRB2 is supported, and it is
implemented using watchdog timer commands.
FRB2 refers to the FRB algorithm that detects system failures during POST. The BIOS uses the BMC
watchdog timer to back up its operation during POST. The BIOS configures the watchdog timer to indicate
that the BIOS is using the timer for the FRB2 phase of the boot operation.
After the BIOS has identified and saved the BSP information, it sets the FRB2 timer use bit and loads the
watchdog timer with the new timeout interval.
If the watchdog timer expires while the watchdog use bit is set to FRB2, the BMC (if so configured) logs a
watchdog expiration event showing the FRB2 timeout in the event data bytes. The BMC then hard resets the
system, provided that the BIOS selected reset as the watchdog timeout action.
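The timer setup described above can be sketched as the request payload of the IPMI 2.0 Set Watchdog Timer command (NetFn App 06h, command 24h). This is only an illustration of the byte layout defined by the IPMI specification, not the BIOS implementation: the function name and the 360-second timeout in the usage note are assumptions, since this manual does not state the FRB2 timeout value.

```python
from enum import IntEnum

# Command identifiers from the IPMI 2.0 specification.
IPMI_NETFN_APP = 0x06
IPMI_CMD_SET_WATCHDOG_TIMER = 0x24


class TimerUse(IntEnum):
    """Timer use field, bits [2:0] of request byte 1."""
    BIOS_FRB2 = 0x01
    BIOS_POST = 0x02
    OS_LOAD = 0x03
    SMS_OS = 0x04
    OEM = 0x05


class TimeoutAction(IntEnum):
    """Timeout action field, bits [2:0] of request byte 2."""
    NO_ACTION = 0x00
    HARD_RESET = 0x01
    POWER_DOWN = 0x02
    POWER_CYCLE = 0x03


def build_set_watchdog_request(timer_use, action, timeout_seconds, dont_log=False):
    """Return the 6-byte Set Watchdog Timer request data (hypothetical helper)."""
    counts = round(timeout_seconds * 10)  # countdown is in 100 ms units
    if not 0 <= counts <= 0xFFFF:
        raise ValueError("timeout does not fit in the 16-bit countdown field")
    byte1 = (0x80 if dont_log else 0x00) | int(timer_use)
    byte2 = int(action)          # pre-timeout interrupt bits [6:4] left at 0 (none)
    byte3 = 0x00                 # pre-timeout interval, unused here
    byte4 = 1 << int(timer_use)  # clear the expiration flag for this timer use
    # Initial countdown value is sent least significant byte first.
    return bytes([byte1, byte2, byte3, byte4, counts & 0xFF, counts >> 8])
```

For example, build_set_watchdog_request(TimerUse.BIOS_FRB2, TimeoutAction.HARD_RESET, 360) yields the bytes 01 01 00 02 10 0E, where 360 seconds is an assumed value chosen only for illustration.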
The BIOS is responsible for disabling the FRB2 timeout before initiating the option ROM scan and before
displaying a request for a boot password. If the processor fails and causes an FRB2 timeout, the BMC resets
the system.
The BIOS gets the watchdog expiration status from the BMC. If the status shows an expired FRB2 timer, the
BIOS enters the failure in the system event log (SEL). The BIOS writes the last POST code generated during
the previous boot attempt into the OEM bytes of the SEL entry. FRB2 failure is not reflected in the processor
status sensor value.
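The status check described above can be sketched as a decoder for the IPMI 2.0 Get Watchdog Timer response (command 25h). The field layout follows the IPMI specification; the class and function names are hypothetical, and the input is assumed to be the eight response data bytes that follow the completion code.

```python
from dataclasses import dataclass

FRB2_EXPIRED_FLAG = 0x02  # bit 1 of the timer use expiration flags byte


@dataclass(frozen=True)
class WatchdogStatus:
    timer_use: int            # 1 = BIOS FRB2
    running: bool             # timer currently counting down
    timeout_action: int       # 1 = hard reset
    frb2_expired: bool        # FRB2 expiration flag is set
    present_countdown_s: float


def parse_get_watchdog_response(data: bytes) -> WatchdogStatus:
    """Decode Get Watchdog Timer response data (completion code already removed)."""
    if len(data) != 8:
        raise ValueError("expected 8 bytes of response data")
    return WatchdogStatus(
        timer_use=data[0] & 0x07,
        running=bool(data[0] & 0x40),
        timeout_action=data[1] & 0x07,
        frb2_expired=bool(data[3] & FRB2_EXPIRED_FLAG),
        # Present countdown is a 16-bit little-endian count of 100 ms units.
        present_countdown_s=int.from_bytes(data[6:8], "little") / 10,
    )
```

A caller would test the frb2_expired field of the result to decide whether the previous boot attempt ended in an FRB2 timeout.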
The FRB2 failure does not affect the front panel LEDs.
7.2.3.3 POST Code Display
The BMC, upon receiving standby power, initializes internal hardware to monitor port 80h (POST code)
writes. Data written to port 80h is output to the system POST LEDs. The BMC deactivates the POST LEDs after
POST has completed. Refer to Appendix D for a complete list of supported POST Code Diagnostic LEDs.
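As a rough illustration of how a port 80h byte maps onto eight diagnostic LEDs, the sketch below expands a POST code into per-LED on/off states. The function name and the most-significant-bit-first ordering are assumptions; Appendix D remains the reference for the actual LED assignment.

```python
def post_code_to_leds(code: int) -> list:
    """Return 8 booleans (assumed order: bit 7 first) for a port 80h POST code."""
    if not 0 <= code <= 0xFF:
        raise ValueError("POST code must fit in one byte")
    # True means the LED for that bit is lit.
    return [bool((code >> bit) & 1) for bit in range(7, -1, -1)]
```

For example, POST code A5h (binary 1010 0101) would light the LEDs for bits 7, 5, 2, and 0 under this assumed ordering.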
7.2.4 Watchdog Timer
The BMC implements a fully IPMI 2.0-compatible watchdog timer. For details, see the Intelligent Platform
Management Interface Specification Second Generation v2.0. The NMI/diagnostic interrupt for an IPMI 2.0
watchdog timer is associated with an NMI. A watchdog pre-timeout SMI or equivalent signal assertion is not
supported.
7.2.5 System Event Log (SEL)
The BMC implements the system event log as specified in the Intelligent Platform Management Interface
Specification, Version 2.0. The SEL is accessible regardless of the system power state through the BMC's
in-band and out-of-band interfaces.
The BMC allocates 95231 bytes (approximately 93 KB) of non-volatile storage space to store system events.
The SEL timestamps may not be in order. Up to 3,639 SEL records can be stored at a time. Because the SEL is
circular, any command that results in an overflow of the SEL beyond the allocated space will overwrite the
oldest entries in the SEL, while setting the overflow flag.
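The circular behavior described above can be modeled with a bounded queue. This is a minimal sketch of the overwrite-oldest policy and the overflow flag only, not the BMC firmware; the class name and methods are hypothetical, and the default capacity reuses the 3,639-record figure stated above.

```python
from collections import deque


class CircularSel:
    """Toy model of a circular event log that overwrites its oldest entries."""

    def __init__(self, capacity: int = 3639):
        self._records = deque(maxlen=capacity)
        self.overflow = False

    def add(self, record) -> None:
        # When the log is already full, the append below drops the oldest
        # record, so the overflow flag is set first.
        if len(self._records) == self._records.maxlen:
            self.overflow = True
        self._records.append(record)

    def records(self) -> list:
        """Return the stored records, oldest first."""
        return list(self._records)
```

With a capacity of 3, adding four records leaves the last three in the log and sets the overflow flag.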
7.3 Sensor Monitoring
The BMC monitors system hardware and reports system health. The information gathered from physical
sensors is translated into IPMI sensors as part of the “IPMI Sensor Model”. The BMC also reports various