4. VMware Fault Tolerance Performance Summary
All Fault Tolerance solutions rely on redundancy. Additional CPU and memory resources are required to mirror the execution of a
running virtual machine instance. Some CPU is also required for recording, transferring, and replaying log events; the amount
depends mostly on incoming I/O. If the primary virtual machine is constantly busy and resource constraints at the secondary
prevent it from catching up, the primary virtual machine is de-scheduled to allow the secondary to catch up.
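The catch-up behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not VMware's scheduler: the function name, the per-tick rates, and the lag threshold are all hypothetical, and the model simply pauses the primary for a tick whenever the secondary falls too far behind.

```python
def simulate_catch_up(primary_rate, secondary_rate, max_lag, ticks):
    """Toy model: the primary emits log events, the secondary replays them.

    All parameters are hypothetical units (events per tick, events of lag).
    Returns (events executed by the primary, final lag in events).
    """
    lag = 0            # events recorded by the primary but not yet replayed
    primary_done = 0   # total events the primary was allowed to execute
    for _ in range(ticks):
        # Assumption: the primary is de-scheduled for a whole tick
        # whenever the secondary is more than max_lag events behind.
        produced = 0 if lag > max_lag else primary_rate
        primary_done += produced
        # The secondary replays up to secondary_rate events per tick.
        lag = max(0, lag + produced - secondary_rate)
    return primary_done, lag


# A constrained secondary (80 events/tick) behind a busy primary (100 events/tick).
done, lag = simulate_catch_up(primary_rate=100, secondary_rate=80, max_lag=200, ticks=1000)
print(done, lag)
```

In this model the primary's long-run throughput is capped by the replay rate of the secondary, which is why sufficient CPU headroom on the secondary host matters.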
The round-trip network latency between the primary and secondary hosts adds to the I/O latency of disk writes and network
transmits. The impact on disk write operations, however, is minimal, since the round-trip latency is usually only on the order of
a few hundred microseconds, whereas disk I/O operations have latencies in milliseconds.
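The scale of this effect can be checked with simple arithmetic. The sketch below assumes that exactly one logging round trip is added to each disk write; the function name and the sample values are illustrative assumptions, not measurements from this paper.

```python
def relative_write_overhead(disk_latency_ms, round_trip_us):
    """Fractional increase in disk write latency from one added round trip.

    Assumption: each write waits for a single primary-to-secondary round trip.
    """
    return (round_trip_us / 1000.0) / disk_latency_ms


# Illustrative values only: a 5 ms disk write and a 300 microsecond round trip.
overhead = relative_write_overhead(disk_latency_ms=5.0, round_trip_us=300.0)
print(f"{overhead:.1%}")
```

Under these assumed values the added latency is a small fraction of the write itself, and the fraction shrinks further as disk latency grows.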
When there is sufficient CPU headroom for record/replay and sufficient network bandwidth to handle the logging traffic, enabling
FT has very little impact on throughput. Real-life workloads exhibit a very small, generally user-imperceptible increase in
latency with Fault Tolerance enabled.
5. Conclusion
VMware Fault Tolerance is a revolutionary new technology that VMware is introducing with vSphere. The architecture and design of
VMware vLockstep technology allow hardware-style Fault Tolerance on single-CPU virtual machines with minimal impact on
performance. Experiments with a wide variety of synthetic and real-life workloads show that the performance impact on throughput
and latency is small. These experiments also demonstrate that a Gigabit link is sufficient for even the most demanding workloads.