
VMware white paper

It is recommended that FT primary virtual machines be distributed across multiple hosts and, as a general rule of thumb, that the number 
of FT virtual machines be limited to four per host. In addition to avoiding saturation of the network link, this limit reduces 
the number of simultaneous live migrations required to create new secondary virtual machines in the event of a host failure. 

2.8. DRS and VMotion

DRS takes into account the additional CPU and memory resources used by the secondary virtual machine in the cluster, but 
DRS does not migrate FT-enabled virtual machines to load balance the cluster. If either the primary or the secondary dies, a new 
secondary is spawned and placed on the candidate host determined by HA. That host may not be the optimal placement for load 
balancing; however, either the primary or the secondary virtual machine can be manually migrated with VMotion to a 
different host as needed.

2.9. Timer Interrupts

Though timer interrupts do not significantly impact FT performance, all timer interrupt events must be recorded at the primary  
and replayed at the secondary. This means that having a lower timer interrupt rate results in a lower volume of FT logging traffic.  
The following table illustrates this.

Guest OS                          Timer interrupt rate   Idle VM FT traffic
RHEL 5.0 64-bit                   1000 Hz                1.43 Mbits/sec
SLES 10 SP2 32-bit                250 Hz                 0.68 Mbits/sec
Windows 2003 Datacenter Edition   82 Hz                  0.15 Mbits/sec

Where possible, lowering the timer interrupt rate is recommended. See KB article 1005802 for more information on how to reduce 
timer interrupt rates for Linux guest operating systems.

2.10. Fault Tolerance Logging Bandwidth Sizing Guideline

As described in section 1.2, FT logging network traffic depends on the number of non-deterministic events and external inputs that 
need to be recorded at the primary virtual machine. Since the majority of this traffic usually consists of incoming network packets 
and disk reads, it is possible to estimate the amount of FT logging network bandwidth (in Mbits/sec) required for the virtual machine 
using the following formula:

FT logging bandwidth ~= [ (Average disk read throughput in Mbytes/sec * 8) + Average network receives (Mbits/sec) ] * 1.2 

In addition to the inputs to the virtual machine, this formula reserves 20 percent additional networking bandwidth for recording  
non-deterministic CPU events and for the TCP/IP headers.
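
The sizing formula above can be turned into a small calculation. The following Python sketch is illustrative only: the function name, parameter names, and the sample workload figures (10 Mbytes/sec of disk reads, 40 Mbits/sec of network receives) are assumptions made for the example and do not come from the measurements in this paper.

```python
def ft_logging_bandwidth_mbps(disk_read_mbytes_per_sec: float,
                              network_rx_mbits_per_sec: float,
                              headroom: float = 1.2) -> float:
    """Estimate FT logging bandwidth in Mbits/sec.

    Implements: [(disk reads in Mbytes/sec * 8) + network receives in Mbits/sec] * 1.2
    The default headroom of 1.2 is the 20 percent reserve from the formula.
    """
    if disk_read_mbytes_per_sec < 0 or network_rx_mbits_per_sec < 0:
        raise ValueError("throughput values must be non-negative")
    # Convert disk read throughput from Mbytes/sec to Mbits/sec.
    disk_read_mbits_per_sec = disk_read_mbytes_per_sec * 8
    return (disk_read_mbits_per_sec + network_rx_mbits_per_sec) * headroom


if __name__ == "__main__":
    # Hypothetical workload: 10 Mbytes/sec disk reads, 40 Mbits/sec network receives.
    estimate = ft_logging_bandwidth_mbps(10.0, 40.0)
    print(f"Estimated FT logging bandwidth: {estimate:.1f} Mbits/sec")
```

For this hypothetical workload the estimate is (10 * 8 + 40) * 1.2 = 144 Mbits/sec. Summing the estimates of all FT virtual machines on a host gives a rough check of whether the FT logging link is likely to be saturated.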

3. Fault Tolerance Performance

This section discusses the performance characteristics of Fault Tolerant virtual machines using a variety of micro-benchmarks and 
real-life workloads. Micro-benchmarks were used to stress the CPU, disk, and network subsystems individually by driving them to 
saturation. Real-life workloads, on the other hand, were chosen to be representative of what most customers would run and 
were configured to have a CPU utilization of 60 percent in steady state. Identical hardware test beds were used for all the 
experiments, and the performance comparison was done by running the same workload on the same virtual machine with and 
without FT enabled. The hardware and experimental setup details are provided in the Appendix. For each experiment, the traffic on 
the FT logging NIC during the steady state portion of the workload is also provided as a reference. 

3.1. SPECjbb2005

SPECjbb2005 is an industry standard benchmark that measures Java application performance with particular stress on CPU and 
memory. The workload is memory intensive and saturates the CPU but does little I/O. Because this workload saturates the CPU and 
generates little logging traffic, its FT performance is dependent on how well the secondary can keep pace with the primary. 

 
