IA-32 Intel® Architecture Optimization
The cmov and fcmov instructions are available on the Pentium II and
subsequent processors, but not on Pentium processors and earlier 32-bit
Intel architecture processors. Be sure to check whether a processor
supports these instructions with the cpuid instruction.
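The check described above can be sketched in C. This is a minimal sketch, not a definitive implementation: it assumes a GCC or Clang toolchain that provides the <cpuid.h> helper __get_cpuid, and the function names read_feature_flags_edx, has_cmov, and has_fcmov are hypothetical. It relies on the CPUID leaf 1 feature flags in EDX, where bit 15 reports CMOV and bit 0 reports an on-chip x87 FPU; fcmov is usable only when both are set.

```c
#include <stdbool.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper header (assumed toolchain) */
#endif

/* CPUID leaf 1 feature flags returned in EDX. */
#define CPUID_EDX_FPU  (1u << 0)    /* x87 FPU on chip */
#define CPUID_EDX_CMOV (1u << 15)   /* conditional move instructions */

/* Hypothetical helper: returns the leaf 1 EDX flags, or 0 when CPUID
   leaf 1 is unavailable or the target is not IA-32/x86-64. */
static unsigned int read_feature_flags_edx(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return edx;
#endif
    return 0;
}

/* cmov is available when the CMOV flag is set. */
static bool has_cmov(unsigned int edx)
{
    return (edx & CPUID_EDX_CMOV) != 0;
}

/* fcmov additionally requires the x87 FPU flag. */
static bool has_fcmov(unsigned int edx)
{
    return (edx & CPUID_EDX_CMOV) != 0 && (edx & CPUID_EDX_FPU) != 0;
}
```

A caller would typically read the flags once at startup, for example has_cmov(read_feature_flags_edx()), and select a code path that avoids cmov when the result is false.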
Spin-Wait and Idle Loops
The Pentium 4 processor introduces a new pause instruction, which is
architecturally a nop on all IA-32 implementations. To the Pentium 4
processor, this instruction acts as a hint that the code sequence is a
spin-wait loop. Without a pause instruction in such loops, the
Pentium 4 processor may suffer a severe penalty when exiting the loop
because the processor may detect a possible memory order violation.
Inserting the pause instruction significantly reduces the likelihood of
a memory order violation and, as a result, improves performance.
In Example 2-4, the code spins until memory location A matches the
value stored in the register eax. Such code sequences are common when
protecting a critical section, in producer-consumer sequences, for
barriers, or for other synchronization.
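A spin-wait loop of the kind described above can be sketched in C. This is an illustrative sketch only, not the manual's own example: it assumes a C11 compiler with <stdatomic.h> and, on IA-32 and x86-64 targets, the _mm_pause intrinsic from <immintrin.h>, which emits the pause instruction. The names cpu_relax and spin_until_equal are hypothetical; the returned iteration count is included only to make the behavior observable.

```c
#include <stdatomic.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <immintrin.h>            /* _mm_pause() emits the pause instruction */
#define cpu_relax() _mm_pause()
#else
#define cpu_relax() ((void)0)     /* no spin-wait hint on other targets */
#endif

/* Hypothetical helper: spin until *location equals expected.
   Returns the number of times the loop body (the pause hint) ran. */
static unsigned long spin_until_equal(_Atomic uint32_t *location,
                                      uint32_t expected)
{
    unsigned long spins = 0;
    while (atomic_load_explicit(location, memory_order_acquire) != expected) {
        cpu_relax();              /* hint: this is a spin-wait loop */
        ++spins;
    }
    return spins;
}
```

Another thread is expected to store the awaited value, for example with atomic_store_explicit and memory_order_release; if the location already holds that value, the function returns immediately without executing the hint.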
Example 2-3  Eliminating Branch with CMOV Instruction
    test ecx, ecx       ; set ZF if ecx is zero
    jne L1              ; skip the move when ecx is nonzero
    mov eax, ebx
L1:
; To optimize the code, combine jne and mov into one cmovcc
; instruction that checks the zero flag (the "equal" condition)
    test ecx, ecx       ; set the flags
    cmove eax, ebx      ; if ZF is set (equal), move ebx to eax;
                        ; the L1 label is no longer needed
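The transformation in Example 2-3 has a rough source-level counterpart in C. The sketch below is an assumption-laden illustration: the function names select_branch and select_cmov are hypothetical, and the parameters are named after the registers only for readability. Whether a compiler actually emits a conditional move for the second form depends on the compiler, its options, and the target processor, so the generated code should be inspected rather than assumed.

```c
#include <stdint.h>

/* Branching form: mirrors the test / jne / mov sequence. */
static uint32_t select_branch(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
    if (ecx == 0)
        eax = ebx;        /* executed only when ecx is zero */
    return eax;
}

/* Conditional-expression form: a candidate for a conditional move.
   The compiler may, but is not required to, generate cmov here. */
static uint32_t select_cmov(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
    return (ecx == 0) ? ebx : eax;
}
```

Both functions return ebx when ecx is zero and eax otherwise, so they can be swapped freely while comparing the generated assembly.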
IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006