General Optimization Guidelines
See Example 2-2. The optimized code sets ebx to zero, then compares
A and B. If A is greater than or equal to B, ebx is set to one. Then
ebx is decremented and ANDed with the difference of the constant
values (CONST1 - CONST2). This sets ebx to either zero or the
difference of the values. Adding CONST2 back to ebx then produces
the correct value in ebx. When CONST2 is equal to zero, the last
instruction can be deleted.
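The arithmetic described above can be sketched in C. This is a minimal illustration, not code from the manual: the function names and the values chosen for CONST1 and CONST2 are assumptions made for the example, and a compiler is free to emit instructions that differ from those in Example 2-2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values: the manual does not give CONST1 or CONST2. */
enum { CONST1 = 7, CONST2 = 3 };

/* Branching form, following Example 2-1: X = (A < B) ? CONST1 : CONST2 */
uint32_t select_branchy(int32_t a, int32_t b)
{
    if (a >= b)
        return CONST2;
    return CONST1;
}

/* Branch-free form, mirroring the steps of Example 2-2. */
uint32_t select_branchless(int32_t a, int32_t b)
{
    uint32_t x = (uint32_t)(a >= b);          /* xor/cmp/setge: x = 0 or 1   */
    x -= 1u;                                  /* sub: x = 0 or 0xFFFFFFFF    */
    x &= (uint32_t)CONST1 - (uint32_t)CONST2; /* and: x = 0 or CONST1-CONST2 */
    x += (uint32_t)CONST2;                    /* add: x = CONST2 or CONST1   */
    return x;
}

/* Usage sketch: both forms agree on either side of the comparison. */
void check_select(void)
{
    assert(select_branchless(1, 2) == select_branchy(1, 2)); /* A <  B */
    assert(select_branchless(2, 1) == select_branchy(2, 1)); /* A >  B */
    assert(select_branchless(5, 5) == select_branchy(5, 5)); /* A == B */
}
```

The subtraction turns the 0/1 comparison result into an all-zeros or all-ones mask, which is what lets a single AND choose between zero and the constant difference.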
Another way to remove branches on Pentium II and subsequent
processors is to use the cmov and fcmov instructions. Example 2-3
shows how to replace a test and branch instruction sequence with
cmov, eliminating the branch. If the test sets the equal flag, the
value in ebx will be moved to eax. This branch is data-dependent,
and is representative of an unpredictable branch.
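As a rough C-level picture of the conditional move described above, the sketch below returns the value of ebx when the tested value is zero (the case in which test sets the equal flag) and otherwise leaves eax unchanged. The function name and the parameter named tested are hypothetical, since this excerpt does not show which register Example 2-3 tests. Whether a compiler lowers the conditional expression to cmov or to a branch depends on the compiler and its options.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models "test tested, tested" followed by a
 * conditional move of ebx into eax when the equal (zero) flag is set. */
uint32_t move_if_zero(uint32_t tested, uint32_t eax, uint32_t ebx)
{
    return (tested == 0) ? ebx : eax;
}

/* Usage sketch covering both outcomes of the test. */
void check_move_if_zero(void)
{
    assert(move_if_zero(0, 1, 2) == 2); /* flag set: eax takes ebx */
    assert(move_if_zero(5, 1, 2) == 1); /* flag clear: eax is kept */
}
```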
Example 2-1 Assembly Code with an Unpredictable Branch
    cmp A, B           ; condition
    jge L30            ; conditional branch
    mov ebx, CONST1    ; ebx holds X
    jmp L31            ; unconditional branch
L30:
    mov ebx, CONST2
L31:
Example 2-2 Code Optimization to Eliminate Branches
    xor ebx, ebx       ; clear ebx (X in the C code)
    cmp A, B
    setge bl           ; When ebx = 0 or 1
                       ; OR the complement condition
    sub ebx, 1         ; ebx=11...11 or 00...00
    and ebx, CONST3    ; CONST3 = CONST1-CONST2
    add ebx, CONST2    ; ebx=CONST1 or CONST2
Source: IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006