IA-32 Intel® Architecture Optimization
Can be replaced with:
    movsx r8, r9w     ;If bits 63:16 do not need to be preserved.
    movsx r8, r10b    ;If bits 63:8 do not need to be preserved.
In the above example, the moves to r8w and r8b both require a merge to
preserve the rest of the bits in the register. There is an implicit real
dependency on r8 between the 'mov r8w, r9w' and 'mov r8b, r10b'.
Using movsx breaks the real dependency and leaves only the output
dependency, which the processor can eliminate through renaming.
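The difference between the merging moves and movsx can be modeled in C. The sketch below is illustrative only: the function names are hypothetical, and the functions model the architectural result of each instruction, not its timing. The point to notice is that the merging forms take the old destination value as an input, while the movsx forms do not.

```c
#include <stdint.h>

/* Models 'mov r8w, r9w': only bits 15:0 of the destination change, so the
   result depends on the old destination value (a merge). */
static uint64_t mov_word_merge(uint64_t old_dst, uint16_t src)
{
    return (old_dst & ~(uint64_t)0xFFFF) | src;
}

/* Models 'mov r8b, r10b': only bits 7:0 of the destination change. */
static uint64_t mov_byte_merge(uint64_t old_dst, uint8_t src)
{
    return (old_dst & ~(uint64_t)0xFF) | src;
}

/* Models 'movsx r8, r9w': all 64 bits are written from the source alone,
   so there is no old_dst parameter. */
static uint64_t movsx_word(uint16_t src)
{
    return (uint64_t)src | ((src & 0x8000u) ? ~(uint64_t)0xFFFF : 0);
}

/* Models 'movsx r8, r10b': all 64 bits are written from the source alone. */
static uint64_t movsx_byte(uint8_t src)
{
    return (uint64_t)src | ((src & 0x80u) ? ~(uint64_t)0xFF : 0);
}
```

Because mov_word_merge and mov_byte_merge need old_dst, each must wait for the previous writer of the register; movsx_word and movsx_byte have no such input, which is the dependency the text describes as broken.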
Assembly/Compiler Coding Rule
Sign extend to 64 bits instead of sign extending to 32 bits, even when the
destination will be used as a 32-bit value.
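Following this rule does not change the 32-bit value that later code sees, because the low 32 bits are identical under both extensions. The C sketch below checks that property; the function names are hypothetical, and it assumes that a write to a 32-bit destination zeroes the upper 32 bits of the 64-bit register, as it does in 64-bit mode.

```c
#include <stdint.h>

/* Models a 16-bit to 32-bit sign extension in 64-bit mode: the 32-bit
   result is written and the upper 32 bits of the register become zero. */
static uint64_t sext16_to_32(uint16_t src)
{
    uint32_t r = (uint32_t)src | ((src & 0x8000u) ? 0xFFFF0000u : 0u);
    return (uint64_t)r;
}

/* Models a 16-bit to 64-bit sign extension: all 64 bits are written. */
static uint64_t sext16_to_64(uint16_t src)
{
    return (uint64_t)src | ((src & 0x8000u) ? ~(uint64_t)0xFFFF : 0);
}

/* Returns 1 when both extensions agree in bits 31:0. */
static int low32_agree(uint16_t src)
{
    return (uint32_t)sext16_to_32(src) == (uint32_t)sext16_to_64(src);
}
```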
Alternate Coding Rules for 64-Bit Mode
Use 64-Bit Registers Instead of Two 32-Bit Registers for 64-Bit Arithmetic
Legacy 32-bit mode offers the ability to support extended precision
integer arithmetic (such as 64-bit arithmetic). However, 64-bit mode
offers native support for 64-bit arithmetic. When 64-bit integers are
desired, use the 64-bit forms of arithmetic instructions.
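To make the contrast concrete, the C sketch below models a 64-bit addition done the extended-precision way (two 32-bit halves with a carry, as an ADD/ADC pair would do it) next to the native 64-bit form. The type and function names are hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, as 32-bit code would hold it. */
typedef struct { uint32_t lo, hi; } u64_halves;

static u64_halves split_halves(uint64_t v)
{
    u64_halves h = { (uint32_t)v, (uint32_t)(v >> 32) };
    return h;
}

static uint64_t join_halves(u64_halves h)
{
    return ((uint64_t)h.hi << 32) | h.lo;
}

/* Extended-precision add: add the low halves, then add the high halves
   plus the carry out of the low add (the role ADC plays). */
static u64_halves add64_via_halves(u64_halves a, u64_halves b)
{
    u64_halves r;
    r.lo = a.lo + b.lo;
    uint32_t carry = (r.lo < a.lo) ? 1u : 0u;  /* unsigned wraparound means carry */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* Native form: a single 64-bit add. */
static uint64_t add64_native(uint64_t a, uint64_t b)
{
    return a + b;
}
```

Both forms produce the same bits; the native form simply does it in one operation on one register instead of two dependent operations on a register pair.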
In 32-bit legacy mode, getting a 64-bit result from a 32-bit by 32-bit
integer multiply requires three registers; the result is stored in 32-bit
chunks in the EDX:EAX pair. When the instruction is available in 64-bit
mode, using the 32-bit version of the instruction is not the optimal
implementation if a 64-bit result is desired. Use the extended registers.
For example, the following code sequence loads the 32-bit values
sign-extended into the 64-bit registers and performs a multiply:
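    MOVSXD RAX, DWORD PTR[X]   ;Sign-extend the 32-bit value X into RAX.
    MOVSXD RCX, DWORD PTR[Y]   ;Sign-extend the 32-bit value Y into RCX.
    IMUL RAX, RCX              ;64-bit product in RAX.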
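The same computation can be sketched in C. The names below are hypothetical; the first function widens both operands before multiplying, which is the source-level pattern that corresponds to sign-extending into 64-bit registers and issuing one multiply, while the second models the legacy result layout split across EDX:EAX. How a particular compiler translates either function is not guaranteed.

```c
#include <stdint.h>

/* 64-bit form: widen both 32-bit operands, then multiply once. */
static int64_t mul_32x32_wide(int32_t x, int32_t y)
{
    return (int64_t)x * (int64_t)y;
}

/* Models the 32-bit legacy result: the 64-bit product delivered as two
   32-bit chunks, high half in EDX and low half in EAX. */
typedef struct { uint32_t eax, edx; } edx_eax_pair;

static edx_eax_pair mul_32x32_legacy(int32_t x, int32_t y)
{
    uint64_t p = (uint64_t)((int64_t)x * (int64_t)y);
    edx_eax_pair r = { (uint32_t)p, (uint32_t)(p >> 32) };
    return r;
}

/* Reassembles the EDX:EAX pair into one 64-bit bit pattern. */
static uint64_t join_edx_eax(edx_eax_pair r)
{
    return ((uint64_t)r.edx << 32) | r.eax;
}
```

With the wide form the product is usable immediately as a single 64-bit value; with the legacy form a consumer must first recombine the two halves, as join_edx_eax does.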
IA-32 Intel® Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006