
AMD Athlon™ Processor x86 Code Optimization
22007E/0—November 1999
Replace Certain SHLD Instructions with Alternative Code
Certain instances of the SHLD instruction (those with a shift
count of 1, 2, or 3, as in the examples below) can be replaced by
alternative code using SHR and LEA. The alternative code has
lower latency and requires fewer execution resources. SHR and
LEA (32-bit version) are DirectPath instructions, while SHLD is
a VectorPath instruction. Using SHR and LEA preserves decode
bandwidth because it potentially enables the decoding of a third
DirectPath instruction. Note that the alternative code overwrites
REG2, whereas SHLD leaves its source register unchanged.
Example 1
(Avoid):
SHLD REG1, REG2, 1
(Preferred):
SHR REG2, 31
LEA REG1, [REG1*2 + REG2]
Example 2
(Avoid):
SHLD REG1, REG2, 2
(Preferred):
SHR REG2, 30
LEA REG1, [REG1*4 + REG2]
Example 3
(Avoid):
SHLD REG1, REG2, 3
(Preferred):
SHR REG2, 29
LEA REG1, [REG1*8 + REG2]
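The equivalence used in the three examples can be checked with a small model. The following is a minimal Python sketch, not part of the original guide; the function names shld32 and shr_lea32 are hypothetical, and registers are modeled as plain 32-bit unsigned integers. It relies on the fact that after the shift the low bits of REG1 are zero, so the addition performed by LEA gives the same result as the OR performed by SHLD.

```python
MASK32 = 0xFFFFFFFF  # registers are modeled as 32-bit unsigned values


def shld32(dst, src, count):
    """Model SHLD dst, src, count for 32-bit operands (count 1..31)."""
    return ((dst << count) | (src >> (32 - count))) & MASK32


def shr_lea32(dst, src, count):
    """Model the SHR/LEA replacement; returns (new REG1, new REG2)."""
    if count not in (1, 2, 3):
        # LEA scale factors are limited to 2, 4, and 8
        raise ValueError("only shift counts 1, 2, and 3 are supported")
    src = src >> (32 - count)                   # SHR REG2, 32-count
    dst = (dst * (1 << count) + src) & MASK32   # LEA REG1, [REG1*scale + REG2]
    return dst, src


if __name__ == "__main__":
    samples = [0x00000000, 0x00000001, 0x80000001, 0xC0000000, 0xFFFFFFFF]
    for count in (1, 2, 3):
        for reg1 in samples:
            for reg2 in samples:
                new_reg1, _ = shr_lea32(reg1, reg2, count)
                assert new_reg1 == shld32(reg1, reg2, count)
    print("SHR/LEA matches SHLD for counts 1, 2, and 3")
```

The model also makes two differences visible. The replacement returns a changed REG2, and it says nothing about flags: LEA does not write the flags, so code that depends on the flags produced by SHLD cannot use the replacement as is.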
Use 8-Bit Sign-Extended Immediates
Using 8-bit sign-extended immediates improves code density
with no negative effects on the AMD Athlon processor. For
example, ADD BX, -5 should be encoded “83 C3 FB” and not
“81 C3 FB FF” (the 16-bit immediate is stored low byte first).
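The choice between the two encodings can be illustrated with a short model of what an assembler does. This is a minimal Python sketch, not taken from the guide; the function names fits_imm8 and encode_add_bx_imm are hypothetical, and it assumes that 16-bit operand size is already in effect, so no operand-size prefix is emitted.

```python
def fits_imm8(value):
    """True if value can be represented as a sign-extended 8-bit immediate."""
    return -128 <= value <= 127


def encode_add_bx_imm(value, allow_imm8=True):
    """Return the bytes of ADD BX, imm (assumes 16-bit operand size, no prefix)."""
    if not -32768 <= value <= 65535:
        raise ValueError("immediate does not fit in 16 bits")
    modrm_bx = 0xC3  # mod=11, reg=/0 (ADD), r/m=011 (BX)
    if allow_imm8 and fits_imm8(value):
        # Opcode 83 /0 ib: the 8-bit immediate is sign-extended by the processor
        return bytes([0x83, modrm_bx, value & 0xFF])
    # Opcode 81 /0 iw: full 16-bit immediate, stored low byte first
    imm16 = value & 0xFFFF
    return bytes([0x81, modrm_bx, imm16 & 0xFF, imm16 >> 8])


if __name__ == "__main__":
    short_form = encode_add_bx_imm(-5)
    long_form = encode_add_bx_imm(-5, allow_imm8=False)
    print(short_form.hex(" "), "vs", long_form.hex(" "))
    print("bytes saved:", len(long_form) - len(short_form))
```

Both encodings perform the same addition; the short form saves one byte here, and the saving grows to three bytes for 32-bit operands, where the full immediate occupies four bytes.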