Group II Optimizations—Secondary Optimizations
22007E/0—November 1999
AMD Athlon™ Processor x86 Code Optimization
anywhere, in any type of code (integer, x87, 3DNow!, MMX,
etc.). Use the following formula to determine prefetch distance:
Prefetch Length = 200 (DS/C)
■ Round up to the nearest cache line.
■ DS is the data stride per loop iteration.
■ C is the number of cycles per loop iteration when hitting in
the L1 cache.
See “Use the 3DNow!™ PREFETCH and PREFETCHW
Instructions” on page 46 for more details.
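As an illustration of the formula above, the following minimal C sketch computes a prefetch length and rounds it up to a whole cache line. The helper name prefetch_length and its parameters are hypothetical and do not appear in this guide. The sketch assumes that DS is measured in bytes and that the caller supplies the cache line size (64 bytes on the AMD Athlon processor).

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the guide): evaluates
 *     Prefetch Length = 200 (DS/C)
 * and rounds the result up to the nearest cache line.
 *
 * ds_bytes         DS, data stride per loop iteration (assumed in bytes)
 * cycles_per_iter  C, cycles per loop iteration when hitting in the L1 cache
 * line_bytes       cache line size in bytes (assumed to be given by the caller)
 *
 * Returns 0 if cycles_per_iter or line_bytes is 0, since the formula
 * is undefined in that case.
 */
static size_t prefetch_length(size_t ds_bytes, size_t cycles_per_iter,
                              size_t line_bytes)
{
    size_t raw;

    if (cycles_per_iter == 0 || line_bytes == 0)
        return 0;

    /* 200 * DS / C, rounded up to a whole byte */
    raw = (200 * ds_bytes + cycles_per_iter - 1) / cycles_per_iter;

    /* Round up to the next multiple of the cache line size */
    return (raw + line_bytes - 1) / line_bytes * line_bytes;
}
```

For example, a loop that advances 8 bytes per iteration and takes 10 cycles per iteration gives 200 (8/10) = 160, which rounds up to 192 with a 64-byte cache line.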
Select DirectPath Over VectorPath Instructions
Use DirectPath instructions rather than VectorPath
instructions. DirectPath instructions are optimized for decode
and execute efficiently by minimizing the number of operations
per x86 instruction. Three DirectPath instructions can be
decoded in parallel. Using VectorPath instructions will block
DirectPath instructions from decoding simultaneously.
See Appendix G, “DirectPath versus VectorPath Instructions”
on page 219 for a list of DirectPath and VectorPath instructions.
Group II Optimizations—Secondary Optimizations
Load-Execute Instruction Usage
See “Load-Execute Instruction Usage” on page 34 for more
details.
Use Load-Execute Instructions
Wherever possible, use load-execute instructions to increase
code density, with the one exception described below. The
split-instruction form of load-execute instructions can be used
to avoid scheduler stalls for longer executing instructions and
to explicitly schedule the load and execute operations.