
Load-Execute Instruction Usage
35
22007E/0—November 1999
AMD Athlon™ Processor x86 Code Optimization
Use Load-Execute Floating-Point Instructions with Floating-Point
Operands
When operating on single-precision or double-precision
floating-point data, wherever possible use floating-point
load-execute instructions to increase code density.
Note: This optimization applies only to floating-point instructions
with floating-point operands and not with integer operands,
as described in the next optimization.
This coding style helps in two ways. First, denser code allows
more work to be held in the instruction cache. Second, the
denser code generates fewer internal OPs and, therefore, the
FPU scheduler holds more work, which increases the chances of
extracting parallelism from the code.
Example 1 (Avoid):
FLD
QWORD PTR [TEST1]
FLD
QWORD PTR [TEST2]
FMUL
ST, ST(1)
Example 2 (Preferred):
FLD
QWORD PTR [TEST1]
FMUL
QWORD PTR [TEST2]
Avoid Load-Execute Floating-Point Instructions with Integer Operands
Do not use load-execute floating-point instructions with integer
operands: FIADD, FISUB, FISUBR, FIMUL, FIDIV, FIDIVR,
F I C O M , a n d F I C O M P. R e m e m b e r t h a t f l o a t i n g -p o i n t
i n s tr u c ti o n s c a n h ave i n t e g e r o p e ra n d s w h i l e in t e g e r
instruction cannot have floating-point operands.
Floating-point computations involving integer-memory
operands should use separate FILD and arithmetic instructions.
This optimization has the potential to increase decode
bandwidth and OP density in the FPU scheduler. The floating-
point load-execute instructions with integer operands are
VectorPath and generate two OPs in a cycle, while the discrete
equivalent enables a third DirectPath instruction to be decoded
in the same cycle. In some situations this optimizations can also
reduce execution time if the FILD can be scheduled several
instructions ahead of the arithmetic instruction in order to
cover the FILD latency.
✩
TOP
✩
TOP
Содержание Athlon Processor x86
Страница 1: ...AMD Athlon Processor x86 Code Optimization Guide TM...
Страница 12: ...xii List of Figures AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...
Страница 16: ...xvi Revision History AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...
Страница 60: ...44 Code Padding Using Neutral Code Fillers AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...
Страница 92: ...76 Push Memory Data Carefully AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...
Страница 122: ...106 Take Advantage of the FSINCOS Instruction AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...
Страница 156: ...140 AMD Athlon Processor Microarchitecture AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...
Страница 176: ...160 Write Combining Operations AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...
Страница 202: ...186 Page Attribute Table PAT AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...
Страница 252: ...236 VectorPath Instructions AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...
Страница 256: ...240 Index AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...