AMD Athlon™ Processor x86 Code Optimization
22007E/0—November 1999
Minimize Floating-Point-to-Integer Conversions
C++, C, and Fortran define floating-point-to-integer conversions
as truncating. This creates a problem because the active
rounding mode in an application is typically round-to-nearest-even,
and FISTP converts using whatever rounding mode is active. The
classical way to do a double-to-int conversion therefore switches
the FPU control word to truncation for the duration of the store:
Example 1 (Fast):
FLD   QWORD PTR [X]            ;load double to be converted
FSTCW [SAVE_CW]                ;save current FPU control word
MOVZX EAX, WORD PTR [SAVE_CW]  ;retrieve control word
OR    EAX, 0C00h               ;rounding control field = truncate
MOV   WORD PTR [NEW_CW], AX    ;new FPU control word
FLDCW [NEW_CW]                 ;load new FPU control word
FISTP DWORD PTR [I]            ;do double->int conversion
FLDCW [SAVE_CW]                ;restore original control word
The AMD Athlon processor contains special acceleration
hardware to execute such code as quickly as possible. In most
situations, the above code is therefore the fastest way to
perform floating-point-to-integer conversion and the conversion
is compliant both with programming language standards and
the IEEE-754 standard.
According to the recommendations for inlining (see “Always
Inline Functions with Fewer than 25 Machine Instructions” on
page 72), the above code should not be put into a separate
subroutine (e.g., ftol). It should rather be inlined into the main
code.
In some code, floating-point numbers are converted to an
integer and the result is immediately converted back to
floating-point. In such cases, the FRNDINT instruction should
be used for maximum performance instead of FISTP in the code
above. FRNDINT delivers the integral result directly to an FPU
register in floating-point form, which is faster than first using
FISTP to store the integer result and then converting it back to
floating-point with FILD.
If there are multiple, consecutive floating-point-to-integer
conversions, the cost of FLDCW operations should be
minimized by saving the current FPU control word, forcing the