
22007E/0—November 1999
AMD Athlon™ Processor x86 Code Optimization
Fast Conversion of Signed Words to Floating-Point
Many applications need to convert packed 16-bit signed integers
into floating-point numbers quickly. The following two examples
show how this can be accomplished efficiently on AMD processors.
The first example shows how to do the conversion on a processor
that supports AMD's 3DNow! extensions, such as the
AMD Athlon processor. It demonstrates the increased
efficiency from using the PI2FW instruction. Use this
instruction only in code specific to the AMD Athlon processor.
See the AMD Extensions to the 3DNow!™ and MMX™
Instruction Set Manual, order #22466, for more information on
this instruction.
The second example demonstrates how to accomplish the same
task in blended code that achieves good performance on the
AMD Athlon processor as well as on the AMD-K6 family
processors that support 3DNow! technology.
Example 1 (AMD Athlon specific code using 3DNow! DSP extension):
MOVD MM0, [packed_sword] ;0 0 | b a
PUNPCKLWD MM0, MM0 ;b b | a a
PI2FW MM0, MM0 ;xb=float(b) | xa=float(a)
MOVQ [packed_float], MM0 ;store xb | xa
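The data flow of Example 1 can be followed in a portable scalar model. The C sketch below is an illustration only, not AMD code: the type mmx_t and the function names are hypothetical, and the PI2FW model assumes the instruction converts the sign-extended low word of each doubleword, which is how the listing's comments read.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit MMX register as two 32-bit lanes. */
typedef struct { uint32_t lo, hi; } mmx_t;

/* Models PUNPCKLWD MM0, MM0 applied to the low dword "b a":
   each word is interleaved with itself, giving "b b | a a". */
mmx_t model_punpcklwd_self(uint32_t low_dword) {
    uint32_t a = low_dword & 0xFFFFu;
    uint32_t b = low_dword >> 16;
    mmx_t r = { a | (a << 16), b | (b << 16) };
    return r;
}

/* Models one lane of PI2FW (assumed semantics): sign-extend the
   low 16 bits of the dword, then convert to single precision. */
float model_pi2fw_lane(uint32_t dword) {
    uint32_t w = dword & 0xFFFFu;
    int32_t s = (w & 0x8000u) ? (int32_t)w - 65536 : (int32_t)w;
    return (float)s;
}

/* Full Example 1 sequence for the word pair (a, b). */
void convert_words_pi2fw(int16_t a, int16_t b, float out[2]) {
    uint32_t packed = (uint32_t)(uint16_t)a | ((uint32_t)(uint16_t)b << 16);
    mmx_t m = model_punpcklwd_self(packed);   /* b b | a a */
    out[0] = model_pi2fw_lane(m.lo);          /* xa = float(a) */
    out[1] = model_pi2fw_lane(m.hi);          /* xb = float(b) */
}

/* Small self-check over the extremes of the 16-bit range. */
void check_pi2fw_model(void) {
    float out[2];
    convert_words_pi2fw(-32768, 32767, out);
    assert(out[0] == -32768.0f);
    assert(out[1] == 32767.0f);
}
```

Because PI2FW sign-extends the word itself, the model needs no separate shift step; the upper word of each lane is simply ignored.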
Example 2 (AMD-K6 Family and AMD Athlon processor blended code):
MOVD MM1, [packed_sword] ;0 0 | b a
PXOR MM0, MM0 ;0 0 | 0 0
PUNPCKLWD MM0, MM1 ;b 0 | a 0
PSRAD MM0, 16 ;sign extend: b | a
PI2FD MM0, MM0 ;xb=float(b) | xa=float(a)
MOVQ [packed_float], MM0 ;store xb | xa
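The sign-extension trick in Example 2 (place each word in the upper half of a doubleword, then shift right arithmetically by 16) can be checked with a scalar model. The C sketch below is illustrative only; the function names are hypothetical, and the arithmetic shift is written out explicitly so the result does not depend on how a given compiler shifts negative values.

```c
#include <assert.h>
#include <stdint.h>

/* Models PUNPCKLWD MM0, MM1 with MM0 = 0: the source word lands in
   the upper 16 bits of the dword and the lower 16 bits are zero. */
uint32_t word_to_high_half(int16_t w) {
    return (uint32_t)(uint16_t)w << 16;
}

/* Models PSRAD by 16: shift right and replicate the sign bit.
   The top half is 0..65535; values with bit 15 set are negative. */
int32_t sra16(uint32_t v) {
    uint32_t top = v >> 16;
    return (top & 0x8000u) ? (int32_t)top - 65536 : (int32_t)top;
}

/* Full Example 2 path for one word; the final cast stands in for
   the 32-bit integer to float conversion done by PI2FD. */
float sword_to_float_blended(int16_t w) {
    return (float)sra16(word_to_high_half(w));
}

/* Small self-check over the extremes of the 16-bit range. */
void check_blended_model(void) {
    assert(sword_to_float_blended(-32768) == -32768.0f);
    assert(sword_to_float_blended(32767) == 32767.0f);
}
```

Every 16-bit integer is exactly representable in single precision, so the model's result equals a direct conversion of the original word.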
Use MMX™ PXOR to Negate 3DNow!™ Data
For both the AMD Athlon and AMD-K6 processors, code should use
the MMX PXOR instruction rather than the 3DNow! PFMUL
instruction to change the sign bit of 3DNow! operands.
On the AMD Athlon processor, using
PXOR allows for more parallelism, as it can execute in either
the FADD or FMUL pipe. PXOR has an execution latency of
two cycles, but because it is an MMX instruction, there is an initial one