IA-32 Intel® Architecture Optimization
The PAVGB instruction operates on packed unsigned bytes and the PAVGW instruction operates on packed unsigned words.
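As a rough illustration of what one lane of these instructions computes, the following C sketch models the rounded unsigned average (a + b + 1) >> 1 for a byte lane and a word lane. The function names avg_u8, avg_u16, and pavgb_model are hypothetical helpers introduced for this sketch. They are not part of any Intel API, and the code is a scalar model rather than a use of the instructions themselves.

```c
#include <stdint.h>

/* Scalar model of one PAVGB lane: rounded average of two unsigned bytes.
   The sum is formed in a wider type so the carry out of bit 7 is kept. */
static inline uint8_t avg_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(((uint32_t)a + (uint32_t)b + 1u) >> 1);
}

/* Scalar model of one PAVGW lane: same operation on unsigned words. */
static inline uint16_t avg_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a + (uint32_t)b + 1u) >> 1);
}

/* Model of PAVGB on a 64-bit MMX register: eight independent byte lanes. */
static inline void pavgb_model(const uint8_t a[8], const uint8_t b[8],
                               uint8_t out[8])
{
    for (int i = 0; i < 8; ++i) {
        out[i] = avg_u8(a[i], b[i]);
    }
}
```

Because the intermediate sum is one bit wider than the lane, averaging two maximum values does not overflow, and the +1 rounds halves upward.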
Complex Multiply by a Constant
Complex multiplication requires four multiplications and two additions. This is exactly what the pmaddwd instruction performs: it multiplies four pairs of signed 16-bit words and adds adjacent products to form two 32-bit results. To use this instruction, you need to format the data as multiple 16-bit values. The real and imaginary components should be 16 bits each. Consider Example 4-23, which assumes that the 64-bit MMX registers are being used:
• Let the input data be Dr and Di, where Dr is the real component of the data and Di is the imaginary component of the data.
• Format the constant complex coefficients in memory as four 16-bit values [Cr -Ci Ci Cr]. Remember to load the values into the MMX register using a movq instruction.
• The real component of the complex product is Pr = Dr*Cr - Di*Ci and the imaginary component of the complex product is Pi = Dr*Ci + Di*Cr.
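The two product formulas follow from expanding the complex product, and each can be read as a two-element dot product, which is why the coefficient layout [Cr -Ci Ci Cr] is used. The derivation below is a sketch; the numeric values (3 + 4i and 5 + 6i) are chosen here only for illustration and do not come from the manual.

```latex
\begin{aligned}
(D_r + i D_i)(C_r + i C_i) &= (D_r C_r - D_i C_i) + i\,(D_r C_i + D_i C_r) \\
P_r &= [\,D_r \;\; D_i\,] \cdot [\,C_r \;\; {-C_i}\,] = D_r C_r - D_i C_i \\
P_i &= [\,D_r \;\; D_i\,] \cdot [\,C_i \;\; C_r\,] = D_r C_i + D_i C_r \\[4pt]
\text{Illustration: } (3 + 4i)(5 + 6i) &:\quad
P_r = 3 \cdot 5 - 4 \cdot 6 = -9, \qquad
P_i = 3 \cdot 6 + 4 \cdot 5 = 38
\end{aligned}
```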
Example 4-23 Complex Multiply by a Constant
; Input:
;   MM0   complex value, Dr, Di
;   MM1   constant complex coefficient in the form
;         [Cr -Ci Ci Cr]
; Output:
;   MM0   two 32-bit dwords containing [Pr Pi]
;
punpckldq MM0, MM0    ; makes [Dr Di Dr Di]
pmaddwd   MM0, MM1    ; done, the result is
                      ; [(Dr*Cr-Di*Ci)(Dr*Ci+Di*Cr)]
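To check the data flow of the two-instruction sequence without MMX hardware, the following portable C sketch models the registers as small arrays. The types mmx_words and mmx_dwords and the functions punpckldq_self, pmaddwd_model, and complex_mul_const are hypothetical names introduced for this sketch. Array index 0 corresponds to the leftmost element in the bracket notation of the example; how that position maps to the low or high end of the physical register is an assumption left open here.

```c
#include <stdint.h>

/* A 64-bit MMX register viewed as four signed words or two signed dwords.
   Index 0 is the leftmost element of the manual's [a b c d] notation
   (an assumption of this sketch, not a statement about bit positions). */
typedef struct { int16_t w[4]; } mmx_words;
typedef struct { int32_t d[2]; } mmx_dwords;

/* Model of "punpckldq MM0, MM0": the dword holding (Dr, Di) is
   duplicated, giving [Dr Di Dr Di]. */
static inline mmx_words punpckldq_self(int16_t dr, int16_t di)
{
    mmx_words r = { { dr, di, dr, di } };
    return r;
}

/* Model of pmaddwd: multiply corresponding signed words and add each
   adjacent pair of 32-bit products. The single wraparound case (all four
   words of a pair equal to -32768) is not modelled here. */
static inline mmx_dwords pmaddwd_model(mmx_words a, mmx_words b)
{
    mmx_dwords r;
    r.d[0] = (int32_t)a.w[0] * b.w[0] + (int32_t)a.w[1] * b.w[1];
    r.d[1] = (int32_t)a.w[2] * b.w[2] + (int32_t)a.w[3] * b.w[3];
    return r;
}

/* Complex multiply by the constant Cr + i*Ci, following Example 4-23.
   Requires ci != INT16_MIN so that -ci fits in 16 bits. */
static inline void complex_mul_const(int16_t dr, int16_t di,
                                     int16_t cr, int16_t ci,
                                     int32_t *pr, int32_t *pi)
{
    mmx_words coeff = { { cr, (int16_t)-ci, ci, cr } }; /* [Cr -Ci Ci Cr] */
    mmx_words data  = punpckldq_self(dr, di);           /* [Dr Di Dr Di] */
    mmx_dwords prod = pmaddwd_model(data, coeff);       /* [Pr Pi]       */
    *pr = prod.d[0];
    *pi = prod.d[1];
}
```

For instance, calling complex_mul_const(3, 4, 5, 6, &pr, &pi) leaves pr = -9 and pi = 38, matching (3 + 4i)(5 + 6i) = -9 + 38i. Note that the results are full 32-bit values; any scaling back to 16 bits is outside what this sketch covers.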
IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966 013US, April 2006