Packed-Data Processing on the ’C64x
’C64x Programming Considerations
The solution is to reorder the halfwords of one of the inputs, so that the
imaginary component is in the upper halfword and the real component is in the
lower halfword. This is accomplished by using the _packlh2 intrinsic to swap
the halves of the word. Once the halfwords are reordered on one of the inputs,
the _dotp2 intrinsic provides the appropriate combination of multiplies with an
add to produce the imaginary component of the output.
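Figure 8–20. _packlh2 and _dotp2 Working Together
[Figure: each 32-bit input word, a and b, holds its real component in the
upper halfword and its imaginary component in the lower halfword.
a' = _packlh2(a, a) swaps the halves of a, so that a' holds the imaginary
component in the upper halfword and the real component in the lower halfword.
c = _dotp2(b, a') multiplies the corresponding halfwords and adds the two
products: c = a_imaginary * b_real + a_real * b_imaginary.]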
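The data flow in Figure 8–20 can be checked on a host machine with a small
portable C sketch. The emu_ functions below are hypothetical stand-ins written
for this illustration, not the compiler intrinsics themselves; they assume
that _packlh2 places the lower halfword of its first operand above the upper
halfword of its second, and that _dotp2 sums the two signed 16 x 16 products.
The packing helper emu_cpack and the real-in-upper-halfword layout are also
assumptions taken from the figure.

```c
#include <stdint.h>

/* Hypothetical host-side model of _packlh2(x, y):
   lower halfword of x -> upper half, upper halfword of y -> lower half. */
static uint32_t emu_packlh2(uint32_t x, uint32_t y)
{
    return (x << 16) | (y >> 16);
}

/* Hypothetical host-side model of _dotp2(x, y):
   signed products of matching halfwords, added together. */
static int32_t emu_dotp2(uint32_t x, uint32_t y)
{
    int32_t hi = (int32_t)(int16_t)(x >> 16) * (int16_t)(y >> 16);
    int32_t lo = (int32_t)(int16_t)(x & 0xFFFFu) * (int16_t)(y & 0xFFFFu);
    /* Add as unsigned so the one overflowing case wraps instead of
       invoking undefined behavior. */
    return (int32_t)((uint32_t)hi + (uint32_t)lo);
}

/* Assumed layout: real in the upper halfword, imaginary in the lower. */
static uint32_t emu_cpack(int16_t re, int16_t im)
{
    return ((uint32_t)(uint16_t)re << 16) | (uint16_t)im;
}

/* Imaginary part of a * b: swap the halves of a, then dot with b. */
static int32_t emu_cmpy_imag(uint32_t a, uint32_t b)
{
    uint32_t a_swapped = emu_packlh2(a, a);   /* a' in Figure 8-20 */
    return emu_dotp2(b, a_swapped);           /* b_re*a_im + b_im*a_re */
}
```

For a = 3 - 4j and b = 5 + 6j, emu_cmpy_imag(emu_cpack(3, -4), emu_cpack(5, 6))
evaluates to 3 * 6 + (-4) * 5 = -2.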
Once both the real and imaginary components of the result are calculated, it
is necessary to convert the 32-bit results to 16-bit results and store them. In
the original code, the 32-bit results were shifted right by 16 to convert them to
16-bit results. These results were then packed together with _pack2 for
storing. Our final optimization replaces this shift and pack with a single
_packh2, which packs the upper halfwords of its two inputs directly.
Example 8–13 shows the result of these optimizations.
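The equivalence behind this last step can be sketched in portable C. The
emu_ functions are hypothetical host-side models, not the intrinsics; they
assume that _pack2 combines the lower halfwords of its operands and _packh2
combines the upper halfwords. Placing the real result in the upper halfword
of the stored word is also an assumption made for the illustration.

```c
#include <stdint.h>

/* Hypothetical model of _pack2(x, y): lower halfword of x -> upper half,
   lower halfword of y -> lower half. */
static uint32_t emu_pack2(uint32_t x, uint32_t y)
{
    return (x << 16) | (y & 0xFFFFu);
}

/* Hypothetical model of _packh2(x, y): upper halfword of x -> upper half,
   upper halfword of y -> lower half. */
static uint32_t emu_packh2(uint32_t x, uint32_t y)
{
    return (x & 0xFFFF0000u) | (y >> 16);
}

/* Original approach: shift each 32-bit result right by 16, then pack.
   A logical shift is used here; only the low 16 bits of each shifted
   value survive the pack, so the sign fill does not matter. */
static uint32_t emu_store_shift_pack(int32_t re32, int32_t im32)
{
    uint32_t re16 = (uint32_t)re32 >> 16;
    uint32_t im16 = (uint32_t)im32 >> 16;
    return emu_pack2(re16, im16);
}

/* Optimized approach: one pack of the two upper halfwords. */
static uint32_t emu_store_packh2(int32_t re32, int32_t im32)
{
    return emu_packh2((uint32_t)re32, (uint32_t)im32);
}
```

Both functions return the same packed word for any pair of inputs, which is
why the two shifts become unnecessary.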