Packed-Data Processing on the ’C64x
’C64x Programming Considerations
Figure 8–7. Graphical Representation of (_shlmb, _shrmb, and _swap4)
[Figure: byte-level diagrams for 32-bit words a = a_3 a_2 a_1 a_0 and
b = b_3 b_2 b_1 b_0, most significant byte first]
c = _shlmb(b, a):   c = a_2 a_1 a_0 b_3
c = _shrmb(b, a):   c = b_0 a_3 a_2 a_1
b = _swap4(a):      b = a_2 a_3 a_0 a_1
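The byte movements in Figure 8–7 can be modeled in portable C for checking
on a host machine. This is only a sketch: the emu_ function names are
hypothetical and are not part of any TI library, and the argument order is
assumed to match the intrinsic calls shown in the figure. On the ’C64x
itself, the compiler intrinsics would be used instead.

```c
#include <stdint.h>

/* Hypothetical host-side models of three 'C64x intrinsics.
   Byte 3 is the most significant byte of a 32-bit word. */

/* Models c = _shlmb(b, a): shift a left by one byte and merge
   the most significant byte of b into the low byte position. */
static uint32_t emu_shlmb(uint32_t b, uint32_t a)
{
    return (a << 8) | (b >> 24);
}

/* Models c = _shrmb(b, a): shift a right by one byte and merge
   the least significant byte of b into the high byte position. */
static uint32_t emu_shrmb(uint32_t b, uint32_t a)
{
    return (a >> 8) | (b << 24);
}

/* Models b = _swap4(a): swap the two bytes within each 16-bit half. */
static uint32_t emu_swap4(uint32_t a)
{
    return ((a & 0x00FF00FFu) << 8) | ((a >> 8) & 0x00FF00FFu);
}
```

With a = 0xA3A2A1A0 and b = 0xB3B2B1B0, these models return 0xA2A1A0B3,
0xB0A3A2A1, and 0xA2A3A0A1 respectively, matching the byte orderings in
the figure.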
8.2.5 Optimizing for Packed Data Processing
The ’C64x supports two basic forms of packed-data optimization:
vectorization and macro operations.
Vectorization works by applying the same simple operation to several
data elements simultaneously. Kernels such as vector sum and vector
multiply, shown in Example 8–1 and Example 8–2, exemplify this sort of
computation.
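As a rough host-side illustration of the idea, separate from the numbered
examples referenced above, the sketch below models a packed 16-bit add, the
kind of operation the _add2 intrinsic performs, and uses it in a vector sum
that handles two 16-bit elements per 32-bit word. The emu_ names are
hypothetical, and the code assumes the inputs are already packed two
elements per word.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a packed 16-bit add: the upper and lower
   halves are added independently, with no carry between them. */
static uint32_t emu_add2(uint32_t x, uint32_t y)
{
    uint32_t lo = (x + y) & 0x0000FFFFu;
    uint32_t hi = ((x & 0xFFFF0000u) + (y & 0xFFFF0000u)) & 0xFFFF0000u;
    return hi | lo;
}

/* Vector sum over n_words packed words; each loop iteration
   produces two 16-bit results instead of one. */
static void emu_vec_sum(const uint32_t *a, const uint32_t *b,
                        uint32_t *c, size_t n_words)
{
    for (size_t i = 0; i < n_words; i++) {
        c[i] = emu_add2(a[i], b[i]);
    }
}
```

The loop runs half as many iterations as an element-by-element version
would, which is the basic payoff of vectorizing a kernel of this kind.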