Packed-Data Processing on the ’C64x
’C64x Programming Considerations
8.2.8 Non-Aligned Memory Accesses
In addition to traditional aligned memory access methods, the ’C64x also
provides intrinsics for non-aligned memory accesses. Aligned memory accesses
are restricted to an alignment boundary that is determined by the amount of
data being accessed. For instance, a 64-bit load must read the data from a
location at a 64-bit boundary. Non-aligned access intrinsics relax this
restriction, and can access data at any byte boundary.
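The alignment rule can be illustrated with a minimal, host-portable sketch. It does not use the TI compiler's non-aligned intrinsics (such as _mem4 and _memd8), so it compiles anywhere. Instead, memcpy stands in for a 32-bit access at an arbitrary byte address. The helper names load_u32_any, store_u32_any, and is_aligned are hypothetical and chosen only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a non-aligned 32-bit load:
 * memcpy may read from any byte address. */
static uint32_t load_u32_any(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical stand-in for a non-aligned 32-bit store. */
static void store_u32_any(uint8_t *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Nonzero when p sits on a boundary equal to the access width in
 * bytes, which is what an aligned access of that width requires. */
static int is_aligned(const void *p, size_t width)
{
    return ((uintptr_t)p % width) == 0;
}
```

An address that fails is_aligned(p, 4) cannot be used with an aligned word access, but it can still be read or written through the non-aligned stand-ins.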
There are a number of tradeoffs between aligned and non-aligned access
methods. Table 8–6 lists the differences between the two methods.
Table 8–6. Comparison Between Aligned and Non-Aligned Memory Accesses

Aligned                                  Non-Aligned
---------------------------------------  ---------------------------------------
Data must be aligned on a boundary       Data may be aligned on any byte
equal to its width.                      boundary.

Can read or write bytes, half-words,     Can only read or write words and
words, and double-words.                 double-words.

Up to two accesses may be issued per     Only one non-aligned access may be
cycle, for a peak bandwidth of           issued per cycle, for a peak bandwidth
128 bits/cycle.                          of 64 bits/cycle.

Bank conflicts may occur.                No bank conflict possible, because no
                                         other memory access may occur in
                                         parallel.
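The peak-bandwidth figures in the table follow from multiplying the widest access (a 64-bit double-word) by the number of accesses issued per cycle. The sketch below turns this into a lower bound on the cycles needed to move a block of data. The function name min_cycles and the constant ACCESS_BITS are hypothetical. The bound assumes every access is a full double-word and ignores bank conflicts and all other loop overhead.

```c
#include <stdint.h>

/* Widest single access listed in the table: a 64-bit double-word. */
#define ACCESS_BITS 64u

/* Lower bound on the cycles needed to move total_bits of data when
 * accesses_per_cycle double-word accesses can issue each cycle:
 * 2 for aligned accesses, 1 for non-aligned accesses. */
static uint32_t min_cycles(uint32_t total_bits, uint32_t accesses_per_cycle)
{
    uint32_t bits_per_cycle = ACCESS_BITS * accesses_per_cycle;
    return (total_bits + bits_per_cycle - 1u) / bits_per_cycle; /* round up */
}
```

For a 1024-byte (8192-bit) buffer, min_cycles(8192, 2) gives 64 cycles with aligned accesses and min_cycles(8192, 1) gives 128 cycles with non-aligned accesses.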
Because the ’C64x can issue only one non-aligned memory access per cycle,
programs should use aligned memory accesses whenever possible. However,
certain classes of algorithms are difficult or impossible to fit into this
mold when applying packed-data optimizations. Convolution-style algorithms
such as filters fall into this category, particularly when the outer loop
cannot be unrolled to process multiple outputs at one time.
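The following host-portable sketch shows why a filter resists the aligned-only mold. It processes 16-bit samples two at a time, and the input window slides by one sample per output. Every odd-numbered output therefore starts its pair fetches at an address that is half-word aligned but not word aligned. The names pair16, load_pair_any, dot2, and fir_pairs are hypothetical. memcpy stands in for a non-aligned word load and a plain multiply-add stands in for a packed dot product. The sketch assumes the number of taps nh is even and that x holds at least ny + nh - 1 samples.

```c
#include <stdint.h>
#include <string.h>

/* Two adjacent 16-bit samples, fetched together as one 32-bit unit. */
typedef struct { int16_t s[2]; } pair16;

/* Stand-in for a non-aligned word load: valid at any sample address. */
static pair16 load_pair_any(const int16_t *p)
{
    pair16 v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Stand-in for a packed 16-bit dot product of two sample pairs. */
static int32_t dot2(pair16 a, pair16 b)
{
    return (int32_t)a.s[0] * b.s[0] + (int32_t)a.s[1] * b.s[1];
}

/* y[i] = sum over k of h[k] * x[i + k], for 0 <= k < nh (nh even). */
static void fir_pairs(const int16_t *x, const int16_t *h,
                      int32_t *y, int ny, int nh)
{
    for (int i = 0; i < ny; i++) {
        int32_t acc = 0;
        for (int k = 0; k < nh; k += 2) {
            /* If x is word aligned, &x[i + k] is word aligned only
             * when i + k is even, so every odd i needs a non-aligned
             * fetch. &h[k] is always word aligned if h is. */
            acc += dot2(load_pair_any(&x[i + k]), load_pair_any(&h[k]));
        }
        y[i] = acc;
    }
}
```

With x = {1, 2, 3, 4, 5, 6} and h = {1, 2, 3, 4}, calling fir_pairs(x, h, y, 3, 4) yields y = {30, 40, 50}. The second output is the one whose input pairs start at odd sample indices.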