Packed-Data Processing on the ’C64x
8.2.8.1 Using Non-Aligned Memory Access Intrinsics
Non-aligned memory accesses are generated using the _mem4() and
_memd8() intrinsics, which access four and eight bytes respectively. Each
intrinsic yields a non-aligned reference that can be read from or written to,
much like an array reference. Example 8–14 below illustrates reading and
writing via these intrinsics.
Example 8–14. Non-Aligned Memory Access With _mem4 and _memd8
char a[1000]; /* Sample array */
double d;
/* Store four bytes at a[9] through a[12] */
_mem4((void*) &a[9]) = 0x12345678;
/* Load eight bytes from a[115] through a[122] */
d = _memd8((void*) &a[115]);
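For checking this kind of code on a host machine where the 'C64x intrinsics are unavailable, the same byte-level behavior can be approximated in portable C. The following is a minimal sketch, not a TI API: the helper names mem4_load, mem4_store, mem8_load and mem8_store are hypothetical, and a uint64_t stands in for the double that _memd8 uses as its 64-bit container. It relies only on memcpy, which has no alignment requirement.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side stand-ins for _mem4 and _memd8 (not TI APIs). */

/* Read four bytes starting at p; p may have any alignment. */
static uint32_t mem4_load(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write four bytes starting at p; p may have any alignment. */
static void mem4_store(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Read eight bytes starting at p; uint64_t is an assumed container type. */
static uint64_t mem8_load(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write eight bytes starting at p; p may have any alignment. */
static void mem8_store(void *p, uint64_t v)
{
    memcpy(p, &v, sizeof v);
}
```

With these helpers, mem4_store(&a[9], 0x12345678) touches only a[9] through a[12], and mem8_load(&a[115]) reads a[115] through a[122]. The order of the bytes within the stored word follows the host's endianness, so byte-by-byte comparisons against a target run are only meaningful when both sides use the same endian mode.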
Existing code is straightforward to convert to non-aligned accesses.
Example 8–15 below shows the Vector Sum from Example 8–6 rewritten to use
non-aligned memory accesses. Each _memd8 reference appears twice in the
loop body, but, as with ordinary array references, the compiler will
optimize away the redundant references.
Example 8–15. Vector Sum Modified to Use Non-Aligned Memory Accesses
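void vec_sum(const short *restrict a, const short *restrict b,
             short *restrict c, int len)
{
    int i;
    unsigned a3_a2, a1_a0;   /* Packed pairs of 16-bit inputs from a[] */
    unsigned b3_b2, b1_b0;   /* Packed pairs of 16-bit inputs from b[] */
    unsigned c3_c2, c1_c0;   /* Packed pairs of 16-bit sums            */

    /* Four elements per iteration; len is taken to be a multiple of 4. */
    for (i = 0; i < len; i += 4)
    {
        /* Non-aligned 8-byte loads, split into upper and lower words */
        a3_a2 = _hi(_memd8((void*) &a[i]));
        a1_a0 = _lo(_memd8((void*) &a[i]));
        b3_b2 = _hi(_memd8((void*) &b[i]));
        b1_b0 = _lo(_memd8((void*) &b[i]));

        /* Two 16-bit additions per _add2 */
        c3_c2 = _add2(b3_b2, a3_a2);
        c1_c0 = _add2(b1_b0, a1_a0);

        /* Non-aligned 8-byte store of the four sums */
        _memd8((void*) &c[i]) = _itod(c3_c2, c1_c0);
    }
}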
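To sanity-check results from a loop like the one above on a host machine, a portable model can be useful. The sketch below is an illustration under stated assumptions, not the guide's code: add2_model and vec_sum_model are hypothetical names, short is assumed to be 16 bits wide, and memcpy replaces the non-aligned loads and stores. The scalar tail loop is an addition for lengths that are not a multiple of four.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of _add2: two independent 16-bit additions, with no
   carry from the lower halfword into the upper halfword. */
static uint32_t add2_model(uint32_t x, uint32_t y)
{
    uint32_t lo = (x + y) & 0xFFFFu;
    uint32_t hi = (((x >> 16) + (y >> 16)) & 0xFFFFu) << 16;
    return hi | lo;
}

/* Host-side model of the vector sum. Assumes sizeof(short) == 2. */
static void vec_sum_model(const short *a, const short *b, short *c, int len)
{
    int i;

    /* Packed path: four 16-bit elements (eight bytes) per iteration. */
    for (i = 0; i + 4 <= len; i += 4)
    {
        uint32_t aw[2], bw[2], cw[2];

        memcpy(aw, &a[i], sizeof aw);   /* stands in for a non-aligned load  */
        memcpy(bw, &b[i], sizeof bw);
        cw[0] = add2_model(aw[0], bw[0]);
        cw[1] = add2_model(aw[1], bw[1]);
        memcpy(&c[i], cw, sizeof cw);   /* stands in for a non-aligned store */
    }

    /* Scalar tail for any remaining elements (not in the original loop). */
    for (; i < len; i++)
        c[i] = (short)(a[i] + b[i]);
}
```

Because each 16-bit lane is added independently, the packed path gives the same element-wise sums regardless of host endianness. Calling vec_sum_model(a + 1, b + 1, c + 1, n) exercises pointers that are halfword aligned but not doubleword aligned, which is the case the non-aligned intrinsics are meant to handle.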