Refining C/C++ Code
Optimizing C/C++ Code
Example 3–9. Vector Sum With Non-aligned Word Accesses to Memory
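void vecsum4a(short *restrict sum, const short *restrict in1,
              const short *restrict in2, unsigned int N)
{
    int i;

    /* Tell the compiler the loop executes at least 10 times. */
    #pragma MUST_ITERATE (10)
    for (i = 0; i < N; i += 2)
        /* Each pass loads two shorts from each input as one non-aligned
           32-bit word, adds the 16-bit halves, and stores both results. */
        _mem4((void *)&sum[i]) = _add2(_mem4((void *)&in1[i]),
                                       _mem4((void *)&in2[i]));
}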
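Another consideration is that each pass of the loop now processes two short
elements, so the element count N must be even. If N is odd, the final pass
reads and writes one element beyond the N elements of each array. You can
avoid this by padding the short arrays so that the loop always operates on an
even number of elements.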
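
The padding requirement can be illustrated with a minimal portable C sketch that runs on a host machine. It is not the compiler's implementation: the names pad_to_even, add2_portable, and vecsum_pairs are hypothetical helpers introduced only for this illustration, memcpy stands in for the non-aligned 32-bit accesses of _mem4, and add2_portable stands in for _add2. The sketch assumes that short is 16 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round an element count up to the next even value. */
static unsigned int pad_to_even(unsigned int n)
{
    return (n + 1u) & ~1u;
}

/* Portable stand-in for _add2: the two 16-bit halves are added
   independently, so a carry out of the lower half never reaches
   the upper half. */
static uint32_t add2_portable(uint32_t a, uint32_t b)
{
    uint32_t lo = (a + b) & 0x0000FFFFu;
    uint32_t hi = ((a >> 16) + (b >> 16)) << 16;
    return hi | lo;
}

/* Portable stand-in for vecsum4a. memcpy models a 32-bit access with
   no alignment requirement. n must be even (assumes 16-bit short). */
static void vecsum_pairs(short *sum, const short *in1,
                         const short *in2, unsigned int n)
{
    unsigned int i;

    assert(n % 2u == 0u);   /* two elements are consumed per pass */
    for (i = 0; i < n; i += 2) {
        uint32_t a, b, r;
        memcpy(&a, &in1[i], sizeof a);
        memcpy(&b, &in2[i], sizeof b);
        r = add2_portable(a, b);
        memcpy(&sum[i], &r, sizeof r);
    }
}
```

To sum three elements, for example, allocate the arrays with pad_to_even(3) elements, set the extra element of each input to zero, and pass the padded count to the routine. The extra result element can then be ignored.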