Refining C/C++ Code
Several examples in this chapter and in section 7.4.4 show all of the different
ways that the MUST_ITERATE pragma and _nassert intrinsic can be used.
The _nassert intrinsic can convey information about the alignment of pointers
and arrays.
void vecsum(short *restrict a, const short *restrict b,
            const short *restrict c)
{
    _nassert(((int) a & 0x3) == 0);  /* a is word aligned */
    _nassert(((int) b & 0x3) == 0);  /* b is word aligned */
    _nassert(((int) c & 0x7) == 0);  /* c is double word aligned */
    . . .
}
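As a minimal, self-contained sketch of how such alignment hints might sit in front of a real loop, consider the function below. The name vecsum_aligned, the use of uintptr_t in place of the int cast, and the fallback macro are assumptions made for illustration and do not come from this guide. On TI compilers _nassert generates no code; on other compilers the intrinsic does not exist, so the sketch maps it to a run-time assert.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: compilers other than TI's do not provide _nassert, so
   substitute a run-time check there. */
#ifndef __TI_COMPILER_VERSION__
#define _nassert(expr) assert(expr)
#endif

/* Hypothetical example. Assumes the caller passes arrays that start on
   a 4-byte (word) boundary. */
void vecsum_aligned(short *restrict sum, const short *restrict in1,
                    const short *restrict in2, unsigned int n)
{
    unsigned int i;

    /* Tell the optimizer that each pointer is word aligned. */
    _nassert(((uintptr_t) sum & 0x3) == 0);
    _nassert(((uintptr_t) in1 & 0x3) == 0);
    _nassert(((uintptr_t) in2 & 0x3) == 0);

    for (i = 0; i < n; i++)
        sum[i] = in1[i] + in2[i];
}
```

With the alignment stated, the compiler is free to consider wider memory accesses for pairs of 16-bit elements; whether it does so depends on the compiler version and options.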
See the TMS320C6000 Optimizing C/C++ Compiler User’s Guide for a complete
discussion of the -ms, -o3, and -pm options, the _nassert intrinsic, and
the MUST_ITERATE and PROB_ITERATE pragmas.
3.4.3.4 Loop Unrolling
Another technique that improves performance is unrolling the loop; that is,
expanding small loops so that each iteration of the loop appears in your code.
This optimization increases the number of instructions available to execute in
parallel. You can use loop unrolling when the operations in a single iteration
do not use all of the resources of the ’C6000 architecture.
There are three ways loop unrolling can be performed:
1) The compiler may automatically unroll the loop.
2) You can suggest that the compiler unroll the loop using the UNROLL pragma.
3) You can unroll the C/C++ code yourself.
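The second approach can be sketched as follows. The function add_offset_unrolled is a hypothetical example, and the sketch assumes the caller guarantees that n is even and at least 2, which is what the MUST_ITERATE arguments state. The pragmas are guarded so that the code also builds with compilers that do not recognize them; check the compiler user's guide for the exact pragma syntax supported by your tools.

```c
/* Hypothetical example: add a constant to every element of an array.
   Assumes the caller guarantees n is even and at least 2. */
void add_offset_unrolled(short *restrict dst, const short *restrict src,
                         short offset, unsigned int n)
{
    unsigned int i;

#ifdef __TI_COMPILER_VERSION__          /* TI-specific pragmas */
    #pragma MUST_ITERATE(2, , 2)        /* trip count: >= 2, multiple of 2 */
    #pragma UNROLL(2)                   /* suggest unrolling by 2 */
#endif
    for (i = 0; i < n; i++)
        dst[i] = src[i] + offset;
}
```

The pragma is only a suggestion: the source loop stays as written, and the compiler decides whether to unroll it.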
In Example 3–26, the loop produces a new sum[i] every two cycles. Three
memory operations are performed: a load for both in1[i] and in2[i] and a store
for sum[i]. Because only two memory operations can execute per cycle, two
cycles are necessary to perform three memory operations.
Example 3–26. Vector Sum With Three Memory Operations
void vecsum2(short *restrict sum, const short *restrict in1,
             const short *restrict in2, unsigned int N)
{
    int i;

    for (i = 0; i < N; i++)
        sum[i] = in1[i] + in2[i];
}
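The third approach, unrolling by hand, might look like the sketch below for this loop. The name vecsum2_unrolled is hypothetical, and the sketch assumes the caller guarantees that N is even; an odd N would run past the end of the arrays.

```c
/* Hypothetical hand-unrolled variant of the vector sum.
   Assumes the caller guarantees N is even. */
void vecsum2_unrolled(short *restrict sum, const short *restrict in1,
                      const short *restrict in2, unsigned int N)
{
    unsigned int i;

    /* Each pass computes two independent results. */
    for (i = 0; i < N; i += 2) {
        sum[i]     = in1[i]     + in2[i];
        sum[i + 1] = in1[i + 1] + in2[i + 1];
    }
}
```

Each pass now contains six memory operations. At two memory operations per cycle these fit in three cycles, or 1.5 cycles per result instead of 2, provided the compiler schedules them that tightly. Further gains depend on whether the compiler can combine the 16-bit accesses, which in turn depends on what it knows about the alignment of the arrays.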