Refining C/C++ Code
Optimizing C/C++ Code
Example 3–30. FIR_Type2—Inner Loop Completely Unrolled
void fir2_u(const short input[restrict], const short coefs[restrict],
            short out[restrict])
{
    int i;
    int sum;

    /* Outer loop: one output sample per iteration. */
    for (i = 0; i < 40; i++)
    {
        /* Inner loop over the 16 taps, completely unrolled. */
        sum  = coefs[0]  * input[i + 15];
        sum += coefs[1]  * input[i + 14];
        sum += coefs[2]  * input[i + 13];
        sum += coefs[3]  * input[i + 12];
        sum += coefs[4]  * input[i + 11];
        sum += coefs[5]  * input[i + 10];
        sum += coefs[6]  * input[i + 9];
        sum += coefs[7]  * input[i + 8];
        sum += coefs[8]  * input[i + 7];
        sum += coefs[9]  * input[i + 6];
        sum += coefs[10] * input[i + 5];
        sum += coefs[11] * input[i + 4];
        sum += coefs[12] * input[i + 3];
        sum += coefs[13] * input[i + 2];
        sum += coefs[14] * input[i + 1];
        sum += coefs[15] * input[i + 0];
        out[i] = (sum >> 15);
    }
}
With the inner loop completely unrolled, the outer loop is now the one that is
software-pipelined, and the overhead of filling and draining the software
pipeline occurs only once per invocation of the function rather than once for
each iteration of the outer loop.
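The following is a minimal sketch, not taken from the guide, of how the hand-unrolled function can be checked against a rolled version of the same filter. The names fir2_rolled and fir2_versions_agree, the NUM_TAPS and NUM_OUTPUTS macros, and the sample data are assumptions introduced for illustration; fir2_u is repeated from Example 3–30 so that the block compiles on its own.

```c
#include <string.h>

#define NUM_TAPS    16   /* taps unrolled in Example 3-30 (assumed name) */
#define NUM_OUTPUTS 40   /* outer-loop trip count (assumed name) */

/* Rolled form: the inner loop that the example unrolls by hand. */
void fir2_rolled(const short input[restrict], const short coefs[restrict],
                 short out[restrict])
{
    int i, j;
    for (i = 0; i < NUM_OUTPUTS; i++) {
        int sum = 0;
        for (j = 0; j < NUM_TAPS; j++)
            sum += coefs[j] * input[i + (NUM_TAPS - 1) - j];
        out[i] = (short)(sum >> 15);
    }
}

/* Unrolled form, repeated from Example 3-30. */
void fir2_u(const short input[restrict], const short coefs[restrict],
            short out[restrict])
{
    int i, sum;
    for (i = 0; i < NUM_OUTPUTS; i++) {
        sum  = coefs[0]  * input[i + 15];
        sum += coefs[1]  * input[i + 14];
        sum += coefs[2]  * input[i + 13];
        sum += coefs[3]  * input[i + 12];
        sum += coefs[4]  * input[i + 11];
        sum += coefs[5]  * input[i + 10];
        sum += coefs[6]  * input[i + 9];
        sum += coefs[7]  * input[i + 8];
        sum += coefs[8]  * input[i + 7];
        sum += coefs[9]  * input[i + 6];
        sum += coefs[10] * input[i + 5];
        sum += coefs[11] * input[i + 4];
        sum += coefs[12] * input[i + 3];
        sum += coefs[13] * input[i + 2];
        sum += coefs[14] * input[i + 1];
        sum += coefs[15] * input[i + 0];
        out[i] = (short)(sum >> 15);
    }
}

/* Returns 1 when both forms produce identical output on sample data. */
int fir2_versions_agree(void)
{
    short input[NUM_OUTPUTS + NUM_TAPS - 1];
    short coefs[NUM_TAPS];
    short out_rolled[NUM_OUTPUTS];
    short out_unrolled[NUM_OUTPUTS];
    int k;

    /* Small illustrative values keep the 32-bit accumulator in range. */
    for (k = 0; k < NUM_OUTPUTS + NUM_TAPS - 1; k++)
        input[k] = (short)((k * 97) % 2001 - 1000);
    for (k = 0; k < NUM_TAPS; k++)
        coefs[k] = (short)(1000 * (k + 1));

    fir2_rolled(input, coefs, out_rolled);
    fir2_u(input, coefs, out_unrolled);
    return memcmp(out_rolled, out_unrolled, sizeof out_rolled) == 0;
}
```

Each call reads 40 + 15 = 55 input samples, so the input array must hold at least that many elements. A check such as fir2_versions_agree() is one way to confirm that a manual unrolling has not changed the result before comparing performance.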
The heuristic the compiler uses to decide whether to unroll a loop needs at
least one of the following pieces of information. Without either of them, the
compiler never unrolls a loop.
- The exact trip count of the loop
- That the trip count of the loop is some multiple of two
Both pieces of information can be communicated to the compiler with the
MUST_ITERATE pragma. Section 3.4.3.3, Communicating Trip-Count Information to
the Compiler, explains how the MUST_ITERATE pragma can be used to provide
information about loop unrolling. Setting the first two arguments (the minimum
and maximum trip counts) to the same value n gives the exact trip count. By
using the third argument, you can specify that the trip count is a multiple or
power of two.
#pragma MUST_ITERATE (n, n, 2);
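As an illustration only, the sketch below places the pragma directly in front of a 16-iteration loop, so that the minimum and maximum trip counts are both 16 and the trip count is declared a multiple of two. The function name fir2_tap_sum and the NUM_TAPS macro are hypothetical. The guard on __TI_COMPILER_VERSION__ is an assumption made so that other compilers skip the TI-specific pragma; it is not part of the guide's text.

```c
#define NUM_TAPS 16   /* assumed name for the number of filter taps */

/* Sum of products for one output sample; the loop runs exactly NUM_TAPS times. */
int fir2_tap_sum(const short *coefs, const short *window)
{
    int j;
    int sum = 0;

#if defined(__TI_COMPILER_VERSION__)   /* assumption: set only by TI compilers */
    /* Trip count: minimum 16, maximum 16, and a multiple of 2. */
    #pragma MUST_ITERATE(16, 16, 2);
#endif
    for (j = 0; j < NUM_TAPS; j++)
        sum += coefs[j] * window[(NUM_TAPS - 1) - j];

    return sum;
}
```

The pragma sits immediately before the loop it describes, with no other statement in between. Whether the compiler actually unrolls the loop still depends on its heuristic; the pragma only supplies the trip-count information that the heuristic needs.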