6.14 Outer Loop Conditionally Executed With Inner Loop
Software pipelining the outer loop reduced the outer loop overhead in the
previous example from 16 cycles to 8 cycles. Executing the outer loop
conditionally and in parallel with the inner loop eliminates the overhead entirely.
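The idea can be modeled in C as a rough sketch, assuming the same filter dimensions as the example that follows (100 outputs, 32 taps). The function name fir_flat and the flat counter n are hypothetical and do not come from this guide. The outer loop work (store, reset, reload) moves into a guarded block inside a single loop. This models only the control flow; it does not show the parallel scheduling that removes the overhead cycles.

```c
/* Hypothetical sketch: the outer and inner loops merged into one flat loop.
 * x must hold at least 131 elements, h 32 elements, y 100 elements. */
void fir_flat(const short x[], const short h[], short y[])
{
    int n, i = 0, j = 0, sum0 = 0, sum1 = 0;
    short x0 = x[0], x1, x2, x3;

    /* 50 outer iterations times 8 inner iterations */
    for (n = 0; n < 50 * 8; n++) {
        /* inner loop body: four taps, two outputs */
        x1 = x[j+i+1];
        sum0 += x0 * h[i];
        sum1 += x1 * h[i];
        x2 = x[j+i+2];
        sum0 += x1 * h[i+1];
        sum1 += x2 * h[i+1];
        x3 = x[j+i+3];
        sum0 += x2 * h[i+2];
        sum1 += x3 * h[i+2];
        x0 = x[j+i+4];
        sum0 += x3 * h[i+3];
        sum1 += x0 * h[i+3];
        i += 4;

        /* outer loop work, executed only when an inner pass completes */
        if (i == 32) {
            y[j]   = sum0 >> 15;
            y[j+1] = sum1 >> 15;
            j += 2;
            i = 0;
            sum0 = 0;
            sum1 = 0;
            x0 = x[j];   /* first sample for the next pair of outputs */
        }
    }
}
```

In this sketch the guarded block runs on every eighth iteration. After the final pass it reloads x0 from x[100], which is still inside the 131-element input buffer.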
6.14.1 Unrolled FIR Filter C Code
Example 6–72 shows the same unrolled FIR filter C code that was used in the
previous example.
Example 6–72. Unrolled FIR Filter C Code
void fir(short x[], short h[], short y[])
{
    int i, j, sum0, sum1;
    short x0,x1,x2,x3,h0,h1,h2,h3;

    for (j = 0; j < 100; j+=2) {
        sum0 = 0;
        sum1 = 0;
        x0 = x[j];
        for (i = 0; i < 32; i+=4){
            x1 = x[j+i+1];
            h0 = h[i];
            sum0 += x0 * h0;
            sum1 += x1 * h0;
            x2 = x[j+i+2];
            h1 = h[i+1];
            sum0 += x1 * h1;
            sum1 += x2 * h1;
            x3 = x[j+i+3];
            h2 = h[i+2];
            sum0 += x2 * h2;
            sum1 += x3 * h2;
            x0 = x[j+i+4];
            h3 = h[i+3];
            sum0 += x3 * h3;
            sum1 += x0 * h3;
        }
        y[j] = sum0 >> 15;
        y[j+1] = sum1 >> 15;
    }
}
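For comparison, the unrolled code computes the same result as a straightforward FIR loop that produces one output per outer iteration. The version below is a minimal sketch for checking results. The name fir_ref is hypothetical and does not appear in this guide.

```c
/* Hypothetical reference version: no unrolling, one output per outer pass.
 * Like the unrolled code, it reads x[0] through x[130], so x must hold
 * at least 131 elements. */
void fir_ref(const short x[], const short h[], short y[])
{
    int i, j, sum;

    for (j = 0; j < 100; j++) {
        sum = 0;
        for (i = 0; i < 32; i++)
            sum += x[j + i] * h[i];   /* 32-tap dot product */
        y[j] = sum >> 15;             /* same scaling as the unrolled code */
    }
}
```

Running both versions on the same input and comparing the 100 outputs is a simple way to confirm that an unrolled or rescheduled variant still matches.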