Memory Banks
6.12.2 Unrolled FIR Filter C Code
The main limitation in solving the problem in Figure 6–24 is scheduling a
2-cycle loop: no value can be live for more than two cycles. Increasing the
iteration interval to 3 cycles decreases performance. A better solution is to
unroll the inner loop one more time and produce a 4-cycle loop.
Example 6–66 shows the FIR filter C code after unrolling the inner loop one
more time, so that each inner-loop iteration processes four filter taps for
each of two outputs. The longer loop gives more flexibility in scheduling and
allows you to write FIR filter code that never has memory bank hits,
regardless of array alignment and memory block.
Example 6–66. FIR Filter C Code (Unrolled)
void fir(short x[], short h[], short y[])
{
    int i, j, sum0, sum1;
    short x0, x1, x2, x3, h0, h1, h2, h3;

    for (j = 0; j < 100; j += 2) {
        sum0 = 0;
        sum1 = 0;
        x0 = x[j];
        for (i = 0; i < 32; i += 4) {
            x1 = x[j+i+1];
            h0 = h[i];
            sum0 += x0 * h0;
            sum1 += x1 * h0;
            x2 = x[j+i+2];
            h1 = h[i+1];
            sum0 += x1 * h1;
            sum1 += x2 * h1;
            x3 = x[j+i+3];
            h2 = h[i+2];
            sum0 += x2 * h2;
            sum1 += x3 * h2;
            x0 = x[j+i+4];
            h3 = h[i+3];
            sum0 += x3 * h3;
            sum1 += x0 * h3;
        }
        y[j] = sum0 >> 15;
        y[j+1] = sum1 >> 15;
    }
}
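
One way to gain confidence in an unrolled loop like this is to compare it against a plain, non-unrolled FIR on the same data. The sketch below is not part of the original example: the names fir_reference, fir_unrolled, and fir_outputs_match, the constants, and the synthetic input values are all assumptions made for illustration. It also assumes int is at least 32 bits wide. It shows that the unrolled body reads x[0] through x[130], the same 131 samples the plain version needs.

```c
#include <assert.h>

/* Hypothetical sizes matching the loop bounds in Example 6-66. */
enum { N_OUT = 100, N_TAPS = 32, X_LEN = N_OUT + N_TAPS - 1 };  /* 131 samples */

/* Plain FIR: one output per outer iteration, one tap per inner iteration. */
static void fir_reference(const short x[], const short h[], short y[])
{
    int i, j, sum;

    for (j = 0; j < N_OUT; j++) {
        sum = 0;
        for (i = 0; i < N_TAPS; i++)
            sum += x[j + i] * h[i];
        y[j] = (short)(sum >> 15);
    }
}

/* Unrolled FIR with the same loop body as Example 6-66. */
static void fir_unrolled(const short x[], const short h[], short y[])
{
    int i, j, sum0, sum1;
    short x0, x1, x2, x3, h0, h1, h2, h3;

    for (j = 0; j < N_OUT; j += 2) {
        sum0 = 0;
        sum1 = 0;
        x0 = x[j];
        for (i = 0; i < N_TAPS; i += 4) {
            x1 = x[j+i+1];  h0 = h[i];
            sum0 += x0 * h0;  sum1 += x1 * h0;
            x2 = x[j+i+2];  h1 = h[i+1];
            sum0 += x1 * h1;  sum1 += x2 * h1;
            x3 = x[j+i+3];  h2 = h[i+2];
            sum0 += x2 * h2;  sum1 += x3 * h2;
            x0 = x[j+i+4];  h3 = h[i+3];   /* largest index read: x[130] */
            sum0 += x3 * h3;  sum1 += x0 * h3;
        }
        y[j]   = (short)(sum0 >> 15);
        y[j+1] = (short)(sum1 >> 15);
    }
}

/* Returns 1 if both versions agree on every output for a synthetic input. */
int fir_outputs_match(void)
{
    short x[X_LEN], h[N_TAPS], y_ref[N_OUT], y_unr[N_OUT];
    int k;

    /* The unrolled loop reads up to x[(N_OUT-2) + (N_TAPS-4) + 4]. */
    assert(X_LEN > (N_OUT - 2) + (N_TAPS - 4) + 4);

    for (k = 0; k < X_LEN; k++)
        x[k] = (short)((k * 37) % 2001 - 1000);  /* arbitrary, in [-1000, 1000] */
    for (k = 0; k < N_TAPS; k++)
        h[k] = (short)(1000 - 50 * k);           /* arbitrary coefficients */

    fir_reference(x, h, y_ref);
    fir_unrolled(x, h, y_unr);

    for (k = 0; k < N_OUT; k++)
        if (y_ref[k] != y_unr[k])
            return 0;
    return 1;
}
```

Calling fir_outputs_match() from a host-side test program is enough to exercise the comparison. The input magnitudes are kept small so that the 32-tap sums stay well inside a 32-bit int; both versions apply the same right shift, so any implementation-defined behavior for negative sums affects them equally.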