Intel® IXP45X and Intel® IXP46X Product Line of Network Processors: Intel XScale® Processor
Intel® IXP45X and Intel® IXP46X Product Line of Network Processors Developer's Manual, August 2006, Order Number: 306262-004US
The variable A[i][k] is completely reused. However, accessing C[j][k] in the j and k
loops can displace A[i][k] from the cache. With blocking, the code becomes:

    /* j1 and k1 select a 100x100 block; j2 and k2 walk within that block */
    for(i=0; i<10000; i++)
        for(j1=0; j1<100; j1++)
            for(k1=0; k1<100; k1++)
                for(j2=0; j2<100; j2++)
                    for(k2=0; k2<100; k2++)
                    {
                        j = j1 * 100 + j2;
                        k = k1 * 100 + k2;
                        C[j][k] += A[i][k] * B[j][i];
                    }

3.10.4.4.7 Prefetch Unrolling

When iterating through a loop, data transfer latency can be hidden by prefetching
ahead one or more iterations. This has an unwanted side effect: the final iterations
of the loop load useless data into the cache, which pollutes the cache, increases
bus traffic, and can evict valuable temporal data. This problem can be resolved
by prefetch unrolling. For example, consider:

    for(i=0; i<NMAX; i++)
    {
        prefetch(data[i+2]);    /* fetch the element needed two iterations ahead */
        sum += data[i];
    }

The final two iterations (i = NMAX-2 and i = NMAX-1) prefetch superfluous data from
beyond the end of the array. The problem can be avoided by unrolling the end of the
loop:

    for(i=0; i<NMAX-2; i++)
    {
        prefetch(data[i+2]);
        sum += data[i];
    }
    /* last two elements: already prefetched, so no further prefetch is issued */
    sum += data[NMAX-2];
    sum += data[NMAX-1];

Unfortunately, prefetch loop unrolling does not work on loops with an indeterminate
number of iterations.

3.10.4.4.8 Pointer Prefetch

Not all looping constructs contain induction variables. However, prefetching techniques
can still be applied. Consider the following linked list traversal example:

    while(p) {
        do_something(p->data);
        p = p->next;
    }

The pointer variable p becomes a pseudo induction variable, and the data pointed to by
p->next can be prefetched to reduce data transfer latency for the next iteration of the
loop. Where possible, linked lists should be converted to arrays.

    while(p) {
        prefetch(p->next);      /* start loading the next node early */
        do_something(p->data);
        p = p->next;
    }
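
The advice to convert linked lists to arrays can be illustrated with a minimal sketch. It is not taken from the manual: the node type, the function names, and the PREFETCH macro are hypothetical, and the macro assumes a GCC or Clang toolchain that provides `__builtin_prefetch` (on other compilers it expands to nothing). The sketch sums a list using pointer prefetch, copies the payloads into a contiguous array, and then sums the array using a two-ahead prefetch with the end of the loop unrolled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: GCC or Clang, where __builtin_prefetch is available. On other
   compilers the macro is a no-op, so the code still runs without the hint. */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH(addr) __builtin_prefetch((addr))
#else
#define PREFETCH(addr) ((void)0)
#endif

/* Hypothetical node type standing in for the list used in the text. */
struct node {
    int data;
    struct node *next;
};

/* Pointer prefetch: hint the next node while working on the current one.
   p->next is only read while p is non-NULL, so the expression is valid. */
long sum_list(const struct node *p)
{
    long sum = 0;
    while (p) {
        PREFETCH(p->next);
        sum += p->data;
        p = p->next;
    }
    return sum;
}

/* Copy the payloads into a contiguous array. *count receives the number of
   elements; returns NULL for an empty list or if allocation fails. */
int *list_to_array(const struct node *head, size_t *count)
{
    assert(count != NULL);
    size_t n = 0;
    for (const struct node *p = head; p; p = p->next)
        n++;
    *count = n;
    if (n == 0)
        return NULL;
    int *out = malloc(n * sizeof *out);
    if (!out) {
        *count = 0;
        return NULL;
    }
    size_t i = 0;
    for (const struct node *p = head; p; p = p->next)
        out[i++] = p->data;
    return out;
}

/* Array traversal with the end of the loop unrolled: the first loop
   prefetches two elements ahead and stops two elements early; the second
   loop handles the remaining elements without issuing prefetches. */
long sum_array(const int *data, size_t n)
{
    long sum = 0;
    size_t i = 0;
    for (; i + 2 < n; i++) {
        PREFETCH(&data[i + 2]);
        sum += data[i];
    }
    for (; i < n; i++)
        sum += data[i];
    return sum;
}
```

A caller would build or receive the list once, call `list_to_array`, and reuse the array for repeated traversals, releasing it with `free` afterwards. Whether the copy pays off depends on how often the data is traversed relative to how often the list changes, which this sketch does not measure.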