Intel® PXA27x Processor Family
Optimization Guide
5-9
High Level Language Optimization
{
prefetch(A[i][j+1]);
sum += A[i][j];
}
5.1.5 Loop Fusion
Loop fusion combines multiple loops that reuse the same data into a single loop. The advantage is
that the reused data is still in the data cache when it is needed again, rather than being fetched
a second time by a later loop. Consider this example:
for(i=0; i<NMAX; i++)
{
prefetch(A[i+1], b[i+1], c[i+1]);
A[i] = b[i] + c[i];
}
for(i=0; i<NMAX; i++)
{
prefetch(D[i+1], c[i+1], A[i+1]);
D[i] = A[i] + c[i];
}
The second loop reuses the data elements A[i] and c[i]. Fusing the loops together produces:
for(i=0; i<NMAX; i++)
{
prefetch(D[i+1], A[i+1], c[i+1], b[i+1]);
ai = b[i] + c[i];
A[i] = ai;
D[i] = ai + c[i];
}
In some instances, loop fusion can degrade performance instead of improving it. As a general rule,
fuse loops only when each loop operates on the same data and when the entire body of the fused loop
fits in the instruction cache.
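The following is a minimal, portable sketch of the transformation above, written so that the two forms can be compared directly. The function names sum_unfused and sum_fused, the int element type, and the length parameter n are assumptions made for illustration, and the prefetch hints are left out so the code compiles with any C compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Unfused form: two passes. A[] and c[] are read again in the second loop
 * and may have been evicted from the data cache by then. */
void sum_unfused(int *A, int *D, const int *b, const int *c, size_t n)
{
    size_t i;

    assert(A != NULL && D != NULL && b != NULL && c != NULL);

    for (i = 0; i < n; i++)
        A[i] = b[i] + c[i];

    for (i = 0; i < n; i++)
        D[i] = A[i] + c[i];
}

/* Fused form: one pass. The value just computed for A[i] is kept in a
 * local (ai) and c[i] is reused while it is still cached. */
void sum_fused(int *A, int *D, const int *b, const int *c, size_t n)
{
    size_t i;

    assert(A != NULL && D != NULL && b != NULL && c != NULL);

    for (i = 0; i < n; i++) {
        int ai = b[i] + c[i];

        A[i] = ai;
        D[i] = ai + c[i];
    }
}
```

Both functions write the same values to A and D when the four arrays do not overlap. Whether the fused form is faster depends on the array sizes and on the cache behavior of the target, so it should be measured rather than assumed.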
5.1.6 Loop Unrolling
Most compilers unroll fixed-length loops when speed optimizations are enabled.
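As an illustration of what unrolling does, the sketch below shows a summation loop next to a hand-unrolled version. The unroll factor of four and the function names sum_rolled and sum_unrolled4 are assumptions chosen for this example. The code a compiler actually generates may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Reference version: one element per iteration. */
long sum_rolled(const int *a, size_t n)
{
    long sum = 0;
    size_t i;

    assert(a != NULL || n == 0);

    for (i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Unrolled by four: one loop test and branch per four elements. */
long sum_unrolled4(const int *a, size_t n)
{
    long sum = 0;
    size_t i = 0;

    assert(a != NULL || n == 0);

    /* Main body handles full groups of four elements. */
    for (; i + 4 <= n; i += 4) {
        sum += a[i];
        sum += a[i + 1];
        sum += a[i + 2];
        sum += a[i + 3];
    }

    /* Remainder loop handles the last 0 to 3 elements. */
    for (; i < n; i++)
        sum += a[i];
    return sum;
}
```

The unrolled body is larger, so the reduction in loop overhead has to be weighed against the extra space the code takes in the instruction cache.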