Intel® PXA27x Processor Family
Optimization Guide
5-9
High Level Language Optimization
{
prefetch(A[i][j+1]);
sum += A[i][j];
}
5.1.5 Loop Fusion
Loop fusion combines multiple loops that reuse the same data into a single loop. The advantage
is that the reused data is still in the data cache when it is needed again. Consider this
example:
for(i=0; i<NMAX; i++)
{
prefetch(A[i+1], b[i+1], c[i+1]);
A[i] = b[i] + c[i];
}
for(i=0; i<NMAX; i++)
{
prefetch(D[i+1], c[i+1], A[i+1]);
D[i] = A[i] + c[i];
}
The second loop reuses the data elements A[i] and c[i]. Fusing the loops together produces:
for(i=0; i<NMAX; i++)
{
prefetch(D[i+1], A[i+1], c[i+1], b[i+1]);
ai = b[i] + c[i];
A[i] = ai;
D[i] = ai + c[i];
}
Loop fusion can also degrade performance in some cases. In general, fuse loops only when each
loop operates on the same data and the entire body of the fused loop fits in the instruction
cache.
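The two forms above can be sketched as self-contained C functions to confirm that fusion leaves the results unchanged. This is a minimal illustration, not code from the guide. The names add_separate, add_fused, and PREFETCH are hypothetical, and mapping the guide's prefetch() pseudo-call onto the GCC/Clang __builtin_prefetch builtin is an assumption about the toolchain.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the guide's prefetch() is modeled with the GCC/Clang
 * builtin; on other compilers the macro expands to a no-op. */
#if defined(__GNUC__)
#define PREFETCH(addr) __builtin_prefetch((const void *)(addr))
#else
#define PREFETCH(addr) ((void)0)
#endif

/* Unfused form: A and c are walked twice, once per loop. */
void add_separate(int *A, int *D, const int *b, const int *c, size_t n)
{
    size_t i;

    assert(A != D); /* the two outputs must be distinct arrays */

    for (i = 0; i < n; i++) {
        A[i] = b[i] + c[i];
    }
    for (i = 0; i < n; i++) {
        D[i] = A[i] + c[i];
    }
}

/* Fused form: each element of b and c is loaded once, and the value
 * written to A[i] is reused from a local instead of being reloaded. */
void add_fused(int *A, int *D, const int *b, const int *c, size_t n)
{
    size_t i;

    assert(A != D);

    for (i = 0; i < n; i++) {
        int ai;

        /* Hint the next elements. On the last iteration these are
         * one-past-the-end addresses, which are valid to form and are
         * never dereferenced by the program itself. */
        PREFETCH(&b[i + 1]);
        PREFETCH(&c[i + 1]);

        ai = b[i] + c[i];
        A[i] = ai;
        D[i] = ai + c[i];
    }
}
```

Both functions produce identical A and D arrays for the same inputs. Whether the fused version is actually faster depends on the array sizes and on the cache conditions described above, so measure it on the target before adopting it.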
5.1.6 Loop Unrolling
Most compilers unroll fixed-length loops when compiling with speed optimizations enabled.