Intel® PXA27x Processor Family
Optimization Guide
High Level Language Optimization
nItersPerBlock;
for (i=0; i<nTotalBlockIters; i+=nItersPerBlock)
{
    // unrolling nItersPerBlock times (4 in this example)
    f(i);
    f(i+1);
    f(i+2);
    f(i+3);
}
// any remaining iterations must now be completed
for (; i<nTotalIterations; ++i)
{
    f(i);
}
}
Choosing a value for nItersPerBlock that suits the task (for example 8 or 16 when large
values of nTotalIterations are expected) increases the benefit of this technique. Again, performance
may decline if the instructions within the unrolled block do not fit in the instruction
cache. Ensure that all inline functions, inline procedures, and macros used within the block fit
within the instruction cache.
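The unrolling pattern above can be sketched as a self-contained routine. This is a minimal illustration, not the guide's own listing: the body of f(), the out buffer, the name process_unrolled, and the way nTotalBlockIters is computed are all assumptions made for the example, and the unroll factor is fixed at 4 to match the four calls in the unrolled body.

```c
#include <assert.h>
#include <stddef.h>

/* Unroll factor. It must equal the number of f() calls in the unrolled body. */
enum { N_ITERS_PER_BLOCK = 4 };

/* Hypothetical stand-in for f(): records that iteration i was processed. */
static void f(int *out, int i)
{
    out[i] = 2 * i;
}

/* Calls f() once for each i in [0, nTotalIterations), four calls per loop pass. */
void process_unrolled(int *out, int nTotalIterations)
{
    int i;
    int nTotalBlockIters;

    assert(out != NULL && nTotalIterations >= 0);

    /* Assumed setup: round the total down to a whole number of blocks. */
    nTotalBlockIters = nTotalIterations - (nTotalIterations % N_ITERS_PER_BLOCK);

    for (i = 0; i < nTotalBlockIters; i += N_ITERS_PER_BLOCK)
    {
        f(out, i);
        f(out, i + 1);
        f(out, i + 2);
        f(out, i + 3);
    }

    /* Complete the 0 to 3 iterations left over after the last full block. */
    for (; i < nTotalIterations; ++i)
    {
        f(out, i);
    }
}
```

Calling process_unrolled(buffer, 10) runs two full blocks (indices 0 to 7) and then two clean-up iterations (indices 8 and 9). Changing the unroll factor means changing both N_ITERS_PER_BLOCK and the number of f() calls in the block.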
5.1.7 Loop Conditionals
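Another simple optimization increases the performance of tight loops. When possible, use a
decrementing counter that counts down to zero. The decrement can set the condition flags itself,
so the loop-end test typically needs no separate compare instruction, which can provide a
significant performance increase. For example, here is a typical for() loop.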
for (i=0; i<1000; ++i)
{
    p1();
}
This code provides the same behavior with less loop overhead, as long as the loop body does not
depend on the value of i.
for (i=1000; i>0; --i)
{
    p1();
}
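The two loop forms can be compared side by side in a small sketch. Everything named here beyond p1() is hypothetical (the call counter g_calls and the functions run_count_up, run_count_down, and sum_count_down), and whether the count-down form actually compiles to fewer instructions depends on the compiler and optimization level, so it is worth checking the generated assembly. The last function shows one way to keep a count-down loop when the body needs an array index and the processing order does not matter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for p1(): counts how many times it was called. */
static int g_calls;

static void p1(void)
{
    ++g_calls;
}

/* Typical form: the counter is compared against n on every pass. */
int run_count_up(int n)
{
    int i;

    g_calls = 0;
    for (i = 0; i < n; ++i)
    {
        p1();
    }
    return g_calls;
}

/* Count-down form: the loop ends when the counter reaches zero. */
int run_count_down(int n)
{
    int i;

    g_calls = 0;
    for (i = n; i > 0; --i)
    {
        p1();
    }
    return g_calls;
}

/* Count-down loop whose body uses the counter as an index (last element first). */
long sum_count_down(const int *a, int n)
{
    long sum = 0;
    int i;

    assert(a != NULL || n <= 0);
    for (i = n; i > 0; --i)
    {
        sum += a[i - 1];
    }
    return sum;
}
```

Both run_count_up(1000) and run_count_down(1000) call p1() exactly 1000 times; only the loop bookkeeping differs.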