Intel® IXP45X and Intel® IXP46X Product Line of Network Processors
August 2006
Developer’s Manual
Order Number: 306262-004US
213
Intel XScale® Processor: Intel® IXP45X and Intel® IXP46X Product Line of Network Processors
to avoid this stall. Consider the following example:

    add r1, r2, r3
    ldr r0, [r5]
    add r6, r0, r1    ; uses r0 right after the load: stalls for two cycles
    sub r8, r2, r3
    mul r9, r2, r3

In the code shown above, the ADD instruction following the LDR would stall for two
cycles because it uses the result of the load. The code can be rearranged as follows to
prevent the stalls:

    ldr r0, [r5]
    add r1, r2, r3    ; independent of the load
    sub r8, r2, r3    ; independent of the load
    add r6, r0, r1    ; first use of r0, now two instructions after the load
    mul r9, r2, r3

Note that this rearrangement may not always be possible. Consider the following
example:

    cmp r1, #0
    addne r4, r5, #4
    subeq r4, r5, #4
    ldr r0, [r4]      ; address depends on the ADDNE/SUBEQ result
    cmp r0, #10       ; uses r0 right after the load

In the example above, the LDR instruction cannot be moved before the ADDNE or the
SUBEQ instructions because the LDR instruction depends on the result of these
instructions. The code can be rewritten as follows to make it run faster, at the expense
of increased code size:

    cmp r1, #0
    ldrne r0, [r5, #4]     ; load from r5 + 4 without waiting for r4
    ldreq r0, [r5, #-4]    ; load from r5 - 4 without waiting for r4
    addne r4, r5, #4
    subeq r4, r5, #4
    cmp r0, #10

The optimized code takes six cycles to execute compared to the seven cycles taken by
the unoptimized version.

The result latency for an LDR instruction is significantly higher if the data being loaded
is not in the data cache. To minimize the number of pipeline stalls in such a situation,
the LDR instruction should be moved as far away as possible from the instruction that
uses the result of the load. Note that this may at times cause certain register values to be