Intel® IXP45X and Intel® IXP46X Product Line of Network Processors Developer’s Manual, August 2006, Order Number: 306262-004US
Intel XScale® Processor: Intel® IXP45X and Intel® IXP46X Product Line of Network Processors
All data processing instructions incur a two-cycle issue penalty and a two-cycle result penalty when the shifter operand is a shift or rotate by a register, or when the shifter operand is RRX. Because the next instruction always incurs the two-cycle issue penalty, the only way to avoid such a stall is to rewrite the assembler instruction. Consider the following segment of code:

    mov r3, #10
    mul r4, r2, r3
    add r5, r6, r2, LSL r3
    sub r7, r8, r2

The subtract instruction would incur a one-cycle stall due to the issue latency of the add instruction, because the shifter operand of the add is a shift by a register. The issue latency can be avoided by changing the code as follows:

    mov r3, #10
    mul r4, r2, r3
    add r5, r6, r2, LSL #10
    sub r7, r8, r2

3.10.5.3 Scheduling Multiply Instructions

Multiply instructions can cause pipeline stalls due to either resource conflicts or result latencies. The following code segment would incur a stall of zero to three cycles due to resource conflicts, depending on the values in registers r1, r2, r4 and r5:

    mul r0, r1, r2
    mul r3, r4, r5

The following code segment would incur a stall of one to three cycles due to result latency, depending on the values in registers r1 and r2:

    mul r0, r1, r2
    mov r4, r0

Note that a multiply instruction that sets the condition codes blocks the whole pipeline. A four-cycle multiply operation that sets the condition codes behaves the same as a four-cycle issue operation. Consider the following code segment:

    muls r0, r1, r2
    add r3, r3, #1
    sub r4, r4, #1
    sub r5, r5, #1

The add operation above would stall for three cycles if the multiply takes four cycles to complete. It is better to replace the code segment above with the following sequence:

    mul r0, r1, r2
    add r3, r3, #1
    sub r4, r4, #1
    sub r5, r5, #1
    cmp r0, #0

Refer to “Instruction Latencies” on page 182 for the latencies of the various multiply instructions. Multiply instructions should be scheduled with these latencies in mind.
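
The result-latency and resource-conflict stalls described for multiply instructions can often be hidden by placing independent instructions between the multiply and the instruction that has to wait for it. The two fragments below are an illustrative sketch and are not taken from the manual. The filler instructions and the registers r6 through r9 are hypothetical, and the `@` comment syntax assumes the GNU assembler. The number of fillers needed depends on the multiply latency, which varies with the operand values.

```asm
@ Fragment 1 (hypothetical): result latency.
@ A consumer issued right after the mul would stall one to three cycles.
mul r0, r1, r2        @ produces r0
add r6, r6, #1        @ independent of r0, can issue while the multiply completes
sub r7, r7, #1        @ independent of r0
mov r4, r0            @ consumer moved after the independent instructions

@ Fragment 2 (hypothetical): resource conflict.
@ Back-to-back multiplies would stall zero to three cycles.
mul r0, r1, r2        @ first multiply
add r8, r8, #1        @ not a multiply, so it does not compete for the multiplier
sub r9, r9, #1        @ not a multiply
mul r3, r4, r5        @ second multiply issued after the filler instructions
```

Whether two filler instructions hide the whole stall depends on the latencies listed in “Instruction Latencies”. If fewer independent instructions are available, the remaining cycles are still lost to the stall.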