Cycle Timings and Interlock Behavior
ARM DDI 0301H
Copyright © 2004-2009 ARM Limited. All rights reserved.
ID012310
Non-Confidential, Unrestricted Access
16.6 ARMv6 Sum of Absolute Differences (SAD)
Table 16-8 lists ARMv6 SAD instructions and gives their cycle timing behavior.
16.6.1 Example interlocks
Table 16-9 lists interlock examples using USAD8 and USADA8 instructions.
Table 16-8 ARMv6 sum of absolute differences instruction timing behavior

Instructions   Cycles   Early Reg    Result latency
USAD8          1        <Rm>, <Rs>   3 (a)
USADA8         1        <Rm>, <Rs>   3

a. Result latency is one less if the destination is the accumulate for a subsequent USADA8.
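
For orientation, the arithmetic performed by the two instructions in Table 16-8 can be sketched as below. This is a minimal reference model, not part of the timing specification. The function names usad8 and usada8 and the choice of Python are illustrative assumptions; operands are treated as 32-bit unsigned register values.

```python
def usad8(rm, rs):
    """Sum of absolute differences of the four unsigned bytes in rm and rs."""
    total = 0
    for shift in (0, 8, 16, 24):
        a = (rm >> shift) & 0xFF  # byte lane taken from Rm
        b = (rs >> shift) & 0xFF  # matching byte lane taken from Rs
        total += abs(a - b)
    return total


def usada8(rm, rs, rn):
    """USAD8 result added to the accumulate value rn, wrapped to 32 bits."""
    return (usad8(rm, rs) + rn) & 0xFFFFFFFF
```

The accumulate operand rn is the value that footnote a refers to: it is the only USADA8 input that is not a byte-lane operand.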
Table 16-9 Example interlocks

Instruction sequence     Behavior
USAD8  R1,R2,R3          Takes four cycles because USAD8 has a Result latency of three, and
ADD    R5,R6,R1          the ADD requires the result of the USAD8 instruction.

USAD8  R1,R2,R3          Takes four cycles. The MOV instructions are scheduled during the
MOV    R9,R9             Result latency of the USAD8 instruction.
MOV    R9,R9
ADD    R5,R6,R1

USAD8  R1,R2,R3          Takes three cycles. The Result latency is one less because the
USADA8 R1,R4,R5,R1       result is used as the accumulate for a subsequent USADA8 instruction.
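
The cycle counts in Table 16-9 can be reproduced with a small scoreboard-style model. The sketch below is a deliberate simplification, not a description of the actual pipeline: it assumes one instruction issues per cycle in program order, assumes a Result latency of one for ADD and MOV, ignores Early Reg requirements, and applies the footnote a reduction only to a USAD8 producer, as marked in Table 16-8. The function name count_cycles and the tuple encoding of instructions are hypothetical.

```python
RESULT_LATENCY = {"USAD8": 3, "USADA8": 3}  # values from Table 16-8
DEFAULT_LATENCY = 1  # assumption for ADD and MOV in these examples


def count_cycles(sequence):
    """Return the cycle in which the last instruction of the sequence issues.

    Each entry is (mnemonic, dest, sources); sources lists register names in
    assembly operand order, so the accumulate of a USADA8 is the last source.
    """
    ready = {}  # register -> (producer mnemonic, cycle its result is available)
    cycle = 0
    for mnemonic, dest, sources in sequence:
        issue = cycle + 1  # at most one instruction issues per cycle
        for index, reg in enumerate(sources):
            if reg not in ready:
                continue  # value was available before the sequence started
            producer, available = ready[reg]
            # Footnote a: one cycle less when the result feeds the
            # accumulate operand of a subsequent USADA8.
            is_accumulate = mnemonic == "USADA8" and index == len(sources) - 1
            if producer == "USAD8" and is_accumulate:
                available -= 1
            issue = max(issue, available)  # stall until the operand is ready
        cycle = issue
        latency = RESULT_LATENCY.get(mnemonic, DEFAULT_LATENCY)
        ready[dest] = (mnemonic, issue + latency)
    return cycle


example_1 = [("USAD8", "R1", ["R2", "R3"]), ("ADD", "R5", ["R6", "R1"])]
example_2 = [
    ("USAD8", "R1", ["R2", "R3"]),
    ("MOV", "R9", ["R9"]),
    ("MOV", "R9", ["R9"]),
    ("ADD", "R5", ["R6", "R1"]),
]
example_3 = [("USAD8", "R1", ["R2", "R3"]), ("USADA8", "R1", ["R4", "R5", "R1"])]
```

Under these assumptions the three sequences evaluate to four, four, and three cycles, matching the Behavior column. In the second sequence the two MOV instructions fill cycles that the first sequence spends stalled, which is why both take four cycles.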