Cycle Timings and Interlock Behavior
ARM DDI 0363E
Copyright © 2009 ARM Limited. All rights reserved.
ID013010
Non-Confidential, Unrestricted Access
14.6 Sum of Absolute Differences (SAD)
Table 14-7 shows the SAD instructions and gives their cycle timing behavior.
14.6.1 Example interlocks
Table 14-8 shows interlock examples using USAD8 and USADA8 instructions.
Table 14-7 Sum of absolute differences instruction timing behavior

Instructions   Cycles   Early Reg    Result latency
USAD8          1        <Rn>, <Rm>   2 (a)
USADA8         1        <Rn>, <Rm>   2 (a)

a. Result latency is one fewer if the destination is the accumulate for a subsequent USADA8.
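For reference, the arithmetic performed by the two instructions in Table 14-7 can be sketched in Python. This is only an illustrative model of the architectural result, not of the timing: the function names usad8 and usada8 are hypothetical helpers chosen for this sketch, and register values are assumed to be unsigned 32-bit integers.

```python
def usad8(rn: int, rm: int) -> int:
    """Sum of absolute differences of the four unsigned bytes of rn and rm."""
    total = 0
    for shift in (0, 8, 16, 24):
        a = (rn >> shift) & 0xFF  # byte lane of the first operand
        b = (rm >> shift) & 0xFF  # matching byte lane of the second operand
        total += abs(a - b)
    return total  # at most 4 * 255, so it always fits in 32 bits


def usada8(rn: int, rm: int, ra: int) -> int:
    """Same sum as usad8, added to the accumulate value ra (modulo 2**32)."""
    return (usad8(rn, rm) + ra) & 0xFFFFFFFF
```

The accumulate operand ra is the one referred to by footnote a: when it is produced by an immediately preceding USAD8 or USADA8, the Result latency of that producer is one cycle shorter.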
Table 14-8 Example interlocks

Instruction sequence    Behavior
USAD8 R1,R2,R3          Takes three cycles because USAD8 has a Result Latency of
ADD R5,R6,R1            two, and the ADD requires the result of the USAD8
                        instruction.

USAD8 R1,R2,R3          Takes three cycles. The MOV instruction is scheduled
MOV R9,R9               during the Result Latency of the USAD8 instruction.
ADD R5,R6,R1

USAD8 R1,R2,R3          Takes two cycles. The Result Latency is one less because
USADA8 R1,R4,R5,R1      the result is used as the accumulate for a subsequent
                        USADA8 instruction.
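The cycle counts in Table 14-8 can be reproduced with a small counting model, sketched below in Python. It is a deliberate simplification and not a description of the real pipeline: it assumes strict in-order single issue, one issue cycle per instruction, and it ignores the Early Reg requirement and any dual issue. The Instr class, the count_cycles function, and the Result Latency of one for ADD and MOV are assumptions of this sketch; only the USAD8 and USADA8 latencies and the accumulate rule come from Table 14-7.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# USAD8/USADA8 values are from Table 14-7; ADD and MOV are assumed to be 1.
RESULT_LATENCY = {"USAD8": 2, "USADA8": 2, "ADD": 1, "MOV": 1}


@dataclass
class Instr:
    op: str
    dest: str
    srcs: Tuple[str, ...]
    acc: Optional[str] = None  # accumulate register, used by USADA8 only


def count_cycles(seq):
    """Return the cycle in which the last instruction of seq issues."""
    producer = {}  # register -> (issue cycle, opcode) of its latest writer
    cycle = 0
    for ins in seq:
        issue = cycle + 1  # earliest slot after the previous instruction
        operands = [(reg, False) for reg in ins.srcs]
        if ins.acc is not None:
            operands.append((ins.acc, True))
        for reg, is_acc in operands:
            if reg not in producer:
                continue  # value was available before the sequence started
            written_at, op = producer[reg]
            latency = RESULT_LATENCY[op]
            # Footnote a: one cycle fewer when a SAD result feeds the
            # accumulate operand of a subsequent USADA8.
            if is_acc and ins.op == "USADA8" and op in ("USAD8", "USADA8"):
                latency -= 1
            issue = max(issue, written_at + latency)
        producer[ins.dest] = (issue, ins.op)
        cycle = issue
    return cycle


# The three sequences of Table 14-8, in order.
TABLE_14_8 = [
    [Instr("USAD8", "R1", ("R2", "R3")),
     Instr("ADD", "R5", ("R6", "R1"))],
    [Instr("USAD8", "R1", ("R2", "R3")),
     Instr("MOV", "R9", ("R9",)),
     Instr("ADD", "R5", ("R6", "R1"))],
    [Instr("USAD8", "R1", ("R2", "R3")),
     Instr("USADA8", "R1", ("R4", "R5"), acc="R1")],
]

for seq in TABLE_14_8:
    print(count_cycles(seq))
```

Under these assumptions the model gives three, three, and two cycles for the three sequences, matching the Behavior column. The second sequence shows why placing an independent instruction after USAD8 costs nothing: the MOV occupies the cycle the ADD would otherwise spend waiting.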