Cycle Timings and Interlock Behavior
ARM DDI 0363G
Copyright © 2006-2011 ARM Limited. All rights reserved.
ID073015
Non-Confidential
C.6 Sum of Absolute Differences (SAD)
Table C-7 shows the SAD instructions and gives their cycle timing behavior.
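For orientation, the arithmetic performed by the two SAD instructions can be modelled in a few lines. This is a minimal sketch of the architectural behavior of USAD8 and USADA8 (four unsigned byte lanes per 32-bit register), not a description of how the processor implements them. The function names `usad8` and `usada8` are hypothetical helpers chosen for illustration, and the model says nothing about timing.

```python
def usad8(rn: int, rm: int) -> int:
    """Sum of absolute differences of the four unsigned bytes in rn and rm."""
    total = 0
    for shift in (0, 8, 16, 24):
        a = (rn >> shift) & 0xFF  # byte lane taken from the first operand
        b = (rm >> shift) & 0xFF  # matching byte lane from the second operand
        total += abs(a - b)
    return total


def usada8(rn: int, rm: int, ra: int) -> int:
    """As usad8, with the sum added to the accumulate value ra (32-bit wrap)."""
    return (ra + usad8(rn, rm)) & 0xFFFFFFFF


# Lane differences are 3, 1, 1 and 3, so the sum is 8.
print(usad8(0x01020304, 0x04030201))       # 8
print(usada8(0x01020304, 0x04030201, 10))  # 18
```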
Table C-7 Sum of absolute differences instruction timing behavior

Instructions   Cycles   Early Reg     Result latency
USAD8          1        <Rn>, <Rm>    2 (a)
USADA8         1        <Rn>, <Rm>    2 (a)

a. Result latency is one fewer if the destination is the accumulate for a subsequent USADA8.

C.6.1 Example interlocks

Table C-8 shows interlock examples using USAD8 and USADA8 instructions.
Table C-8 Example interlocks

Instruction sequence   Behavior

USAD8 R1,R2,R3         Takes three cycles because USAD8 has a Result Latency of
ADD R5,R6,R1           two, and the ADD requires the result of the USAD8
                       instruction.

USAD8 R1,R2,R3         Takes three cycles. The MOV instruction is scheduled
MOV R9,R9              during the Result Latency of the USAD8 instruction.
ADD R5,R6,R1

USAD8 R1,R2,R3         Takes two cycles. The Result Latency is one less because
USADA8 R1,R4,R5,R1     the result is used as the accumulate for a subsequent
                       USADA8 instruction.