Cycle Timings and Interlock Behavior
ARM DDI 0363E
Copyright © 2009 ARM Limited. All rights reserved.
14-36
ID013010
Non-Confidential, Unrestricted Access
Case F2_st [b]
    First instruction: VSTR.F32 [n]
    Second instruction: As for Case B1. Any single-precision CDP [i], excluding multiply-accumulate instructions [o]. 32-bit transfers to and from the floating-point register file [l].

Case F2D [b]
    First instruction: VLDR.F64 [n]
    Second instruction: As for Case B1.

Case F3 [b]
    First instruction: 32-bit transfers to and from the floating-point register file [l], "VMOV.F32 <Sd>, <Sd>, <Sm>", VABS.F32, and VNEG.F32.
    Second instruction: As for Case F2_st.

Case F4 [b]
    First instruction: Any instruction that does not set flags, other than load/store multiple/double, non-VFP coprocessor operations, multi-cycle multiply instructions [p], double precision floating point CDP instructions, VCVT.F64.F32, or a miscellaneous processor control instruction [a]
    Second instruction: Any single-precision CDP [i], excluding "VMOV.F32 <Sd>, #<imm>", VNEG.F32, VABS.F32, VCVT.F64.F32, VDIV.F32, and VSQRT.F32. 32-bit transfers to and from the floating-point register file [l].

Case F6 [b]
    First instruction: VMRS r15, FPSCR
    Second instruction: As for Case A.

a. These are processor state updating instructions, synchronization instructions, SVC, BKPT, prefetch abort and Undefined instructions.
b. This case can only occur if floating-point functionality has been configured for the Cortex-R4F processor, see Configurable options on page 1-13.
c. You can substitute LDR with LDRB, LDRH, LDRSB, or LDRSH. You can also substitute STR with STRB or STRH.
d. Data processing instructions are ADC, ADD, ADDW, AND, ASR, BIC, CLZ, CMN, CMP, EOR, LSL, LSR, MOV, MOVT, MOVW, MVN, ORN, ORR, ROR, RRX, RSB, SBC, SUB, SUBW, TEQ, and TST.
e. Bitfield, saturate, and bit-packing instructions are BFC, BFI, PKHBT, PKHTB, QADD, QDADD, QDSUB, QSUB, SBFX, SSAT, SSAT16, UBFX, USAT, and USAT16.
f. Signed or unsigned extend instructions are SXTAB, SXTAB16, SXTAH, SXTB, SXTB16, SXTH, UXTAB, UXTAB16, UXTAH, UXTB, UXTB16, and UXTH.
g. SIMD add and subtract instructions are QADD16, QADD8, QASX, QSUB16, QSUB8, QSAX, SADD16, SADD8, SASX, SHADD16, SHADD8, SHASX, SHSUB16, SHSUB8, SHSAX, SSUB16, SSUB8, SSAX, UADD16, UADD8, UASX, UHADD16, UHADD8, UHASX, UHSUB16, UHSUB8, UHSAX, UQADD16, UQADD8, UQASX, UQSUB16, UQSUB8, UQSAX, USUB16, USUB8, and USAX.
h. Other miscellaneous instructions are RBIT, REV, REV16, REVSH, and SEL.
i. Single-precision CDPs are VABS.F32, VNEG.F32, "VMOV.F32 <Sd>, #<imm>", VMLA.F32, VMLS.F32, VNMLS.F32, VNMLA.F32, VMUL.F32, VNMUL.F32, VADD.F32, VSUB.F32, VDIV.F32, VSQRT.F32, VCMP.F32, VCMPE.F32, VCVT.F64.F32, VCVT.F32.U32, VCVT.F32.S32, VCVT.F32.U16, VCVT.F32.S16, VCVTR.U32.F32, VCVT.U32.F32, VCVTR.S32.F32, VCVT.S32.F32, VCVT.U16.F32, and VCVT.S16.F32.
j. Must not be flag-setting.
k. Immediate value must not require a shift.
l. 32-bit transfers to or from the floating-point register file include single or half-double floating-point register transfers, including "VMOV <Sn>, <Rt>", "VMOV <Dn[x]>, <Rt>", "VMOV <Rt>, <Dn[x]>", and "VMOV <Rt>, <Sn>", but excluding VMRS and VMSR.
m. When the first instruction is a floating-point multiply-accumulate, and the second instruction is a 32-bit transfer to the floating-point register file, Case F1 can only occur if the two instructions have different destination registers.
n. Any addressing modes.
o. Single-precision floating-point multiply-accumulate instructions are VMLA.F32, VMLS.F32, VNMLS.F32, and VNMLA.F32.
p. Multi-cycle multiply instructions are SMMUL, SMMLA, SMMLS, MUL, MLA, MLS, SMULL, SMLAL, UMAAL, UMULL, and UMLAL.
Table 14-28 Permitted instruction combinations (continued)
Columns: Dual issue case, First instruction, Second instruction
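
The second-instruction entries for Cases F2_st, F3, and F4 are set expressions over the lists in footnotes i, l, and o, so they can be written out mechanically. The following is a minimal sketch of that bookkeeping, not a model of the processor. All Python names (`SINGLE_PRECISION_CDPS`, `FP_SECOND`, `second_slot_ok`) are hypothetical, instructions are matched as the literal strings used in the footnotes, and only the floating-point part of the second-instruction column is covered. The "As for Case B1" part, the first-instruction conditions, and restrictions such as footnote m are deliberately left out.

```python
# Footnote i: single-precision CDPs, copied from the table notes.
SINGLE_PRECISION_CDPS = frozenset({
    "VABS.F32", "VNEG.F32", "VMOV.F32 <Sd>, #<imm>",
    "VMLA.F32", "VMLS.F32", "VNMLS.F32", "VNMLA.F32",
    "VMUL.F32", "VNMUL.F32", "VADD.F32", "VSUB.F32",
    "VDIV.F32", "VSQRT.F32", "VCMP.F32", "VCMPE.F32",
    "VCVT.F64.F32", "VCVT.F32.U32", "VCVT.F32.S32",
    "VCVT.F32.U16", "VCVT.F32.S16", "VCVTR.U32.F32",
    "VCVT.U32.F32", "VCVTR.S32.F32", "VCVT.S32.F32",
    "VCVT.U16.F32", "VCVT.S16.F32",
})

# Footnote o: single-precision multiply-accumulate instructions.
MULTIPLY_ACCUMULATE = frozenset({
    "VMLA.F32", "VMLS.F32", "VNMLS.F32", "VNMLA.F32",
})

# Footnote l: only the forms the footnote names explicitly. The footnote
# says "including", so this set is an assumption and may be incomplete.
TRANSFERS_32BIT = frozenset({
    "VMOV <Sn>, <Rt>", "VMOV <Dn[x]>, <Rt>",
    "VMOV <Rt>, <Dn[x]>", "VMOV <Rt>, <Sn>",
})

# Exclusions listed in the Case F4 second-instruction entry.
F4_EXCLUDED = frozenset({
    "VMOV.F32 <Sd>, #<imm>", "VNEG.F32", "VABS.F32",
    "VCVT.F64.F32", "VDIV.F32", "VSQRT.F32",
})

F2_ST_SECOND = (SINGLE_PRECISION_CDPS - MULTIPLY_ACCUMULATE) | TRANSFERS_32BIT

# Floating-point part of the second-instruction column, per case.
FP_SECOND = {
    "F2_st": F2_ST_SECOND,
    "F3": F2_ST_SECOND,  # table entry: "As for Case F2_st"
    "F4": (SINGLE_PRECISION_CDPS - F4_EXCLUDED) | TRANSFERS_32BIT,
}


def second_slot_ok(case: str, instruction: str) -> bool:
    """Return True if `instruction` appears in the floating-point part of
    the second-instruction entry for `case`. Raises KeyError for cases
    this sketch does not model."""
    return instruction in FP_SECOND[case]


if __name__ == "__main__":
    for case in ("F2_st", "F3", "F4"):
        print(case, sorted(FP_SECOND[case]))
```

For example, `second_slot_ok("F2_st", "VMLA.F32")` is false because Case F2_st excludes multiply-accumulate instructions, while `second_slot_ok("F4", "VMLA.F32")` is true because Case F4 excludes a different set.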