Cycle Timings and Interlock Behavior
ARM DDI 0301H
Copyright © 2004-2009 ARM Limited. All rights reserved.
16-2
ID012310
Non-Confidential, Unrestricted Access
16.1 About cycle timings and interlock behavior
Complex instruction dependencies and memory system interactions make it impossible to
describe briefly the exact cycle timing behavior for all instructions in all circumstances. The
timings that this chapter describes are accurate in most cases. If precise timings are required, you
must use a cycle-accurate model of the processor.
Unless otherwise stated, cycle counts and result latencies that this chapter describes are best-case
numbers. They assume:
• no outstanding data dependencies between the current instruction and a previous instruction
• the instruction does not encounter any resource conflicts
• all data accesses hit in the MicroTLB and Data Cache, and do not cross protection region boundaries
• all instruction accesses hit in the Instruction Cache.
This section describes:
• Changes in instruction flow overview
• Instruction execution overview on page 16-3
• Conditional instructions on page 16-4
• Opposite condition code checks on page 16-4
• Definition of terms on page 16-5.
16.1.1 Changes in instruction flow overview
To minimize the number of cycles lost to changes in instruction flow, the processor
includes a:
• dynamic branch predictor
• static branch predictor
• return stack.
The dynamic branch predictor is a 128-entry direct-mapped branch predictor indexed by Virtual
Address (VA) bits [9:3]. The prediction scheme uses a two-bit saturating counter for predictions that are:
• Strongly Not Taken
• Weakly Not Taken
• Weakly Taken
• Strongly Taken.
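The indexing and counter scheme described above can be illustrated with a minimal C sketch. Only the table size (128 entries), the VA bit range [9:3], and the four state names come from the text. The function and type names, the numeric encoding of the states, and the increment/decrement update rule are assumptions based on a conventional two-bit saturating counter, not a description of the actual hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Table size from the text; the identifier itself is hypothetical. */
enum { BP_ENTRIES = 128 };

/* The four prediction states named in the text.
 * The numeric encoding 0..3 is an assumption. */
typedef enum {
    STRONGLY_NOT_TAKEN = 0,
    WEAKLY_NOT_TAKEN   = 1,
    WEAKLY_TAKEN       = 2,
    STRONGLY_TAKEN     = 3
} bp_state;

/* VA bits [9:3] form a 7-bit value, so the index is always 0..127. */
static unsigned bp_index(uint32_t va)
{
    unsigned idx = (va >> 3) & 0x7Fu;
    assert(idx < (unsigned)BP_ENTRIES);
    return idx;
}

/* Assumed reading of the states: the two "Taken" states predict taken. */
static bool bp_predict_taken(bp_state s)
{
    return s >= WEAKLY_TAKEN;
}

/* Assumed update rule for a conventional two-bit saturating counter:
 * step towards STRONGLY_TAKEN on a taken branch, towards
 * STRONGLY_NOT_TAKEN otherwise, and saturate at both ends. */
static bp_state bp_update(bp_state s, bool taken)
{
    if (taken)
        return (s == STRONGLY_TAKEN) ? STRONGLY_TAKEN : (bp_state)(s + 1);
    return (s == STRONGLY_NOT_TAKEN) ? STRONGLY_NOT_TAKEN : (bp_state)(s - 1);
}
```

Under this indexing, two branches whose addresses differ only in bits above bit 9 (for example, addresses 0x400 bytes apart) select the same entry, which is the usual trade-off of a direct-mapped table.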
Only branches with a constant offset are predicted. Branches with a register-based offset are not
predicted. A dynamically predicted branch can be folded out of the instruction stream if the
following instruction arrives while the branch is within the prefetch instruction buffer. A
dynamically predicted branch takes one cycle, or zero cycles if it is folded out.
The static branch predictor operates on branches with a constant offset that are not predicted by
the dynamic branch predictor. Static predictions are issued from the Iss stage of the main
pipeline. Consequently, a statically predicted branch takes four cycles.
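The selection rules and cycle counts above can be summarized in a short C sketch. The cycle figures (one, zero, and four) and the constant-offset rule come from the text. The type and function names, the `dynamic_hit` flag, and the use of -1 for cases this section gives no figure for are hypothetical conveniences, and the sketch is a reading aid rather than a timing model.

```c
#include <assert.h>
#include <stdbool.h>

/* Which predictor, if any, handles a branch (hypothetical names). */
typedef enum { PRED_NONE, PRED_DYNAMIC, PRED_STATIC } predictor;

/* Branches with a register-based offset are not predicted.
 * A constant-offset branch goes to the static predictor only when
 * the dynamic predictor does not predict it. */
static predictor select_predictor(bool constant_offset, bool dynamic_hit)
{
    if (!constant_offset)
        return PRED_NONE;
    return dynamic_hit ? PRED_DYNAMIC : PRED_STATIC;
}

/* Cycle count quoted in the text for a predicted branch.
 * Returns -1 where this section states no figure. */
static int predicted_branch_cycles(predictor p, bool folded)
{
    /* Folding is described only for dynamically predicted branches. */
    assert(!(folded && p != PRED_DYNAMIC));

    switch (p) {
    case PRED_DYNAMIC:
        return folded ? 0 : 1;  /* folded out of the instruction stream */
    case PRED_STATIC:
        return 4;               /* prediction issued from the Iss stage */
    default:
        return -1;              /* unpredicted: no figure given here */
    }
}
```

For example, `predicted_branch_cycles(select_predictor(true, false), false)` gives the four-cycle figure for a constant-offset branch that the dynamic predictor does not predict.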
The return stack consists of three entries and, as with static predictions, issues a prediction from
the Iss stage of the main pipeline. The return stack mispredicts if the value taken from the return
stack is not the value that is returned by the instruction. Only unconditional returns are