VFP Instruction Execution
ARM DDI 0301H
Copyright © 2004-2009 ARM Limited. All rights reserved.
ID012310
Non-Confidential, Unrestricted Access
21.4 Forwarding
In general, any forwarding operation reduces the stall time of a dependent instruction by one
cycle. The VFP11 coprocessor forwards data from load instructions to CDP instructions and
from CDP instructions to CDP instructions.
The VFP11 coprocessor does not forward in the following cases:
• from an instruction that produces integer data
• to a store instruction, FST, FSTM, MRC, or MRRC
• to an instruction of different precision.
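The rules above can be summarized as a small predicate. The following Python sketch is illustrative only: the `Instr` record, its field names, and the `can_forward` function are hypothetical and are not part of any ARM tool or of the VFP11 hardware description. It encodes only the cases listed in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    # Hypothetical instruction description, used only for this sketch.
    kind: str                     # "load", "cdp", "store", "mrc", or "mrrc"
    precision: str                # "single" or "double"
    integer_result: bool = False  # True if the instruction produces integer data


def can_forward(producer: Instr, consumer: Instr) -> bool:
    """Return True if the rules listed above permit forwarding."""
    # Only load and CDP instructions are described as forwarding sources.
    if producer.kind not in ("load", "cdp"):
        return False
    # No forwarding from an instruction that produces integer data.
    if producer.integer_result:
        return False
    # Only CDP instructions receive forwarded data; stores, MRC, and MRRC do not.
    if consumer.kind != "cdp":
        return False
    # No forwarding to an instruction of different precision.
    return producer.precision == consumer.precision
```

For instance, `can_forward(Instr("cdp", "single"), Instr("cdp", "single"))` is true, while any pairing whose consumer is a store, or whose precisions differ, is false.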
In the examples that follow, the stall counts given are based on two data transfer assumptions:
• accesses by load operations result in cache hits and are able to deliver one or two data
words per cycle
• store operations write directly to the write buffer or cache and can transfer one or two data
words per cycle.
When these assumptions are valid, the VFP11 coprocessor operates at its highest performance.
When these assumptions are not valid, load and store operations are affected by the delay
required to access data. Example 21-1, Example 21-2 and Example 21-3 illustrate the
capabilities of the VFP11 coprocessor in ideal conditions.
In Example 21-1, the second FADDS instruction depends on the result of the first FADDS
instruction. The result of the first FADDS instruction is forwarded, reducing the stall from eight
cycles to seven cycles.
Example 21-1 Data forwarded to dependent instruction
FADDS S1, S2, S3
FADDS S8, S9, S1
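As a worked check of the arithmetic in Example 21-1, the sketch below applies the one-cycle reduction described at the start of this section. The function name `stall_with_forwarding` and the choice to pass the unforwarded stall as a parameter are assumptions made for illustration; the eight-cycle figure is the one quoted above.

```python
def stall_with_forwarding(unforwarded_stall: int, forwarded: bool) -> int:
    """Stall seen by the dependent instruction, in cycles.

    Forwarding is modeled as removing one cycle from the stall, which is
    the general rule stated in this section. The result never goes below 0.
    """
    if forwarded:
        return max(unforwarded_stall - 1, 0)
    return unforwarded_stall


# Example 21-1: FADDS to FADDS, same precision, result forwarded.
print(stall_with_forwarding(8, forwarded=True))   # 7
# The same dependency if forwarding were not available.
print(stall_with_forwarding(8, forwarded=False))  # 8
```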
In Example 21-2, the double-precision FMULD result in D2 is not forwarded to the
single-precision FADDS that reads S5, even though S5 is the upper half of D2.
Example 21-2 Mixed-precision data not forwarded
FMULD D2, D0, D1
FADDS S12, S13, S5
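The dependency in Example 21-2 exists because each double-precision register overlays a pair of single-precision registers. The sketch below computes that mapping, assuming the standard VFP register bank layout in which Dn occupies S(2n) and S(2n+1). The helper names `single_halves` and `overlaps` are hypothetical.

```python
from typing import Tuple


def single_halves(d_index: int) -> Tuple[int, int]:
    """Return the (lower, upper) single-precision register numbers of Dn."""
    return 2 * d_index, 2 * d_index + 1


def overlaps(d_index: int, s_index: int) -> bool:
    """True if single-precision register S<s_index> lies inside D<d_index>."""
    return s_index in single_halves(d_index)


print(single_halves(2))  # (4, 5): S5 is the upper half of D2
print(overlaps(2, 5))    # True: FADDS reading S5 depends on FMULD writing D2
```

The overlap creates the dependency, but because the producer is double-precision and the consumer is single-precision, the result is not forwarded.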
In Example 21-3, the double-precision FSTD stalls for eight cycles until the result of the
FMULD is written to the register file. No forwarding is done from the FMULD to the store
instruction.
Example 21-3 Data not forwarded to store instruction
FMULD D1, D2, D3
FSTD D1, [Rx]