[Pipeline diagram. Columns: Instruction | F1 | F2 | D1 | D2 | R1 | R2 | E | W | FPU pipeline: R1 | R2 | E1 | E2 | E3 | Comments]

Instructions shown in the diagram:
    I1: MPYF32 R6H, R5H, R0H
        || MOV32 *XAR7++, R4H
    I2: F32TOUI16R R3H, R4H
    I3: ADDF32 R3H, R2H, R0H
        || MOV32 *--SP, R2H
    I4: MOV32 @XAR3, R6H

Comments in the diagram:
    (STALL) I4 samples the result as it enters the R2 phase, but I1 is stalled in E3 and is unable to forward the product of R5H*R0H to I4 (R6H does not have the product yet due to a design bug). So, I4 reads the old value of R6H.
    There is no change in the pipeline as it was stalled in the previous cycle. I4 had already sampled the old value of R6H in the previous cycle.
    Stall over.
Usage Notes and Known Design Exceptions to Functional Specifications
SPRZ412K – December 2013 – Revised February 2020
Copyright © 2013–2020, Texas Instruments Incorporated
TMS320F2837xD Dual-Core MCUs Silicon Revisions C, B, A, 0
Figure 5 shows the pipeline diagram of the issue when there is a stall in the E3 slot of
instruction I1.
Figure 5. Pipeline Diagram of the Issue if There is a Stall in the E3 Slot of the Instruction I1
Workaround(s)
Treat MPYF32, ADDF32, SUBF32, and MACF32 in this scenario as 3p-cycle
instructions: three NOPs or non-conflicting instructions must be placed in the delay slots
that follow the instruction.
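As a minimal sketch of this rule in its simplest form, the sequence below fills all three delay slots with NOPs before the FPU register read. The registers and addressing are borrowed from the example in this advisory purely for illustration, and the sketch assumes that the affected scenario applies to this sequence.

```asm
        MPYF32  R6H, R5H, R0H      ; treated as a 3p-cycle instruction that writes to R6H
||      MOV32   *XAR7++, R4H
        NOP                        ; delay slot 1
        NOP                        ; delay slot 2
        NOP                        ; delay slot 3
        MOV32   @XAR3, R6H         ; FPU register read of R6H
```

Any of the NOPs can be replaced with an instruction that does not conflict with R6H, which avoids spending the delay slots on idle cycles.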
The C28x Code Generation Tools v6.2.0 and later both generate the correct
instruction sequence and detect the error in assembly code. In the earlier versions v6.0.5
(for the 6.0.x branch) and v6.1.2 (for the 6.1.x branch), the compiler generates the
correct instruction sequence, but the assembler does not detect the error in assembly
code.
Example of Workaround:
    MPYF32 R6H, R5H, R0H       ; 3p FPU instruction that writes to R6H
    || MOV32 *XAR7++, R4H
    F32TOUI16R R3H, R4H        ; delay slot
    ADDF32 R2H, R2H, R0H       ; delay slot
    || MOV32 *--SP, R2H
    NOP                        ; alignment cycle
    MOV32 @XAR3, R6H           ; FPU register read of R6H
shows the pipeline diagram with the workaround in place.