Software Pipelining
Floating-Point Example
Table 6–8 shows a fully pipelined schedule for the floating-point dot product
example.
Table 6–8. Modulo Iteration Interval Table for Floating-Point Dot Product
(After Software Pipelining)
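             | Loop Prolog                                                                                              |
Unit / Cycle | 0    | 1      | 2       | 3        | 4         | 5          | 6           | 7            | 8             | 9, 10, 11...
----------------------------------------------------------------------------------------------------------------------------------------
.D1          | LDDW | * LDDW | ** LDDW | *** LDDW | **** LDDW | ***** LDDW | ****** LDDW | ******* LDDW | ******** LDDW | ********* LDDW
.D2          | LDDW | * LDDW | ** LDDW | *** LDDW | **** LDDW | ***** LDDW | ****** LDDW | ******* LDDW | ******** LDDW | ********* LDDW
.M1          |      |        |         |          |           | MPYSP      | * MPYSP     | ** MPYSP     | *** MPYSP     | **** MPYSP
.M2          |      |        |         |          |           | MPYSP      | * MPYSP     | ** MPYSP     | *** MPYSP     | **** MPYSP
.L1          |      |        |         |          |           |            |             |              |               | ADDSP
.L2          |      |        |         |          |           |            |             |              |               | ADDSP
.S1          |      |        |         | SUB      | * SUB     | ** SUB     | *** SUB     | **** SUB     | ***** SUB     | ****** SUB
.S2          |      |        |         |          | B         | * B        | ** B        | *** B        | **** B        | ***** B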
Note:
The asterisks indicate the iteration of the loop; shading (the rightmost column) indicates the single-cycle loop.
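The rightmost column in Table 6–8 is a single-cycle loop that contains the
entire loop: every instruction of the loop body issues in that one cycle, each
on behalf of a different iteration. Cycles 0–8 are loop setup code, or loop
prolog.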
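The layout of Table 6–8 can be reproduced with a short sketch. It assumes only the first-issue cycle of each instruction as read from the table (LDDW at cycle 0, SUB at 3, B at 4, MPYSP at 5, ADDSP at 9) and an iteration interval of one cycle. The names FIRST_ISSUE, LOOP_CYCLE, cell, and schedule_rows are hypothetical and exist only for this illustration.

```python
# First-issue cycle of each instruction, read from Table 6-8.
# Assumption of this sketch: the paired units (.D1/.D2, .M1/.M2, .L1/.L2)
# share the same cycle, so one entry per instruction is enough.
FIRST_ISSUE = {"LDDW": 0, "SUB": 3, "B": 4, "MPYSP": 5, "ADDSP": 9}
LOOP_CYCLE = 9  # first cycle of the single-cycle loop (rightmost column)


def cell(instr, cycle):
    """Return the table cell for instr at cycle ('' if not yet issued).

    With an iteration interval of one cycle, an instruction first issued
    at cycle s is working on iteration (cycle - s), drawn as asterisks.
    """
    iteration = cycle - FIRST_ISSUE[instr]
    if iteration < 0:
        return ""
    return ("*" * iteration + " " + instr).strip()


def schedule_rows():
    """Build one row per instruction covering cycles 0 through LOOP_CYCLE."""
    return {
        instr: [cell(instr, c) for c in range(LOOP_CYCLE + 1)]
        for instr in FIRST_ISSUE
    }


if __name__ == "__main__":
    for instr, cells in schedule_rows().items():
        print(f"{instr:6}", " | ".join(f"{c:14}" for c in cells))
```

Every cell before an instruction's first-issue cycle stays empty, which is why the prolog fills in gradually from the loads toward the adds.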
The asterisks show which iteration of the loop each instruction is executing
in a given cycle. For example, the rightmost column shows that on any given
cycle inside the loop:
- The ADDSP instructions are adding data for iteration n.
- The MPYSP instructions are multiplying data for iteration n + 4 (****).
- The LDDW instructions are loading data for iteration n + 9 (*********).
- The SUB instruction is executing for iteration n + 6 (******).
- The B instruction is executing for iteration n + 5 (*****).
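The iteration offsets listed above follow directly from the first-issue cycles in Table 6–8. The sketch below is a minimal illustration under that assumption: an instruction first issued at cycle s runs that many cycles before cycle 9, so inside the loop it is (9 - s) iterations ahead of ADDSP. The names FIRST_ISSUE, kernel_offsets, and iterations_in_flight are hypothetical.

```python
# First-issue cycle of each instruction, read from Table 6-8.
FIRST_ISSUE = {"LDDW": 0, "SUB": 3, "B": 4, "MPYSP": 5, "ADDSP": 9}


def kernel_offsets(first_issue, reference="ADDSP"):
    """Return how many iterations each instruction runs ahead of reference.

    With a one-cycle iteration interval, the offset is the difference
    between the reference's first-issue cycle and the instruction's own.
    """
    ref_cycle = first_issue[reference]
    return {instr: ref_cycle - start for instr, start in first_issue.items()}


def iterations_in_flight(n, first_issue=FIRST_ISSUE):
    """Return the iteration each instruction handles while ADDSP handles n."""
    return {instr: n + off for instr, off in kernel_offsets(first_issue).items()}


if __name__ == "__main__":
    for instr, off in kernel_offsets(FIRST_ISSUE).items():
        print(f"{instr:6} n + {off}  ({'*' * off})")
```

For instance, under these assumptions, while the ADDSP instructions add data for iteration 10, the LDDW instructions are already loading data for iteration 19.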