Loop Unrolling
Optimizing Assembly Code via Linear Assembly
6.9.4 Determining the Minimum Iteration Interval
With 16 instructions, the minimum iteration interval is at least 3, because a
maximum of six instructions can execute in parallel (16/6 rounds up to 3). The
allocation possibilities are as follows:
- LDH must be on a .D unit.
- SHL, B, and MVK must be on a .S unit.
- The ADDs and SUB can be on a .S, .L, or .D unit.
- The AND can be on a .S or .L unit, or .D unit (’C64x only).
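The instruction-count bound stated above can be checked with a few lines of Python. This is a minimal sketch, not part of the original tool flow: the function name resource_bound is hypothetical, and the value of six available units assumes, as the text states, that at most six instructions can issue in parallel for this loop.

```python
import math


def resource_bound(num_instructions: int, units_available: int) -> int:
    """Smallest iteration interval that gives every instruction a unit slot.

    Each unit can start one instruction per cycle, so ii cycles provide
    ii * units_available slots in total.
    """
    return math.ceil(num_instructions / units_available)


# 16 instructions in the unrolled loop, at most 6 issued in parallel.
mii = resource_bound(16, 6)
print(mii)  # 3
```

This bound only counts instructions. It does not yet account for instructions that are restricted to a particular unit, which is what the resource table is for.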
From Table 6–20, you can see that no one resource is used more than three
times, so the minimum iteration interval is still 3.
Checking the total number of non-.M instructions on each side shows that a
total of nine instructions per side can be performed with the minimum iteration
interval of 3 (three non-.M units, each used once per cycle for three cycles).
Because the A side has nine non-.M instructions and the B side has only seven,
the minimum iteration interval is still 3.
Table 6–20. Resource Table for Unrolled If-Then-Else Code

(a) A side

Unit(s)               Instructions      Total/Unit
.M1                                     0
.S1                   MVK and 2 SHLs    3
.D1                   2 LDHs            2
.L1                   CMPEQ             1
.L1 or .S1            AND               1
.L1, .S1, or .D1      ADD and SUB       2
Total non-.M units                      9

(b) B side

Unit(s)               Instructions      Total/Unit
.M2                                     0
.S2                   MVK and B         2
.L2                   CMPEQ             1
.L2 or .S2            AND               1
.L2, .S2, or .D2      SUB and 2 ADDs    3
Total non-.M units                      7
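The two checks made against Table 6–20 (no fixed unit used more than three times, and no side holding more non-.M instructions than its three non-.M units can issue in three cycles) can be sketched as follows. The dictionaries transcribe the counts from the table. The names a_side, b_side, and side_bound are hypothetical and chosen only for this illustration.

```python
import math

# Instruction counts from Table 6-20, keyed by the units each row may use.
a_side = {
    (".S1",): 3,                # MVK and 2 SHLs
    (".D1",): 2,                # 2 LDHs
    (".L1",): 1,                # CMPEQ
    (".L1", ".S1"): 1,          # AND
    (".L1", ".S1", ".D1"): 2,   # ADD and SUB
}
b_side = {
    (".S2",): 2,                # MVK and B
    (".L2",): 1,                # CMPEQ
    (".L2", ".S2"): 1,          # AND
    (".L2", ".S2", ".D2"): 3,   # SUB and 2 ADDs
}


def side_bound(rows):
    """Return (lower bound on the iteration interval, total instructions).

    The bound is the larger of:
      - the heaviest load on any row tied to a single unit, and
      - the side total spread over the three non-.M units (.S, .L, .D).
    """
    fixed = max(n for units, n in rows.items() if len(units) == 1)
    total = sum(rows.values())
    return max(fixed, math.ceil(total / 3)), total


print(side_bound(a_side))  # (3, 9)
print(side_bound(b_side))  # (3, 7)
```

Both sides give a bound of 3, which agrees with the text. Note that this is only a lower bound. It does not prove that a legal assignment of the flexible instructions exists.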
6.9.5 Linear Assembly Resource Allocation
Now that the graph is split and you know the minimum iteration interval, you
can allocate functional units and registers to the instructions. You must ensure
no resource is used more than three times.
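The constraint that no unit is used more than three times can be explored with a small backtracking search. The following is a rough sketch under stated assumptions: the instruction labels (SHL_1, LDH_2, and so on) and the function allocate are hypothetical, the unit choices come from the A side of Table 6–20, and the result is only one legal assignment, not necessarily the one used in Example 6–54. The sketch also ignores register allocation and cycle-by-cycle scheduling.

```python
# A-side instructions and the units each may use (from Table 6-20).
# The _1/_2 suffixes are illustrative labels, not assembler syntax.
A_SIDE = [
    ("MVK",   (".S1",)),
    ("SHL_1", (".S1",)),
    ("SHL_2", (".S1",)),
    ("LDH_1", (".D1",)),
    ("LDH_2", (".D1",)),
    ("CMPEQ", (".L1",)),
    ("AND",   (".L1", ".S1")),
    ("ADD",   (".L1", ".S1", ".D1")),
    ("SUB",   (".L1", ".S1", ".D1")),
]


def allocate(instructions, ii):
    """Assign each instruction to an allowed unit, at most ii per unit.

    Returns a dict mapping unit name to a list of instruction labels,
    or None if no assignment satisfies the limit.
    """
    # Place the most constrained instructions (fewest unit choices) first.
    items = sorted(instructions, key=lambda item: len(item[1]))
    load = {}

    def place(i):
        if i == len(items):
            return True
        name, units = items[i]
        for unit in units:
            slots = load.setdefault(unit, [])
            if len(slots) < ii:
                slots.append(name)
                if place(i + 1):
                    return True
                slots.pop()  # undo and try the next unit
        return False

    return load if place(0) else None


result = allocate(A_SIDE, 3)
for unit, names in sorted(result.items()):
    print(unit, names)
```

With an iteration interval of 2 the same call returns None, because the three instructions restricted to .S1 cannot fit in two slots.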
Example 6–54 shows the linear assembly code with the functional units and
registers.