Software Pipelining
Example 6–29. Assembly Code for Floating-Point Dot Product (Software Pipelined With No Extraneous Loads) (Continued)
        ADDSP   .L1X    A8,B8,A0        ; sum(0) = sum0(0) + sum1(0)
        ADDSP   .L2X    A8,B8,B0        ; sum(1) = sum0(1) + sum1(1)
        ADDSP   .L1X    A8,B8,A0        ; sum(2) = sum0(2) + sum1(2)
        ADDSP   .L2X    A8,B8,B0        ; sum(3) = sum0(3) + sum1(3)
        NOP                             ; wait for B0
        ADDSP   .L1X    A0,B0,A5        ; sum(01) = sum(0) + sum(1)
        NOP                             ; wait for next B0
        ADDSP   .L2X    A0,B0,B5        ; sum(23) = sum(2) + sum(3)
        NOP     3
        ADDSP   .L1X    A5,B5,A4        ; sum = sum(01) + sum(23)
        NOP     3                       ;
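The comments in this part of the listing describe a pairwise reduction. ADDSP has three delay slots, so the accumulators A8 and B8 appear to carry four in-flight partial sums each (sum0(0..3) and sum1(0..3)), which are folded in three stages. The C sketch below models only that arithmetic, not the instruction scheduling; the function name and the array representation of the in-flight sums are assumptions made for illustration and do not come from the original example.

```c
/* Hypothetical model of the final reduction shown in the listing.
 * sum0[i] and sum1[i] stand for the i-th in-flight partial sums that
 * the listing reads from A8 and B8 (an assumption for illustration). */
float reduce_partial_sums(const float sum0[4], const float sum1[4])
{
    float s[4];

    /* Stage 1: sum(i) = sum0(i) + sum1(i), i = 0..3 */
    for (int i = 0; i < 4; ++i)
        s[i] = sum0[i] + sum1[i];

    /* Stage 2: sum(01) = sum(0) + sum(1), sum(23) = sum(2) + sum(3) */
    float s01 = s[0] + s[1];
    float s23 = s[2] + s[3];

    /* Stage 3: sum = sum(01) + sum(23) */
    return s01 + s23;
}
```

Floating-point addition is not associative, so the sketch keeps the same grouping as the assembly comments rather than summing the eight values left to right.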