Loop Carry Paths
Optimizing Assembly Code via Linear Assembly
6.7.8 Final Assembly
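Example 6–46 shows the final assembly for the IIR filter. With one load of y[0]
outside the loop, no other loads from the y array are needed. Example 6–46
requires 408 cycles: (4 × 100) + 8. The four-cycle loop body executes 100 times,
and the eight execute packets ahead of the LOOP label account for the remaining
eight cycles.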
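The cycle count follows the pattern (cycles per iteration × iterations) + overhead. As a minimal sketch, the helper below reproduces that arithmetic in C. The function name and parameters are hypothetical and are not part of the original example.

```c
/* Hypothetical helper: estimates the total cycles of a software-pipelined
 * loop as (cycles per iteration * iterations) + fixed overhead.
 * For the IIR filter in this section the values are 4, 100, and 8. */
long iir_cycle_estimate(long cycles_per_iter, long iterations, long overhead)
{
    return cycles_per_iter * iterations + overhead;
}
```

Calling iir_cycle_estimate(4, 100, 8) yields 408, which matches the count quoted for the IIR filter.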
Example 6–46. Assembly Code for IIR Filter
          LDH   .D1    *A4++,A2      ; x[i]
          LDH   .D1    *A4,A3        ; x[i+1]
          LDH   .D2    *B4++,B2      ; load y[0] outside of loop
          MVK   .S1    100,A1        ; set up loop counter
          LDH   .D1    *A4++,A2      ;* x[i]

   [A1]   SUB   .L1    A1,1,A1       ; decrement loop counter
||        MPY   .M1    A6,A2,A5      ; c1 * x[i]
||        LDH   .D1    *A4,A3        ;* x[i+1]

          MPY   .M1X   B6,A3,A7      ; c2 * x[i+1]
|| [A1]   B     .S1    LOOP          ; branch to loop

          MPY   .M2X   A8,B2,B3      ; c3 * y[i]

LOOP:
          ADD   .L1    A5,A7,A9      ; c1 * x[i] + c2 * x[i+1]
||        LDH   .D1    *A4++,A2      ;** x[i]

          ADD   .L2X   B3,A9,B5      ; c1 * x[i] + c2 * x[i+1] + c3 * y[i]
|| [A1]   SUB   .L1    A1,1,A1       ;* decrement loop counter
||        MPY   .M1    A6,A2,A5      ;* c1 * x[i]
||        LDH   .D1    *A4,A3        ;** x[i+1]

          SHR   .S2    B5,15,B2      ; y[i+1]
||        MPY   .M1X   B6,A3,A7      ;* c2 * x[i+1]
|| [A1]   B     .S1    LOOP          ;* branch to loop

          STH   .D2    B2,*B4++      ; store y[i+1]
||        MPY   .M2X   A8,B2,B3      ;* c3 * y[i]
          ; Branch occurs here
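To make the data flow of the listing easier to follow, here is a minimal C sketch of the computation as described by the assembly comments: y[i+1] = (c1 * x[i] + c2 * x[i+1] + c3 * y[i]) >> 15. The function name, signature, and fixed-width types are assumptions for illustration. The sketch also assumes that the inputs are scaled so the 32-bit sum does not overflow and that >> on a negative value is an arithmetic shift, which is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical reference model of the IIR loop (not from the original text).
 * x must hold n + 1 samples; y must hold n + 1 entries with y[0] preset.
 * c1, c2, c3 correspond to the coefficients held in A6, B6, and A8. */
void iir_ref(const int16_t *x, int16_t *y,
             int16_t c1, int16_t c2, int16_t c3, int n)
{
    /* y[0] is read once before the loop, as in the assembly. */
    int16_t y_prev = y[0];

    for (int i = 0; i < n; i++) {
        /* Three 16 x 16-bit multiplies accumulated in 32 bits.
         * Assumption: the sum fits in 32 bits for the given inputs. */
        int32_t acc = (int32_t)c1 * x[i]
                    + (int32_t)c2 * x[i + 1]
                    + (int32_t)c3 * y_prev;

        /* Shift right by 15 and keep the low 16 bits (SHR, then STH). */
        y_prev = (int16_t)(acc >> 15);

        /* The stored value is reused directly as y[i] in the next
         * iteration, so the y array is never reloaded. */
        y[i + 1] = y_prev;
    }
}
```

Keeping the previous output in a local variable mirrors how the assembly keeps yi in register B2 across iterations, which is why only the single load of y[0] is required.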