4-22
Intel® PXA27x Processor Family
Optimization Guide
Intel XScale® Microarchitecture & Intel® Wireless MMX™ Technology Optimization
or, in C-code,
for (i = 0; i < N; i++) {
    s = 0;
    for (j = 0; j < T; j++) {
        s += a[j]*x[i-j];
    }
    y[i] = round(s);
}
The WMAC instruction is used for this calculation; it performs four parallel 16-bit by 16-bit multiplications with accumulation. The first level of unrolling follows directly from the four-way SIMD instruction used to implement the filter.
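As a rough illustration of the four-way multiply-accumulate described above, the following scalar C sketch models the arithmetic for signed 16-bit operands. The name wmac_model is hypothetical and is not an Intel intrinsic; the 64-bit accumulator width is an assumption made for this sketch, and the function models only the arithmetic, not the instruction's timing or register layout.

```c
#include <stdint.h>

/* Scalar model of a four-way signed 16-bit multiply-accumulate.
 * wmac_model is a hypothetical helper name used for illustration only.
 * Assumption: the accumulator is 64 bits wide. */
int64_t wmac_model(int64_t acc, const int16_t a[4], const int16_t b[4])
{
    for (int lane = 0; lane < 4; lane++) {
        /* A 16-bit by 16-bit signed product always fits in 32 bits;
         * widen to 64 bits before adding it to the accumulator. */
        acc += (int64_t)((int32_t)a[lane] * (int32_t)b[lane]);
    }
    return acc;
}
```

One call to this model corresponds to the four taps that a single SIMD multiply-accumulate would cover, which is why the filter loop is unrolled by four.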
The C-code for the real block FIR filter is rewritten to show that four taps are computed in each loop iteration.
for (i = 0; i < N; i++) {
    s0 = 0;
    /* four taps per iteration; T is assumed to be a multiple of 4 */
    for (j = 0; j < T; j += 4) {
        s0 += a[j]*x[i+j];
        s0 += a[j+1]*x[i+j+1];
        s0 += a[j+2]*x[i+j+2];
        s0 += a[j+3]*x[i+j+3];
    }
    y[i] = round(s0);
}
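The four-tap unrolling can be checked against a one-tap-per-iteration loop with a small, self-contained C sketch. The function names fir_ref and fir_unrolled4 are hypothetical, as are the parameter names. The sketch assumes 16-bit samples and taps, a tap count that is a multiple of four, and an input buffer of at least n_out + n_taps - 1 samples. Both functions use the forward x[i+j] indexing of the unrolled listing, and the rounding step is left out so that the raw accumulators can be compared directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference form: one tap per inner-loop iteration.
 * x must hold at least n_out + n_taps - 1 samples. */
void fir_ref(const int16_t *a, const int16_t *x, int64_t *y,
             size_t n_out, size_t n_taps)
{
    for (size_t i = 0; i < n_out; i++) {
        int64_t s = 0;
        for (size_t j = 0; j < n_taps; j++) {
            s += (int32_t)a[j] * (int32_t)x[i + j];
        }
        y[i] = s;
    }
}

/* Unrolled form: four taps per inner-loop iteration, mirroring the
 * work that one four-way SIMD multiply-accumulate would perform. */
void fir_unrolled4(const int16_t *a, const int16_t *x, int64_t *y,
                   size_t n_out, size_t n_taps)
{
    assert(n_taps % 4 == 0);  /* assumption: no remainder taps to handle */
    for (size_t i = 0; i < n_out; i++) {
        int64_t s0 = 0;
        for (size_t j = 0; j < n_taps; j += 4) {
            s0 += (int32_t)a[j]     * (int32_t)x[i + j];
            s0 += (int32_t)a[j + 1] * (int32_t)x[i + j + 1];
            s0 += (int32_t)a[j + 2] * (int32_t)x[i + j + 2];
            s0 += (int32_t)a[j + 3] * (int32_t)x[i + j + 3];
        }
        y[i] = s0;
    }
}
```

Calling both functions on the same taps and input should give identical outputs, since unrolling changes only how the loop is organised and not the sum being computed.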
The direct assembly-code implementation of the inner loop shows clearly that optimum execution has not been achieved. The following code sequence has several undesirable stalls: the back-to-back WLDRD instructions incur a 1-cycle stall, and the load-to-use penalty incurs a 3-cycle stall. In addition, the loop overhead is high, with 2 cycles consumed for every four taps.
; Pointers r0 -> val, r1 -> pResult, r2 -> pTapsQ15, r3 -> tapsLen
WZERO wR15
Loop_Begin:
WLDRD wR0, [r2], #8
WLDRD wR1, [r4], #8
$$ y(n) = \sum_{i=0}^{L-1} c(i) \cdot x(n-i), \qquad 0 \le n \le N-1 $$
Intel PXA27x Processor Family Optimization Guide, Order Number 280004 001, April 2004