4-26
Intel® PXA27x Processor Family
Optimization Guide
Intel XScale® Microarchitecture & Intel® Wireless MMX™ Technology Optimization
The following code sequence illustrates the set-up process for an unaligned array access. The
procedure loads one of the Intel® Wireless MMX™ Technology general-purpose registers
(wCGR1 in this example) with the lower 3 bits of the address pointer and then clears those LSBs
so that subsequent accesses to the array fall on a 64-bit boundary.
; r0 -> pointer to misaligned array
MOV r5, #7
AND r7, r0, r5          ; r7 = lower 3 bits of the pointer (byte offset)
TMCR wCGR1, r7          ; copy the byte offset into wCGR1
SUB r0, r0, r7          ; r0 = psrc, now 64-bit aligned
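The arithmetic performed by the set-up sequence can be checked on a host machine with a small C model. This is only a sketch: the names `align_setup_t` and `align_setup` are hypothetical and do not come from the guide, and the model works on a plain integer address rather than on processor registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the two values the set-up sequence produces. */
typedef struct {
    uintptr_t aligned_base; /* models r0 after the SUB: 64-bit aligned address */
    unsigned  byte_offset;  /* models r7, the value copied into wCGR1 (0..7)   */
} align_setup_t;

/* Hypothetical helper mirroring the AND and SUB steps on an integer address. */
static align_setup_t align_setup(uintptr_t addr)
{
    align_setup_t s;
    s.byte_offset  = (unsigned)(addr & (uintptr_t)7); /* AND r7, r0, r5 */
    s.aligned_base = addr - s.byte_offset;            /* SUB r0, r0, r7 */
    assert((s.aligned_base & (uintptr_t)7) == 0);     /* on a 64-bit boundary */
    return s;
}
```

For example, an address of 0x1003 splits into an aligned base of 0x1000 and a byte offset of 3.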
Following the initial setup for alignment, the data can now be accessed, aligned, and presented to
the execution resources.
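WLDRD wR0, [r0]           ; load the aligned doubleword at r0
WLDRD wR1, [r0,#8]        ; load the next aligned doubleword
WALIGNR1 wR4, wR0, wR1    ; extract 64 bits at the byte offset held in wCGR1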
In the above code sequence it is also necessary to interleave additional operations to avoid the
back-to-back WLDRD and load-to-use penalties.
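The effect of the WALIGNR1 step can be illustrated with a host-side C model. The sketch below assumes little-endian byte order and that the first source operand (wR0 above) holds the lower-addressed doubleword. The function name `walignr_model` is hypothetical, and the model describes only the data result, not instruction timing or the penalties mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the alignment extraction: returns the 64 bits that
 * start byte_offset bytes into the 128-bit value formed by hi:lo.
 *   lo          - lower-addressed aligned doubleword (wR0 in the listing)
 *   hi          - next aligned doubleword            (wR1 in the listing)
 *   byte_offset - lower 3 bits of the original pointer (value in wCGR1)
 */
static uint64_t walignr_model(uint64_t lo, uint64_t hi, unsigned byte_offset)
{
    assert(byte_offset < 8u);
    if (byte_offset == 0u) {
        return lo;                 /* avoids an undefined 64-bit shift by 64 */
    }
    unsigned shift = byte_offset * 8u;
    return (lo >> shift) | (hi << (64u - shift));
}
```

With `lo = 0x0706050403020100`, `hi = 0x0F0E0D0C0B0A0908`, and a byte offset of 3, the model returns `0x0A09080706050403`, that is, bytes 3 through 10 of the combined data.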
4.5 Porting Existing Intel® MMX™ Technology Code to Intel® Wireless MMX™ Technology
The re-use of existing Intel® MMX™ Technology code is encouraged, since it can significantly
accelerate the mapping of algorithms to Intel® Wireless MMX™ Technology. The Intel® MMX™
Technology target pipeline and architecture differ from those of Intel® Wireless MMX™
Technology, so several changes are required for optimal mapping. The algorithms may require
some re-design; attention to the following aspects makes the task more manageable:
• Data width – Intel® MMX™ Technology uses different designators for data types:
— Packed words for 16-bit operands; Intel® Wireless MMX™ Technology uses halfword (H)
— Packed double words for 32-bit operands; Intel® Wireless MMX™ Technology uses word (W)
— Quadwords for 64-bit operands; Intel® Wireless MMX™ Technology uses doubleword (D)
• Instruction latencies – Instruction latencies are different with Intel® Wireless MMX™
Technology. The scheduling of instructions may need to be altered.
• Instruction pairing – Intel® MMX™ Technology interleaves with x86 instructions to reduce
stalls. The pairing of instructions may need to be altered in some cases on Intel® Wireless
MMX™ Technology.
• Operand alignment – Doubleword (DWORD) load/store requires 64-bit alignment. The pointers
must be on a 64-bit boundary to avoid an exception.
• Memory latency – Memory latency for the PXA27x processor differs from that of existing
Intel® MMX™ Technology implementations.
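Two of the items above, data width and operand alignment, can be checked mechanically while porting. The following C sketch is a hypothetical porting aid, not part of any Intel tool. The helper names `wmmx_designator`, `is_doubleword_aligned`, and `as_doubleword_ptr` are assumptions made for illustration, and only the three operand widths listed above are covered.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: size designator used by Intel Wireless MMX Technology
 * for an operand width in bits. Returns 0 for widths not listed above. */
static char wmmx_designator(unsigned bits)
{
    switch (bits) {
    case 16: return 'H'; /* MMX packed word        -> halfword   */
    case 32: return 'W'; /* MMX packed double word -> word       */
    case 64: return 'D'; /* MMX quadword           -> doubleword */
    default: return 0;
    }
}

/* Hypothetical helper: nonzero when p lies on a 64-bit boundary. */
static int is_doubleword_aligned(const void *p)
{
    return ((uintptr_t)p & (uintptr_t)7) == 0;
}

/* Hypothetical debug guard: traps in debug builds when a pointer intended
 * for doubleword load/store is not 64-bit aligned. */
static const uint64_t *as_doubleword_ptr(const void *p)
{
    assert(is_doubleword_aligned(p));
    return (const uint64_t *)p;
}
```

A guard of this kind is useful mainly in debug builds, where it reports a misaligned pointer at the call site rather than as an exception on the target.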