Intel® PXA27x Processor Family
Optimization Guide
2-3
Microarchitecture Overview
2.2.2.1 ARM* V5TE Instruction Execution
The pipeline diagram uses arrows to show the possible flow of instructions in the pipeline. Instruction
execution flows from the F1 pipestage to the RF pipestage. The RF pipestage issues a single
instruction to either the X1 pipestage or the MAC unit (multiply instructions go to the MAC, while
all others continue to X1). Because only one instruction issues per cycle, either M1 or X1 is idle in
any given cycle.
After calculating the effective addresses in X1, all load and store instructions route to the memory
pipeline.
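The routing rule described above can be written down as a small C sketch. This is an illustrative model only, not code from the guide: the type and function names (insn_class, issue_unit, rf_issue_target, routes_to_memory_pipeline) are hypothetical, and the instruction classes are a coarse grouping chosen for the example.

```c
#include <stdbool.h>

/* Hypothetical coarse instruction classes, for illustration only. */
typedef enum {
    INSN_ALU,
    INSN_MULTIPLY,
    INSN_LOAD,
    INSN_STORE,
    INSN_BRANCH
} insn_class;

/* The two places the RF pipestage can issue to, per the text. */
typedef enum { UNIT_X1, UNIT_MAC } issue_unit;

/* RF issues one instruction per cycle: multiplies go to the MAC,
 * everything else continues to X1. */
static issue_unit rf_issue_target(insn_class c)
{
    return (c == INSN_MULTIPLY) ? UNIT_MAC : UNIT_X1;
}

/* Loads and stores compute their effective address in X1 and then
 * route to the memory pipeline. */
static bool routes_to_memory_pipeline(insn_class c)
{
    return c == INSN_LOAD || c == INSN_STORE;
}
```

Since rf_issue_target returns exactly one unit per instruction, the model makes the idle-unit consequence explicit: whichever unit is not returned has nothing issued to it in that cycle.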
The ARM* V5TE branch and exchange (BX) instruction (used to branch between ARM* and
THUMB* code) causes the entire pipeline to be flushed. If the processor is in THUMB* mode, the
ID pipestage dynamically expands each THUMB* instruction into a normal ARM* V5TE RISC
instruction and normal execution resumes.
2.2.2.2 Pipeline Stalls
Pipeline stalls can seriously degrade performance. The primary reasons for stalls are register
dependencies, load dependencies, multiple-cycle instruction latency, and unpredictable branches.
To help maximize performance, it is important to understand some of the ways to avoid pipeline
stalls. The following sections provide more detail on the nature of the pipeline and ways of
preventing stalls.
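As a hedged illustration of one stall source named above (a register dependency on a multiple-cycle result), the C sketch below contrasts a dot product that feeds every multiply into a single accumulator with one that alternates between two independent accumulators. The function names are hypothetical, and the passage gives no latency figures, so whether the second form is actually faster depends on the compiler and on the multiply latencies of the target; treat it as a pattern to measure, not a guaranteed gain.

```c
#include <stddef.h>
#include <stdint.h>

/* Every iteration depends on the previous value of acc, so each
 * multiply-accumulate must wait for the one before it. */
static int32_t dot_single_acc(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * b[i];
    }
    return acc;
}

/* Two independent accumulators: consecutive multiply-accumulates
 * no longer depend on each other, which gives the scheduler room
 * to overlap them. */
static int32_t dot_two_acc(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t acc0 = 0;
    int32_t acc1 = 0;
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        acc0 += (int32_t)a[i] * b[i];
        acc1 += (int32_t)a[i + 1] * b[i + 1];
    }
    if (i < n) {
        /* Odd element count: handle the last element. */
        acc0 += (int32_t)a[i] * b[i];
    }
    return acc0 + acc1;
}
```

Both functions return the same sum for inputs that do not overflow a 32-bit accumulator; only the dependency structure between iterations differs.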
2.2.3 Main Execution Pipeline
2.2.3.1 F1 / F2 (Instruction Fetch) Pipestages
The job of the instruction fetch stages F1 and F2 is to present the next instruction to be executed to
the ID stage. Two important functional units residing within the F1 and F2 stages are the BTB and
IFU.
• Branch Target Buffer (BTB)
The BTB provides a 128-entry dynamic branch prediction buffer. An entry in the BTB is
created when a B or BL branch instruction is taken for the first time. On subsequent executions
of the branch instruction at the same address, the next instruction loaded into the pipeline is
predicted by the BTB. Once the branch instruction reaches the X1 pipestage, its target
address is known. Execution continues without stalling if the target address is the same as the
BTB predicted address. If the address is different from the address that the BTB predicted, the
pipeline is flushed, execution starts at the new target address, and the branch’s history is
updated in the BTB.
• Instruction Fetch Unit (IFU)
The IFU is responsible for delivering instructions to the instruction decode (ID) pipestage. It
delivers one instruction word each cycle (if possible) to the ID. The instruction could come
from one of two sources: instruction cache or fetch buffers.
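The BTB behavior described above (allocate on the first taken B or BL, predict on later fetches, flush on a mismatch at X1) can be sketched as a toy C model. Only the 128-entry size comes from the text. The indexing scheme, the entry layout, and the reduction of "history" to simply overwriting or dropping an entry are assumptions made for the example, and all names (btb_model, btb_predict, btb_resolve) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define BTB_ENTRIES 128u  /* entry count stated in the text */

/* Assumed entry layout: the real hardware encoding is not given here. */
typedef struct {
    bool     valid;
    uint32_t branch_addr;  /* address of the B/BL instruction */
    uint32_t target;       /* target recorded when it was last taken */
} btb_entry;

typedef struct {
    btb_entry entries[BTB_ENTRIES];
} btb_model;

/* Assumed indexing: word address modulo the number of entries. */
static unsigned btb_index(uint32_t pc)
{
    return (unsigned)((pc >> 2) % BTB_ENTRIES);
}

/* Fetch time: predict the next fetch address for the instruction at pc.
 * Without a matching entry, fetch simply continues sequentially. */
static uint32_t btb_predict(const btb_model *btb, uint32_t pc)
{
    const btb_entry *e = &btb->entries[btb_index(pc)];
    if (e->valid && e->branch_addr == pc) {
        return e->target;
    }
    return pc + 4u;
}

/* X1 time: the real outcome is known. Returns true when the prediction
 * was wrong, that is, when the model would flush the pipeline. */
static bool btb_resolve(btb_model *btb, uint32_t pc, bool taken, uint32_t target)
{
    uint32_t predicted = btb_predict(btb, pc);
    uint32_t actual = taken ? target : pc + 4u;
    btb_entry *e = &btb->entries[btb_index(pc)];

    if (taken) {
        /* Create the entry on the first taken branch, or refresh it. */
        e->valid = true;
        e->branch_addr = pc;
        e->target = target;
    } else if (e->valid && e->branch_addr == pc) {
        /* Simplification: drop the entry instead of updating history bits. */
        e->valid = false;
    }
    return predicted != actual;
}
```

In this model, a loop branch at a fixed address mispredicts the first time it is taken, then predicts correctly on each later iteration until the loop exits and the branch falls through.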