© Koninklijke Philips Electronics N.V. 2006. All rights reserved.
User manual
Rev. 01 — 12 January 2006
39
4.1 Introduction
The MAM block in the LPC2101/02/03 maximizes the performance of the ARM processor
when it is running code in flash memory using a single flash bank.
4.2 Operation
Simply put, the Memory Accelerator Module (MAM) attempts to have the next ARM
instruction that will be needed in its latches in time to prevent CPU fetch stalls. The
LPC2101/02/03 uses one bank of Flash memory, compared to the two banks used on
predecessor devices. It includes three 128-bit buffers called the Prefetch Buffer, the
Branch Trail Buffer and the Data Buffer. When an Instruction Fetch is not satisfied by
either the Prefetch or Branch Trail buffer, nor has a prefetch been initiated for that line, the
ARM is stalled while a fetch is initiated for the 128-bit line. If a prefetch has been initiated
but not yet completed, the ARM is stalled for a shorter time. Unless aborted by a data
access, a prefetch is initiated as soon as the Flash has completed the previous access.
The prefetched line is latched by the Flash module, but the MAM does not capture the line
in its prefetch buffer until the ARM core presents the address from which the prefetch has
been made. If the core presents a different address from the one from which the prefetch
has been made, the prefetched line is discarded.
The Prefetch and Branch Trail Buffers each include four 32-bit ARM instructions or eight
16-bit Thumb instructions. During sequential code execution, typically the prefetch buffer
contains the current instruction and the entire Flash line that contains it.
The MAM uses the LPROT[0] line to differentiate between instruction and data accesses.
Code and data accesses use separate 128-bit buffers. 3 of every 4 sequential 32-bit code
or data accesses "hit" in the buffer without requiring a Flash access (7 of 8 sequential
16-bit accesses, 15 of every 16 sequential byte accesses). The fourth (eighth, 16th)
sequential data access must access Flash, aborting any prefetch in progress. When a
Flash data access is concluded, any prefetch that had been in progress is re-initiated.
Timing of Flash read operations is programmable and is described later in this section.
In this manner, there is no code fetch penalty for sequential instruction execution when the
CPU clock period is greater than or equal to one fourth of the Flash access time. The
average amount of time spent doing program branches is relatively small (less than 25%)
and may be minimized in ARM (rather than Thumb) code through the use of the
conditional execution feature present in all ARM instructions. This conditional execution
may often be used to avoid small forward branches that would otherwise be necessary.
Branches and other program flow changes cause a break in the sequential flow of
instruction fetches described above. The Branch Trail Buffer captures the line to which
such a non-sequential break occurs. If the same branch is taken again, the next
instruction is taken from the Branch Trail Buffer. When a branch outside the contents of
the Prefetch and Branch Trail Buffer is taken, a stall of several clocks is needed to load the
Branch Trail Buffer. Subsequently, there will typically be no further instruction fetch delays
until a new and different branch occurs.
UM10161
Chapter 4: Memory Acceleration Module (MAM)
Rev. 01 — 12 January 2006
User manual