Memory and the Instruction Cache
4.3 Instruction Cache
A 64 × 32-bit instruction cache speeds instruction fetches and lowers system
cost by caching program fetches from external memory. The instruction cache
allows the use of slow external memories while still achieving single-cycle access
performance. This reduces the number of off-chip accesses necessary and
allows code to be stored off-chip in slower, lower-cost memories. The cache
also frees external buses from program fetches so that they can be used by
the DMA or other system elements.
The cache can operate automatically, with no user intervention. Subsection
4.3.2 describes a form of the least recently used (LRU) cache update algorithm.
4.3.1 Instruction-Cache Architecture
The instruction cache (see Figure 4–12) contains 64 32-bit words of RAM; it
is divided into two 32-word segments. A 19-bit segment start address (SSA)
register is associated with each segment. For each word in the cache, there
is a corresponding single-bit present (P) flag.
When the CPU requests an instruction word from external memory, the cache
algorithm checks to determine if the word is already contained in the instruction
cache. Figure 4–11 shows how the cache-control algorithm partitions an
instruction address. The algorithm uses the 19 most significant bits (MSBs) of
the instruction address to select the segment; the five least significant bits
(LSBs) define the address of the instruction word within the pertinent segment.
The algorithm compares the 19 MSBs of the instruction address with the two
SSA registers. If there is a match, the algorithm checks the relevant P flag. The
P flag indicates if a word within a particular segment is already present in cache
memory:
- P = 1: the word is already present in cache memory
- P = 0: the cache location is invalid
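The check described above can be sketched in C as follows. This is only an illustrative software model, not the device's actual logic: the type names (CacheSegment, LookupResult), the function cache_lookup, and the outcome labels CACHE_HIT, WORD_MISS, and SEGMENT_MISS are hypothetical names chosen for this example. The field widths (19-bit SSA, 5-bit word index, two 32-word segments) come from the text.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_SEGMENTS      2
#define WORDS_PER_SEGMENT 32

/* Hypothetical model of one cache segment. */
typedef struct {
    uint32_t ssa;                        /* 19-bit segment start address */
    bool     present[WORDS_PER_SEGMENT]; /* one P flag per cached word   */
    uint32_t word[WORDS_PER_SEGMENT];    /* cached instruction words     */
} CacheSegment;

/* Hypothetical labels for the three possible outcomes of a lookup. */
typedef enum {
    CACHE_HIT,    /* an SSA matches and the word's P flag is 1 */
    WORD_MISS,    /* an SSA matches but the word's P flag is 0 */
    SEGMENT_MISS  /* neither SSA register matches              */
} LookupResult;

/* Compare the 19 MSBs of a 24-bit address with both SSA registers,
   then check the P flag selected by the 5 LSBs. On a match, the
   matching segment number is stored in *matched_segment; otherwise
   -1 is stored. */
static LookupResult cache_lookup(const CacheSegment seg[NUM_SEGMENTS],
                                 uint32_t address, int *matched_segment)
{
    uint32_t ssa   = (address >> 5) & 0x7FFFFu; /* bits 23-5 */
    uint32_t index = address & 0x1Fu;           /* bits 4-0  */

    for (int s = 0; s < NUM_SEGMENTS; s++) {
        if (seg[s].ssa == ssa) {
            *matched_segment = s;
            return seg[s].present[index] ? CACHE_HIT : WORD_MISS;
        }
    }
    *matched_segment = -1;
    return SEGMENT_MISS;
}
```

In this model, a segment whose SSA is 0x00001 with P flag 5 set reports a hit for address 0x000025 and a word miss for address 0x000026.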
Figure 4–11. Address Partitioning for Cache Control Algorithm
Bits 23–5: Segment start address (SSA)
Bits 4–0: Instruction word address within segment
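The partitioning in Figure 4–11 amounts to a shift and a mask. The following C sketch shows one way to express it; the helper names ssa_of and word_index_of and the macro names are hypothetical and chosen only for illustration, while the bit positions are those given in the figure.

```c
#include <stdint.h>

/* Field widths from the text: 24-bit instruction address,
   19-bit segment start address (SSA), 5-bit word index. */
#define WORD_INDEX_BITS 5u
#define WORD_INDEX_MASK ((1u << WORD_INDEX_BITS) - 1u) /* 0x1F      */
#define ADDRESS_MASK    0xFFFFFFu                      /* bits 23-0 */

/* Hypothetical helper: bits 23-5 of the address (the SSA field). */
static uint32_t ssa_of(uint32_t address)
{
    return (address & ADDRESS_MASK) >> WORD_INDEX_BITS;
}

/* Hypothetical helper: bits 4-0 of the address (word within segment). */
static uint32_t word_index_of(uint32_t address)
{
    return address & WORD_INDEX_MASK;
}
```

For example, instruction address 0x000025 splits into SSA 0x00001 and word index 5.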
If there is no match, one of the segments must be replaced by the new data. The
segment replaced in this circumstance is determined by the LRU algorithm. The
LRU stack (see Figure 4–12) is maintained for this purpose.
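With only two segments, an LRU stack reduces to an ordered pair of segment numbers. The C sketch below models segment replacement under stated assumptions: the type and function names are hypothetical, and both the clearing of the replaced segment's P flags and its promotion to most recently used are assumptions made for this example rather than behavior stated here. Subsection 4.3.2 describes the actual update algorithm.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_SEGMENTS      2
#define WORDS_PER_SEGMENT 32

/* Hypothetical model of one cache segment (data words omitted). */
typedef struct {
    uint32_t ssa;                        /* 19-bit segment start address */
    bool     present[WORDS_PER_SEGMENT]; /* one P flag per cached word   */
} CacheSegment;

/* Hypothetical model of the cache with its LRU stack. */
typedef struct {
    CacheSegment seg[NUM_SEGMENTS];
    int lru_stack[NUM_SEGMENTS]; /* [0] = most recently used,
                                    [1] = least recently used */
} InstructionCache;

/* Move segment s to the top of the LRU stack. */
static void touch_segment(InstructionCache *c, int s)
{
    if (c->lru_stack[0] != s) {
        c->lru_stack[1] = c->lru_stack[0];
        c->lru_stack[0] = s;
    }
}

/* Handle the no-match case: replace the least recently used segment
   and return its number. */
static int replace_lru_segment(InstructionCache *c, uint32_t address)
{
    int victim = c->lru_stack[NUM_SEGMENTS - 1];

    /* Load the 19 MSBs of the 24-bit address as the new SSA. */
    c->seg[victim].ssa = (address >> 5) & 0x7FFFFu;

    /* Assumption: the old contents belong to a different segment,
       so every P flag is cleared. */
    memset(c->seg[victim].present, 0, sizeof c->seg[victim].present);

    /* Assumption: the replaced segment becomes most recently used. */
    touch_segment(c, victim);
    return victim;
}
```

Starting with segment 0 as most recently used, a fetch from an unmatched address replaces segment 1; a second unmatched fetch then replaces segment 0.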