PowerPC e500 Core Family Reference Manual, Rev. 1
4-6
Freescale Semiconductor
Execution Timing
The widths of the execution units shown in Figure 4-1 and Figure 4-2 indicate whether a unit can
execute instructions with 64-bit operands. LSU, MU, and SU1 have upper and lower halves. Scalar
instructions use only the lower halves and update GPR bits 32–63.
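
The bit range above can be illustrated with a minimal Python sketch. It assumes the Power-style bit numbering in which bit 0 is the most significant bit of a 64-bit GPR, so that bits 32–63 are the low-order word. It also assumes that a scalar write leaves bits 0–31 unchanged, which the text above does not state. The names write_scalar_result and gpr_bit are hypothetical helpers, not part of any e500 tool or API.

```python
LOWER_MASK = 0x0000_0000_FFFF_FFFF  # GPR bits 32-63 (low-order word)
UPPER_MASK = 0xFFFF_FFFF_0000_0000  # GPR bits 0-31 (high-order word)


def write_scalar_result(gpr64, result32):
    """Model a scalar write that replaces only GPR bits 32-63.

    Assumption of this sketch: bits 0-31 are left unchanged.
    """
    return (gpr64 & UPPER_MASK) | (result32 & LOWER_MASK)


def gpr_bit(gpr64, n):
    """Return bit n of a 64-bit GPR, where bit 0 is the most significant."""
    return (gpr64 >> (63 - n)) & 1
```

Under these assumptions, writing a 32-bit result into a register changes only its low-order word, and gpr_bit(value, 63) returns the least significant bit.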
Some instructions, such as loads and stores, access memory and require additional clock cycles
between the execute and write-back phases. Latencies may be greater if the access is to
noncacheable memory, causes a TLB miss, misses in the L1 cache, generates a write-back to
memory, causes a snoop hit from another device that generates additional activity, or encounters
other conditions that affect memory accesses.
The e500 can complete as many as two instructions on each clock cycle.
The instruction pipeline stages are described as follows:
• Instruction fetch: Includes the clock cycles necessary to request an instruction and the
time the memory system takes to respond to the request. Fetched instructions are latched
into the instruction queue (IQ) for consideration by the dispatcher.
The fetcher tries to initiate a fetch in every cycle in which it is guaranteed that the IQ has
room for fetched instructions. Instructions are typically fetched from the L1 instruction
cache; if caching is disabled, instructions are fetched from the instruction line fill buffer
(ILFB), shown in Figure 4-8. Likewise, on a cache miss, as many as four instructions can
be forwarded to the fetch unit from the line-fill buffer as the cache line is passed to the
instruction cache.
Fetch timing depends on several factors, such as whether an instruction is in the on-chip
instruction cache or an L2 cache (if implemented). More factors apply when instructions
must be fetched from system memory, including the processor-to-bus clock ratio, the
amount of bus traffic, and whether any cache coherency operations are required.
Fetch timing is also affected by whether effective address translation is available in a TLB,
as described in Section 4.3.2.1, “L1 and L2 TLB Access Times.”
• Decode/dispatch: This stage fully decodes each instruction. Most instructions are dispatched
to the issue queues, but isync, rfi, sc, nops, and some others are not. Every dispatched
instruction is assigned a GPR rename register and a CR field rename register, even if it does
not specify a GPR or CR operand. There is a pair of GPR/CRF rename registers for each CQ
entry (even for instructions that do not access the CR or GPRs).
The two issue queues, the branch issue queue (BIQ) and the general issue queue (GIQ), can
accept as many as one and two instructions per cycle, respectively. Instruction dispatch
requires the following:
— Instructions dispatch only from IQ0 and IQ1.
— As many as two instructions can be dispatched per clock cycle.
— Space must be available in the CQ for an instruction to decode and dispatch.
In this chapter, dispatch is treated as an event at the end of the decode stage. Dispatch
dependencies are described in Section 4.7.2, “Dispatch Unit Resource Requirements.”
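
The dispatch requirements listed above can be sketched as a small per-cycle model in Python. This is an illustration under stated assumptions, not a description of the hardware: it assumes dispatch is in program order (IQ1 cannot dispatch if IQ0 stalls), that branches go to the BIQ and all other instructions go to the GIQ, and that the issue queues always have free entries. Instructions that are not sent to an issue queue (isync, rfi, sc, nops) are not modeled. The names Instr, dispatch_cycle, and cq_free are hypothetical.

```python
from collections import deque
from dataclasses import dataclass

# Per-cycle limits taken from the text above.
MAX_DISPATCH_PER_CYCLE = 2  # only IQ0 and IQ1 are dispatch candidates
BIQ_ACCEPT_PER_CYCLE = 1    # BIQ accepts at most one instruction per cycle
GIQ_ACCEPT_PER_CYCLE = 2    # GIQ accepts at most two instructions per cycle


@dataclass
class Instr:
    name: str
    is_branch: bool = False  # assumption: branches use the BIQ, others the GIQ


def dispatch_cycle(iq, cq_free):
    """Return the instructions dispatched in one cycle, in program order.

    iq:      deque of Instr, where iq[0] is IQ0 and iq[1] is IQ1
    cq_free: number of free CQ entries at the start of the cycle
    """
    dispatched = []
    biq_used = 0
    giq_used = 0
    while iq and len(dispatched) < MAX_DISPATCH_PER_CYCLE:
        if cq_free - len(dispatched) <= 0:
            break  # no CQ entry left for the next instruction
        instr = iq[0]
        if instr.is_branch:
            if biq_used == BIQ_ACCEPT_PER_CYCLE:
                break  # BIQ cannot accept a second instruction this cycle
            biq_used += 1
        else:
            if giq_used == GIQ_ACCEPT_PER_CYCLE:
                break  # GIQ is at its per-cycle limit
            giq_used += 1
        dispatched.append(iq.popleft())  # assumption: in-order dispatch
    return dispatched
```

In this model, two back-to-back branches dispatch in separate cycles because the BIQ accepts only one instruction per cycle, and nothing dispatches when cq_free is zero.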