PowerPC e500 Core Family Reference Manual, Rev. 1
Freescale Semiconductor
Core Complex Overview
organized as 128 rows with 4-way set associativity, that holds the address and target
instruction of the 512 most-recently taken branches.
Table 1-5 lists BTB instructions.
1.5 Instruction Flow
The e500 core is a pipelined, superscalar processor with parallel execution units that allow
instructions to execute out of order but record their results in order. Pipelining breaks instruction
processing into discrete stages, so multiple instructions in an instruction sequence can occupy the
successive stages: as an instruction completes one stage, it passes to the next, leaving the previous
stage available to a subsequent instruction. So, even though it may take multiple cycles for an
instruction to pass through all of the pipeline stages, once a pipeline is full, the interval between
successive instruction completions is much shorter than the latency of a single instruction.
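The difference between latency and throughput can be made concrete with a little arithmetic. The sketch below assumes an idealized pipeline with no stalls in which every stage takes one cycle. The stage count is a hypothetical parameter chosen for illustration, not a figure taken from this section.

```python
def pipeline_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles for an ideal, stall-free pipeline to finish a run of instructions.

    The first instruction needs num_stages cycles (its latency); each later
    instruction leaves the last stage one cycle after its predecessor.
    """
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles if each instruction had to finish all stages before the next began."""
    return num_instructions * num_stages


if __name__ == "__main__":
    stages = 5  # hypothetical stage count, used only for this example
    for n in (1, 10, 100):
        piped = pipeline_cycles(n, stages)
        serial = unpipelined_cycles(n, stages)
        print(f"{n:>3} instructions: {piped:>3} cycles pipelined, {serial:>3} unpipelined")
```

Under these assumptions, a long instruction sequence approaches one completed instruction per cycle even though each individual instruction still spends the full number of stages in flight.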
A superscalar processor is one that issues multiple independent instructions into separate
execution units, allowing parallel execution. The e500 core has five execution units, one each for
branch (BU), load/store (LSU), and multiple-cycle operations (MU), and two for simple arithmetic
operations (SU1 and SU2). The MU and SU1 arithmetic execution units also execute 64-bit SPE
vector instructions, using both the lower and upper halves of the 64-bit GPRs.
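The assignment of instruction classes to execution units described above can be summarized as a small lookup. This is only an illustrative sketch: the unit names come from the text, but the class labels and the candidate_units helper are hypothetical and do not model the real issue logic.

```python
# Unit names come from the text; the class labels are hypothetical.
UNITS = {
    "branch": ["BU"],
    "load_store": ["LSU"],
    "multi_cycle": ["MU"],
    "simple_arith": ["SU1", "SU2"],
}


def candidate_units(instr_class: str, spe_vector: bool = False) -> list:
    """Return the execution units that could accept an instruction of this class."""
    units = UNITS[instr_class]
    if spe_vector:
        # Per the text, only MU and SU1 also execute 64-bit SPE vector instructions.
        units = [u for u in units if u in ("MU", "SU1")]
    return units


if __name__ == "__main__":
    print(candidate_units("simple_arith"))
    print(candidate_units("simple_arith", spe_vector=True))
    print(candidate_units("multi_cycle", spe_vector=True))
```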
The parallel execution units allow multiple instructions to execute in parallel and out of order. For
example, a low-latency addition instruction that is issued to an SU after an integer divide is issued
to the MU should finish executing before the higher latency divide instruction. The add instruction
can make its results available to a subsequent instruction, but it cannot update the architected GPR
specified as its target operand ahead of the multiple-cycle divide instruction.
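The divide-then-add example can be sketched as a toy model in which instructions finish executing out of order but update architected state in program order. The issue cycles and latencies below are hypothetical values chosen for illustration, and the model assumes that an instruction may complete in the same cycle as its predecessor.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    issue_cycle: int
    latency: int  # hypothetical execution latency in cycles

    @property
    def finish_cycle(self) -> int:
        """Cycle in which the execution unit produces the result."""
        return self.issue_cycle + self.latency


def completion_cycles(program):
    """Return (name, finish_cycle, complete_cycle) for each instruction.

    `program` is in program order. An instruction cannot complete (update
    its architected target register) before every earlier instruction has.
    """
    results = []
    prev_complete = 0
    for ins in program:
        complete = max(ins.finish_cycle, prev_complete)
        results.append((ins.name, ins.finish_cycle, complete))
        prev_complete = complete
    return results


if __name__ == "__main__":
    program = [
        Instr("divide (MU)", issue_cycle=0, latency=11),  # hypothetical latency
        Instr("add (SU)", issue_cycle=1, latency=1),      # hypothetical latency
    ]
    for name, finish, complete in completion_cycles(program):
        print(f"{name}: finishes in cycle {finish}, completes in cycle {complete}")
```

In this model the add finishes long before the divide, so its result could be forwarded to a dependent instruction, but it does not complete until the divide has.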
1.5.1 Initial Instruction Fetch
The e500 core begins execution at fixed virtual address 0xFFFF_FFFC. The MMU has a default
page translation which maps this to the identical physical address. So, the instruction at physical
address 0xFFFF_FFFC must be a branch to another address within the 4-Kbyte boot page.
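The boot-page constraint is easy to check numerically. The sketch below derives the bounds of the 4-Kbyte page that contains the reset address and tests whether a branch target falls inside it. The encoding helper assumes the standard PowerPC I-form unconditional branch layout (primary opcode 18, AA = 0, LK = 0), which this section does not describe, and the target address 0xFFFF_F000 is a hypothetical choice.

```python
RESET_VECTOR = 0xFFFF_FFFC  # fixed address of the first instruction fetch
PAGE_SIZE = 4 * 1024        # 4-Kbyte boot page


def boot_page_bounds(addr: int = RESET_VECTOR, page_size: int = PAGE_SIZE):
    """Return (first, last) byte address of the page containing addr."""
    base = addr & ~(page_size - 1)
    return base, base + page_size - 1


def in_boot_page(target: int) -> bool:
    """True if target lies in the same 4-Kbyte page as the reset vector."""
    first, last = boot_page_bounds()
    return first <= target <= last


def encode_relative_branch(from_addr: int, to_addr: int) -> int:
    """Encode `b to_addr` placed at from_addr (assumed I-form, AA=0, LK=0)."""
    disp = to_addr - from_addr
    if disp % 4 != 0:
        raise ValueError("branch displacement must be word aligned")
    if not -(1 << 25) <= disp < (1 << 25):
        raise ValueError("displacement does not fit the 26-bit signed field")
    return (18 << 26) | (disp & 0x03FF_FFFC)


if __name__ == "__main__":
    first, last = boot_page_bounds()
    print(f"boot page: 0x{first:08X} to 0x{last:08X}")
    target = 0xFFFF_F000  # hypothetical entry point at the start of the page
    print("target in boot page:", in_boot_page(target))
    print(f"branch word: 0x{encode_relative_branch(RESET_VECTOR, target):08X}")
```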
1.5.2 Branch Detection and Prediction
To improve branch performance, the e500 provides implementation-specific dynamic branch
prediction using the BTB to resolve branch instructions and improve the accuracy of branch
predictions. Each of the 512 entries in the 4-way set associative address cache of branch target
addresses includes a 2-bit saturating branch history counter, whose value is incremented or
decremented depending on whether the branch was taken. These bits can take on four values
Table 1-5. BTB Locking APU Instructions

Name                                    Mnemonic   Syntax
Branch Buffer Load Entry and Lock Set   bblels     —
Branch Buffer Entry Lock Reset          bbelr      —
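The 2-bit saturating branch history counter described in Section 1.5.2 can be sketched as follows. This is a minimal model under two assumptions that the text above does not state: the counter is incremented when the branch is taken and decremented otherwise, and the upper two counter values predict taken. The class and method names are hypothetical.

```python
BTB_ROWS = 128
BTB_WAYS = 4
BTB_ENTRIES = BTB_ROWS * BTB_WAYS  # 512 entries, matching the text


class SaturatingCounter2:
    """2-bit counter that sticks at 0 and 3 instead of wrapping around."""

    def __init__(self, value: int = 0):
        if not 0 <= value <= 3:
            raise ValueError("a 2-bit counter holds values 0 through 3")
        self.value = value

    def update(self, taken: bool) -> None:
        # Assumption: increment on a taken branch, decrement otherwise.
        if taken:
            self.value = min(3, self.value + 1)
        else:
            self.value = max(0, self.value - 1)

    def predict_taken(self) -> bool:
        # Assumption: values 2 and 3 predict taken, 0 and 1 predict not taken.
        return self.value >= 2


if __name__ == "__main__":
    counter = SaturatingCounter2()
    history = [True, True, True, False, True, False, False]
    for taken in history:
        prediction = counter.predict_taken()
        counter.update(taken)
        print(f"predicted taken={prediction}, actual taken={taken}, counter={counter.value}")
```

Because the counter saturates, a single mispredicted iteration of an otherwise regular branch moves the counter by only one step and does not immediately flip the prediction.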