Sun Microelectronics
UltraSPARC User’s Manual
The static bit provided by the BPcc and FBPfcc instructions initializes the state
machine to either the likely taken or the likely not taken state (Figure 16-6).
For branches without prediction (Bicc, FBfcc), UltraSPARC initializes the state
machine to likely not taken. Note that a branch initialized to likely taken does
not produce a correct next field for the immediately following I-Cache fetch,
because it takes one extra cycle to generate the correct address (the branch
offset added to the PC). This results in two lost cycles for fetching
instructions, which does not necessarily lead to a pipeline stall. This penalty
is much less than the mispredicted branch penalty (4 cycles) that would occur if
the branch prediction bit were always ignored and a static prediction were used
(for example, always taken). Figure 16-6 shows the state machine for the branch
prediction algorithm. (Note: This figure is identical to Figure A-15.)
Figure 16-6   Dynamic Branch Prediction State Diagram
For loops in steady state, the algorithm requires two mispredictions before the
prediction changes from taken to not taken. Each loop exit thus causes a single
misprediction (versus two for a one-bit dynamic scheme).
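The loop behavior described above can be illustrated with a small simulation. This is a minimal sketch, not the UltraSPARC implementation: the exact arcs of Figure 16-6 are not reproduced here, so the sketch assumes a conventional two-bit saturating counter over the four states (SNT, LNT, LT, ST), which is consistent with the "two mispredictions to change from taken to not taken" property. All function names and the loop trace are hypothetical and chosen only for illustration.

```python
from enum import IntEnum


class State(IntEnum):
    SNT = 0  # Strongly Not Taken
    LNT = 1  # Likely Not Taken
    LT = 2   # Likely Taken
    ST = 3   # Strongly Taken


def initial_state(static_taken_hint=None):
    """BPcc/FBPfcc: the static bit selects LT or LNT.
    Bicc/FBfcc (no hint, None): initialized to LNT."""
    return State.LT if static_taken_hint else State.LNT


def predict_taken(state):
    # LT and ST predict taken; LNT and SNT predict not taken.
    return state >= State.LT


def update(state, taken):
    # ASSUMPTION: plain saturating counter; the real arcs are in Figure 16-6.
    if taken:
        return State(min(state + 1, State.ST))
    return State(max(state - 1, State.SNT))


def two_bit_mispredictions(outcomes, state):
    misses = 0
    for taken in outcomes:
        if predict_taken(state) != taken:
            misses += 1
        state = update(state, taken)
    return misses


def one_bit_mispredictions(outcomes, last_taken):
    # One-bit scheme for comparison: predict whatever happened last time.
    misses = 0
    for taken in outcomes:
        if last_taken != taken:
            misses += 1
        last_taken = taken
    return misses


# Hypothetical loop-closing branch: taken 9 times, not taken at loop exit,
# with the whole loop executed 3 times.
loop_trace = ([True] * 9 + [False]) * 3

print(two_bit_mispredictions(loop_trace, State.ST))
print(one_bit_mispredictions(loop_trace, True))
```

Under these assumptions, the two-bit predictor mispredicts only at each loop exit, because a single not-taken outcome moves it from ST to LT and it still predicts taken on re-entry. The one-bit predictor mispredicts at the exit and again on the first iteration of the next execution of the loop.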
16.2.6.1 Impact of the Annulled Slot
Grouping rules in Chapter 17, “Grouping Rules and Stalls,” describe how
UltraSPARC handles instructions following an annulling branch. The key things
to keep in mind regarding these instructions are:
1. Avoid scheduling multicycle instructions in the delay slot (for example,
   IMUL and IDIV).
[Figure 16-6 state diagram: states ST, LT, LNT, and SNT; arcs labeled PT/AT,
PT/ANT, PNT/AT, and PNT/ANT; "Initialization" entry arc]

PT:  Predicted Taken       PNT: Predicted Not Taken
AT:  Actual Taken          ANT: Actual Not Taken
ST:  Strongly Taken        SNT: Strongly Not Taken
LT:  Likely Taken          LNT: Likely Not Taken