94
IBM z13s Technical Guide
Figure 3-10 shows how OOO core execution can reduce the run time of a program.
Figure 3-10 z13s in-order and out-of-order core execution improvements
The left side of Figure 3-10 shows an in-order core execution. Instruction 2 has a large delay
because of an L1 cache miss, and the next instructions wait until Instruction 2 finishes. In the
usual in-order execution, the next instruction waits until the previous instruction finishes.
With OOO core execution, which is shown on the right side of Figure 3-10, Instruction 4 can
start its storage access and run while Instruction 2 is waiting for data. This is possible
only if no dependencies exist between the two instructions. When the L1 cache miss is
resolved, Instruction 2 can also start running while Instruction 4 is still running. Instruction 5
might need the same storage data that Instruction 4 requires. As soon as this data is in the L1
cache, Instruction 5 starts running at the same time. The z13 superscalar PU core can have
up to 10 instructions/operations running per cycle. This technology results in a shorter run
time.
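The timing difference between the two execution models can be sketched with a toy scheduler. This is a minimal illustration, not a model of the z13s core: the instruction latencies, the dependency lists, and the names `Instr`, `in_order_finish_times`, and `out_of_order_finish_times` are all assumptions made for the example, and the out-of-order model ignores issue-width limits and in-order completion.

```python
from dataclasses import dataclass, field

@dataclass
class Instr:
    name: str
    latency: int                                # cycles, including any cache-miss delay
    deps: list = field(default_factory=list)    # names of earlier instructions it waits for

def in_order_finish_times(program):
    """Each instruction starts only after the previous one has finished."""
    finish, clock = {}, 0
    for ins in program:
        clock += ins.latency
        finish[ins.name] = clock
    return finish

def out_of_order_finish_times(program):
    """Each instruction starts as soon as all of its dependencies have finished."""
    finish = {}
    for ins in program:  # dependencies always point to earlier instructions
        start = max((finish[d] for d in ins.deps), default=0)
        finish[ins.name] = start + ins.latency
    return finish

# Hypothetical seven-instruction program; I2 stands in for the L1 cache miss.
program = [
    Instr("I1", 1),
    Instr("I2", 10),
    Instr("I3", 1, ["I2"]),
    Instr("I4", 3),
    Instr("I5", 1, ["I4"]),
    Instr("I6", 1, ["I3"]),
    Instr("I7", 1),
]

in_order_total = max(in_order_finish_times(program).values())
ooo_total = max(out_of_order_finish_times(program).values())
print(in_order_total, ooo_total)  # prints "18 12"
```

In this sketch, the independent instructions I4, I5, and I7 complete while I2 is still waiting for data, so only the chain that depends on I2 determines the total run time.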
Branch prediction
If the branch prediction logic of the microprocessor makes the wrong prediction, removing all
instructions in the parallel pipelines might be necessary. A wrong branch prediction is
expensive in a high-frequency processor design. Therefore, the branch prediction techniques
that are used are important to avoid as many wrong predictions as possible.
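The cost of wrong predictions can be estimated with simple arithmetic. The following sketch uses a common textbook approximation, and every number in it (base cycles per instruction, share of branches, misprediction rates, flush penalty) is a hypothetical value chosen only to show the trend, not a z13s measurement.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Average cycles per instruction once misprediction flushes are included."""
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles

# Hypothetical workload: 20% branches, 20-cycle pipeline flush per wrong prediction.
good_predictor = effective_cpi(1.0, 0.2, 0.02, 20)   # 2% of branches mispredicted
poor_predictor = effective_cpi(1.0, 0.2, 0.10, 20)   # 10% of branches mispredicted
print(f"{good_predictor:.2f} vs {poor_predictor:.2f}")  # prints "1.08 vs 1.40"
```

Under these assumptions, the flush penalty is multiplied by every wrong prediction, so a lower misprediction rate translates directly into fewer lost cycles.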
For this reason, various history-based branch prediction mechanisms are used, as shown on
the in-order part of the z13s PU core logical diagram in Figure 3-9 on page 93. The branch
target buffer (BTB) runs ahead of instruction cache pre-fetches to prevent branch misses at
an early stage. Furthermore, a branch history table (BHT), in combination with a pattern
history table (PHT) and tagged multi-target prediction technology, offers a high branch
prediction success rate.
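The general idea behind history-based and pattern-based prediction can be illustrated with a generic textbook scheme. The sketch below is not the z13s implementation: the two-bit saturating counters, the table layouts, the two-outcome history length, and the class names `HistoryTablePredictor` and `PatternTablePredictor` are all assumptions made for illustration.

```python
class TwoBitCounter:
    """Saturating counter: states 0-1 predict not taken, states 2-3 predict taken."""
    def __init__(self, state=1):
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)

class HistoryTablePredictor:
    """BHT-style sketch: one counter per branch address."""
    def __init__(self):
        self.table = {}

    def predict(self, addr):
        return self.table.setdefault(addr, TwoBitCounter()).predict()

    def update(self, addr, taken):
        self.table.setdefault(addr, TwoBitCounter()).update(taken)

class PatternTablePredictor:
    """PHT-style sketch: counters indexed by address plus recent outcomes."""
    def __init__(self, history_bits=2):
        self.history_bits = history_bits
        self.history = {}   # addr -> tuple of the most recent outcomes
        self.table = {}     # (addr, history) -> counter

    def _key(self, addr):
        return (addr, self.history.get(addr, (False,) * self.history_bits))

    def predict(self, addr):
        return self.table.setdefault(self._key(addr), TwoBitCounter()).predict()

    def update(self, addr, taken):
        key = self._key(addr)
        self.table.setdefault(key, TwoBitCounter()).update(taken)
        self.history[addr] = key[1][1:] + (taken,)   # shift in the newest outcome

def accuracy(predictor, addr, outcomes):
    """Fraction of outcomes predicted correctly, training after each prediction."""
    hits = 0
    for taken in outcomes:
        hits += predictor.predict(addr) == taken
        predictor.update(addr, taken)
    return hits / len(outcomes)

# A branch that alternates taken / not taken, at a hypothetical address.
alternating = [True, False] * 50
bht_acc = accuracy(HistoryTablePredictor(), 0x1000, alternating)
pht_acc = accuracy(PatternTablePredictor(), 0x1000, alternating)
print(bht_acc, pht_acc)
```

In this toy run, the per-address counter keeps flipping and mispredicts the alternating branch, while the pattern-indexed table learns the sequence after a few iterations. This is one way to see why combining several history-based tables can raise the success rate.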
The z13s microprocessor improves the branch prediction throughput by using the new branch
prediction and instruction fetch front end.
[Figure 3-10: two panels, "In-order core execution" and "Out-of-order core execution", each plotting Instructions 1 to 7 against Time. Legend: Execution, Storage access, Dependency, L1 miss. Callouts: Faster millicode execution, Better Instruction Delivery.]
International Technical Support Organization, IBM z13s Technical Guide, June 2016, SG24-8294-00
ISBN 0738441678