
Volume 1, Part 1: Introduction to the Intel® Itanium® Architecture
2.8 Branching
In addition to removing branches through the use of predication, several mechanisms
are provided to decrease the branch misprediction rate and the cost of the remaining
mispredicted branches. These mechanisms provide ways for the compiler to
communicate information about branch conditions to the processor.
Branch predict instructions are provided which can be used to communicate an early
indication of the target address and the location of the branch. The compiler will try to
indicate whether a branch should be predicted dynamically or statically. The processor
can use this information to initialize branch prediction structures, enabling good
prediction even the first time a branch is encountered. This is beneficial for
unconditional branches or in situations where the compiler has information about likely
branch behavior.
For indirect branches, a branch register holds the target address. Branch
predict instructions indicate which register will be used in situations
where the target address can be computed early. A branch predict instruction can also
signal that an indirect branch is a procedure return, enabling the efficient use of
call/return stack prediction structures.
Special loop-closing branches are provided to accelerate counted loops and
modulo-scheduled loops. These branches and their associated branch predict
instructions provide information that allows for perfect prediction of loop termination,
thereby eliminating costly mispredict penalties and reducing loop overhead.
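The idea behind predicting loop termination from a loop count can be sketched in a few lines. This is only an illustrative model, not the processor's actual mechanism: the function name `run_counted_loop`, the variable `lc` (standing in for a loop-count register), and the "always taken" baseline predictor are assumptions made for the example.

```python
def run_counted_loop(trip_count):
    """Toy model of a counted loop closed by a loop-count-based branch.

    Assumed model: a counter `lc` starts at trip_count - 1. The closing
    branch is taken (and `lc` decremented) while `lc` is nonzero, and
    falls through once `lc` reaches zero.

    Returns (iterations, misses_with_count, misses_always_taken).
    """
    if trip_count < 1:
        raise ValueError("trip_count must be at least 1")
    lc = trip_count - 1
    iterations = 0
    misses_with_count = 0     # predictor that can see the loop count
    misses_always_taken = 0   # baseline that always guesses "taken"
    while True:
        iterations += 1                 # the loop body would run here
        guess_with_count = lc != 0      # prediction derived from the count
        guess_always_taken = True
        taken = lc > 0                  # actual outcome of the closing branch
        if taken:
            lc -= 1
        misses_with_count += guess_with_count != taken
        misses_always_taken += guess_always_taken != taken
        if not taken:
            return iterations, misses_with_count, misses_always_taken


iterations, with_count, always_taken = run_counted_loop(5)
print(iterations, with_count, always_taken)
```

In this model the count-aware predictor never misses, while the baseline misses once, on the final fall-through of the loop.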
2.9 Register Rotation
Modulo scheduling of a loop is analogous to hardware pipelining of a functional unit
since the next iteration of the loop starts before the previous iteration has finished. The
iteration is split into stages similar to the stages of an execution pipeline. Modulo
scheduling allows the compiler to execute loop iterations in parallel rather than
sequentially. The concurrent execution of multiple iterations traditionally requires
unrolling of the loop and software renaming of registers. The Itanium architecture
allows registers to be renamed so that every iteration has its own set of
registers, avoiding the need for unrolling. This kind of register renaming is called
register rotation. The result is that software pipelining can be applied to a much wider
variety of loops, both small and large, with significantly reduced overhead.
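A small simulation may help show how rotation gives each in-flight iteration its own registers without unrolling. It is a simplified sketch under stated assumptions, not the architecture's definition: the class `RotatingRegisters`, the field `rrb`, the three-stage loop in `pipelined_scale`, and the plain `if` guards that switch stages on and off during pipeline fill and drain are all hypothetical names and choices made for illustration.

```python
class RotatingRegisters:
    """Toy rotating register file: logical names map to physical slots
    through a rotating base (assumed model)."""

    def __init__(self, size):
        self.size = size
        self.phys = [None] * size
        self.rrb = 0  # rotating base, changed once per kernel iteration

    def _index(self, logical):
        return (logical + self.rrb) % self.size

    def write(self, logical, value):
        self.phys[self._index(logical)] = value

    def read(self, logical):
        return self.phys[self._index(logical)]

    def rotate(self):
        # After rotation, a value written to logical register n
        # is visible as logical register n + 1.
        self.rrb = (self.rrb - 1) % self.size


def pipelined_scale(data, factor):
    """Compute [x * factor for x in data] as a 3-stage software pipeline
    (load, multiply, store) with one kernel iteration per loop trip."""
    regs = RotatingRegisters(4)
    n = len(data)
    stages = 3
    out = []
    for k in range(n + stages - 1):   # includes pipeline fill and drain
        if 0 <= k - 2 < n:            # stage 3: store source iteration k - 2
            out.append(regs.read(2))
        if 0 <= k - 1 < n:            # stage 2: multiply source iteration k - 1
            regs.write(1, regs.read(1) * factor)
        if k < n:                     # stage 1: load source iteration k
            regs.write(0, data[k])
        regs.rotate()                 # each iteration's value moves up one name
    return out


print(pipelined_scale([1, 2, 3], 10))
```

The kernel always names the same three logical registers, yet up to three source iterations are in flight at once, because rotation shifts each value to the next logical name every trip.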
2.10 Floating-point Architecture
The Itanium architecture defines a floating-point architecture with full IEEE support for
the single, double, and double-extended (80-bit) data types. Some extensions, such as
a fused multiply and add operation, minimum and maximum functions, and a register
file format with a larger range than the double-extended memory format, are also
included. 128 floating-point registers are defined. Of these, 96 registers are rotating
(not stacked) and can be used to modulo schedule loops compactly. Multiple
floating-point status registers are provided for speculation.
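The practical difference a fused multiply and add makes is that the product is not rounded before the addition. The sketch below illustrates this in ordinary IEEE double precision using exact rational arithmetic. It does not model the Itanium register file format, and the helper names `fused_multiply_add` and `separate_multiply_add` are assumptions made for the example.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Return a*b + c computed exactly, then rounded once to a double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


def separate_multiply_add(a, b, c):
    """Return a*b + c with the product rounded before the addition."""
    return a * b + c


a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

# The exact product is 1 - 2**-60, which rounds to 1.0 in double
# precision, so the separately rounded result loses the small term.
print(separate_multiply_add(a, b, c))
print(fused_multiply_add(a, b, c))
```

With these inputs the separately rounded computation yields zero, while the single-rounding version retains the residual `-2**-60`.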