
1:186
Volume 1, Part 2: Software Pipelining and Loop Support
for the same source iteration. Each one written to p16 sequentially enables all the stages for a new source iteration. This behavior is used to enable or disable the execution of the stages of the pipelined loop during the prolog, kernel, and epilog phases as described in the next section.
5.4.2 Note on Initializing Rotating Predicates
In this chapter, the instruction mov pr.rot = immed is used to initialize rotating predicates. This instruction ignores the value of CFM.rrb.pr. Thus, the examples in this chapter are written assuming that CFM.rrb.pr is always zero prior to the initialization of predicate registers using mov pr.rot = immed.
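The effect of ignoring CFM.rrb.pr can be illustrated with a small behavioral model. This is a minimal Python sketch, not architectural pseudocode: the class RotatingPredicates, its method names, and the renaming rule (logical pN maps to slot (N - 16 + rrb.pr) mod 48) are assumptions made for illustration, and the immediate is simplified to a plain 64-bit mask.

```python
NUM_ROTATING = 48  # p16..p63 are the rotating predicates


class RotatingPredicates:
    """Toy model of p16-p63 with a rename base standing in for CFM.rrb.pr."""

    def __init__(self):
        self.physical = [0] * NUM_ROTATING
        self.rrb_pr = 0

    def _slot(self, n):
        # Assumed renaming rule: logical pN maps to slot (N - 16 + rrb.pr) mod 48.
        return (n - 16 + self.rrb_pr) % NUM_ROTATING

    def read(self, n):
        return self.physical[self._slot(n)]

    def write(self, n, value):
        self.physical[self._slot(n)] = value & 1

    def rotate(self):
        # Rotation decrements the rename base, so the value in pN shows up in pN+1.
        self.rrb_pr = (self.rrb_pr - 1) % NUM_ROTATING

    def mov_pr_rot(self, mask):
        # Bits 16..63 of the mask land in fixed slots; rrb.pr is not consulted.
        for n in range(16, 64):
            self.physical[n - 16] = (mask >> n) & 1


preds = RotatingPredicates()
preds.mov_pr_rot(1 << 16)
print(preds.read(16))  # 1: with rrb.pr == 0 the bit lands in p16

preds.rotate()  # rrb.pr is now non-zero
preds.mov_pr_rot(1 << 16)
print(preds.read(16), preds.read(17))  # 0 1: the bit shows up as p17 instead
```

In this model, the same initialization enables a different logical predicate once rrb.pr is non-zero, which is why the chapter's examples assume rrb.pr is zero at the point of initialization.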
5.4.3 Software-pipelined Loop Branches
The special software-pipelined loop branches allow the compiler to generate very
compact code for software-pipelined loops by supporting register rotation and by
controlling the filling and draining of the software pipeline during the prolog and epilog
phases. Generally speaking, each time a software-pipelined loop branch is executed,
the following actions take place:
1. A decision is made on whether or not to continue kernel loop execution.
2. p16 is set to a value to control execution of the stages of the software pipeline (p63 is written by the branch, and after rotation this value will be in p16).
3. The registers are rotated (rrb registers are decremented).
4. The Loop Count (LC) and/or the Epilog Count (EC) application registers are selectively decremented.
There are two types of software-pipelined loop branches: counted and while.
5.4.3.1 Counted Loop Branches
The corresponding figure shows a flowchart for modulo-scheduled counted loop branches.
During the prolog and kernel phase, a decision to continue kernel loop execution means that a new source iteration is started. Register rotation must occur so that the new source iteration does not overwrite registers that are in use by prior source iterations that are still in the pipeline. p16 is set to 1 to enable the stages of the new source iteration. LC is decremented to update the count of remaining source iterations. EC is not modified.
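The counted loop branch behavior can be traced with a short simulation. The following is a minimal Python sketch under stated assumptions, not the architectural definition: the names br_ctop and run_pipelined_loop are hypothetical, the setup (LC set to the iteration count minus one, EC set to the number of stages, p16 set to 1) is assumed, and the handling of EC equal to 1 or 0 follows a general reading of the top-of-loop counted branch rather than text in this section. The LC > 0 case is the prolog and kernel behavior just described; the LC = 0 cases model the epilog, in which EC counts down and p16 is set to 0.

```python
from collections import deque


def br_ctop(state):
    """Model one counted loop branch; return True if the loop continues."""
    preds = state["preds"]  # index 0 is p16, index 47 is p63
    if state["LC"] > 0:
        # Prolog/kernel: start a new source iteration.
        state["LC"] -= 1
        preds[-1] = 1      # branch writes p63 ...
        preds.rotate(1)    # ... and rotation moves that value into p16
        return True
    if state["EC"] > 1:
        # Epilog: keep draining, but enable no new source iteration.
        state["EC"] -= 1
        preds[-1] = 0
        preds.rotate(1)
        return True
    if state["EC"] == 1:
        # Assumed exit case: final decrement and rotation, then fall through.
        state["EC"] -= 1
        preds[-1] = 0
        preds.rotate(1)
        return False
    # Assumed EC == 0 case: no rotation, fall through.
    preds[-1] = 0
    return False


def run_pipelined_loop(n_iterations, n_stages):
    """Return the stage predicates (p16 upward) seen on each kernel pass."""
    state = {
        "LC": n_iterations - 1,
        "EC": n_stages,
        "preds": deque([1] + [0] * 47),  # p16 = 1 enables the first stage
    }
    trace = []
    while True:
        trace.append(list(state["preds"])[:n_stages])
        if not br_ctop(state):
            return trace


for stage_predicates in run_pipelined_loop(3, 2):
    print(stage_predicates)
# [1, 0]   prolog: only stage 1 of the first source iteration runs
# [1, 1]   kernel
# [1, 1]   kernel
# [0, 1]   epilog: stage 2 of the last source iteration drains
```

In this model, a loop of N source iterations and S stages makes N + S - 1 passes through the kernel code, and each stage executes exactly N times.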
During the epilog phase, the decision to continue loop execution means that the software pipeline has not yet been fully drained and execution of the source iterations in progress must continue. Register rotation must continue because the remaining source iterations are still writing results and the consumers of the results expect rotation to occur. p16 is now set to 0 because there are no more new source iterations and the instructions that correspond to non-existent source iterations must be disabled. EC contains the count of the remaining execution stages for the last source iteration and is decremented during the epilog. For most loops, when a software-pipelined loop branch is executed with EC equal to 1, it indicates that the pipeline has been drained