
1:188
Volume 1, Part 2: Software Pipelining and Loop Support
Note: The code below now uses rotating GRs (the directly preceding code did
not). Also, induction variables that are post-incremented must be allocated
to the static portion of the register file:
            mov     lc = 199                // LC = loop count - 1
            mov     ec = 4                  // EC = epilog stages + 1
            mov     pr.rot = 1<<16;;        // PR16 = 1, rest = 0
    L1:
    (p16)   ld4     r32 = [r5],4            // Cycle 0
    (p18)   add     r35 = r34,r9            // Cycle 0
    (p19)   st4     [r6] = r36,4            // Cycle 0
            br.ctop L1;;                    // Cycle 0
The memory ports are fully utilized. Table 5-1 shows a trace of the execution of this
loop.
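The initial values LC = 199 and EC = 4, and the length of the trace, can be cross-checked with a little arithmetic. The Python sketch below is illustrative only: the helper name `ctop_setup` and its parameters are hypothetical, and it assumes a pipeline of `stages` stages that starts one source iteration per kernel iteration, so that `stages - 1` epilog iterations are needed to drain it.

```python
def ctop_setup(trip_count, stages):
    """Return (LC, EC, kernel_iterations) for a counted, software-pipelined loop.

    Assumption: one source iteration starts per kernel iteration, so draining
    a pipeline of `stages` stages takes `stages - 1` epilog iterations.
    """
    if trip_count < 1 or stages < 1:
        raise ValueError("trip_count and stages must be positive")
    lc = trip_count - 1          # LC = loop count - 1
    epilog_stages = stages - 1   # iterations needed to drain the pipeline
    ec = epilog_stages + 1       # EC = epilog stages + 1
    kernel_iterations = lc + ec  # br.ctop executes once per kernel iteration
    return lc, ec, kernel_iterations


print(ctop_setup(200, 4))  # prints (199, 4, 203)
```

For the loop above (200 source iterations, four stages) this gives 203 kernel iterations, which matches cycles 0 through 202 of the trace.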
In cycle 3, the kernel phase is entered and the fourth iteration of the kernel loop
executes the ld4, add, and st4 from the fourth, second, and first source iterations
respectively. By cycle 200, all 200 loads have been executed, and the epilog phase is
entered. When the br.ctop is executed in cycle 202, EC is equal to 1. EC is
decremented, the registers are rotated one last time, and execution falls out of the
kernel loop.
Note: After this final rotation, EC and the stage predicates (p16-p19) are 0.
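The walk-through above can be replayed with a small simulation. The following Python sketch is a simplified model, not the architectural definition of br.ctop: the function name `trace_ctop` is hypothetical, only the stage predicates p16-p19 are modeled (the real rotating predicate region is larger), and the case where EC is already 0 on entry is deliberately not handled.

```python
def trace_ctop(lc, ec, stages=4):
    """Simulate br.ctop once per cycle and record the state before each branch.

    Returns (rows, final). Each row is (cycle, (p16, p17, ...), LC, EC);
    final is the state after execution falls out of the loop.
    """
    if ec < 1:
        raise ValueError("this sketch only models loops entered with EC >= 1")
    preds = [1] + [0] * (stages - 1)   # mov pr.rot = 1<<16: p16 = 1, rest = 0
    rows = []
    cycle = 0
    while True:
        rows.append((cycle, tuple(preds), lc, ec))
        if lc != 0:        # prolog/kernel: start another source iteration
            lc -= 1
            new_p16, taken = 1, True
        elif ec > 1:       # epilog: drain the pipeline, keep looping
            ec -= 1
            new_p16, taken = 0, True
        else:              # EC == 1: rotate one last time and fall through
            ec -= 1
            new_p16, taken = 0, False
        # Rotation: the old p16 becomes p17, and so on; the new value is p16.
        preds = [new_p16] + preds[:-1]
        cycle += 1
        if not taken:
            break
    return rows, (tuple(preds), lc, ec)


rows, final = trace_ctop(lc=199, ec=4)
for row in (rows[0], rows[3], rows[199], rows[200], rows[202]):
    print(row)
print(final)
```

With LC = 199 and EC = 4 the model records 203 rows; the row for cycle 202 has only p19 set and EC equal to 1, and the final state has EC and all four stage predicates equal to 0.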
It is desirable to allocate variables that are loop variant to the rotating portion of the
register file whenever possible to preserve space in the static portion for loop invariant
variables. Induction variables that are post incremented must be allocated to the static
portion of the register file.
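To illustrate why the loop-variant values can live in rotating registers while the post-incremented pointers cannot, the Python sketch below models the data flow of the loop. It is a toy model under stated assumptions: the names `RotatingRegs` and `run_pipelined` are hypothetical, the rotating region is assumed to be eight registers (r32-r39), rotation is modeled as a decrementing register base, and r5 and r6 are modeled as element indices rather than byte addresses (so they advance by 1 instead of 4).

```python
class RotatingRegs:
    """Toy model of a rotating register region (assumed here: r32..r39)."""

    def __init__(self, size=8, base=32):
        self.size, self.base, self.rrb = size, base, 0
        self.phys = [0] * size

    def _index(self, reg):
        return (reg - self.base + self.rrb) % self.size

    def read(self, reg):
        return self.phys[self._index(reg)]

    def write(self, reg, value):
        self.phys[self._index(reg)] = value

    def rotate(self):
        # After a rotation, the value written as r32 is visible as r33.
        self.rrb = (self.rrb - 1) % self.size


def run_pipelined(src, r9):
    """Run the four-stage kernel over src and return the stored results."""
    n = len(src)
    dst = [None] * n
    regs = RotatingRegs()
    preds = [1, 0, 0, 0]           # p16..p19
    lc, ec = n - 1, 4
    r5 = r6 = 0                    # static registers: same name every iteration
    while True:
        p16, _, p18, p19 = preds
        if p16:                    # (p16) ld4 r32 = [r5],4
            regs.write(32, src[r5])
            r5 += 1
        if p18:                    # (p18) add r35 = r34,r9
            regs.write(35, regs.read(34) + r9)
        if p19:                    # (p19) st4 [r6] = r36,4
            dst[r6] = regs.read(36)
            r6 += 1
        # br.ctop (simplified; assumes EC >= 1 on entry)
        if lc != 0:
            lc -= 1
            new_p16, taken = 1, True
        elif ec > 1:
            ec -= 1
            new_p16, taken = 0, True
        else:
            ec -= 1
            new_p16, taken = 0, False
        preds = [new_p16] + preds[:-1]
        regs.rotate()
        if not taken:
            return dst


src = list(range(200))
assert run_pipelined(src, 7) == [x + 7 for x in src]
```

In this model the loaded value is written as r32, read two rotations later as r34, and the sum written as r35 is read one rotation later as r36. The indices r5 and r6 must keep their values under the same name from one iteration to the next, which is why the sketch keeps them outside the rotating region.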
5.4.3.3 While Loop Branches
shows the flowchart for while loop branches.
Table 5-1. ctop Loop Trace

         Port/Instructions               State before br.ctop
Cycle    M      I      M      B          p16   p17   p18   p19   LC    EC
0        ld4                  br.ctop    1     0     0     0     199   4
1        ld4                  br.ctop    1     1     0     0     198   4
2        ld4    add           br.ctop    1     1     1     0     197   4
3        ld4    add    st4    br.ctop    1     1     1     1     196   4
…        …      …      …      …          …     …     …     …     …     …
100      ld4    add    st4    br.ctop    1     1     1     1     99    4
…        …      …      …      …          …     …     …     …     …     …
199      ld4    add    st4    br.ctop    1     1     1     1     0     4
200             add    st4    br.ctop    0     1     1     1     0     3
201             add    st4    br.ctop    0     0     1     1     0     2
202                    st4    br.ctop    0     0     0     1     0     1
...                                      0     0     0     0     0     0