1:190
Volume 1, Part 2: Software Pipelining and Loop Support
Induction Variable
Value that is incremented (or decremented) once per source iteration by
the same amount.
5.5 Optimization of Loops in the Intel® Itanium® Architecture
Register rotation, predication, and the software pipelined loop branches allow the
generation of compact, yet highly parallel code. Speculation can further increase loop
performance by removing dependency barriers that limit the throughput of software
pipelined loops. Register rotation removes the requirement that kernel loops be
unrolled to allow software renaming of the registers. However, in some cases
performance can be increased by unrolling the source loop prior to software pipelining,
or by generating explicit prolog and/or epilog blocks. The remainder of this chapter
discusses loop optimizations.
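To illustrate why register rotation removes the need to unroll the kernel for renaming, here is a toy Python model of a rotating register region. It is only a sketch under stated assumptions: the class name RotatingRegisters, the region size of four, and the values written are hypothetical, and the model ignores everything about the real register file except the rename base that is decremented on each rotation.

```python
class RotatingRegisters:
    """Toy model of a rotating register region (not the full register file)."""

    def __init__(self, size):
        self.phys = [0] * size   # physical registers backing the region
        self.rrb = 0             # register rename base

    def _index(self, logical):
        # A logical register number is mapped to a physical one via the base.
        return (logical + self.rrb) % len(self.phys)

    def write(self, logical, value):
        self.phys[self._index(logical)] = value

    def read(self, logical):
        return self.phys[self._index(logical)]

    def rotate(self):
        # Decrementing the rename base makes the value that was visible at
        # logical register n reappear at logical register n + 1.
        self.rrb = (self.rrb - 1) % len(self.phys)


regs = RotatingRegisters(size=4)
for value in (10, 20, 30):
    regs.write(0, value)   # every iteration writes the same logical register
    regs.rotate()          # models the rotation performed by the loop branch

# Three iterations wrote logical register 0, yet all three values are live.
live = [regs.read(1), regs.read(2), regs.read(3)]
print(live)
```

Each iteration uses the same register name, so a single copy of the loop body suffices; without rotation, the body would have to be replicated with different register names to keep the overlapped values apart.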
5.5.1 While Loops
The programming scheme for while loops depends upon the structure of the loop. This
section discusses do-while loops, in which the loop condition is computed at the bottom
of the loop. Optimizing compilers often transform while loops (where the condition is
computed at the top of the loop) into do-while loops by moving the condition
computation to the bottom of the loop and placing a copy of the condition computation
prior to the loop to reduce the number of branches in the loop. The remainder of this
section refers to such loops simply as while loops. Below is a simple while loop:
L1:     ld4     r4 = [r5],4;;           // Cycle 0
        st4     [r6] = r4,4             // Cycle 2
        cmp.ne  p1,p0 = r4,r0           // Cycle 2
(p1)    br      L1;;                    // Cycle 2
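As a reading aid, the behavior of the loop at L1 can be modeled in Python. This is only a sketch: the function name copy_until_zero and the use of Python lists in place of the memory addressed by r5 and r6 are assumptions made for illustration. Because the condition is tested at the bottom, the terminating zero word is stored before the loop exits.

```python
def copy_until_zero(src):
    """Model of the do-while loop at L1.

    Copies words from src up to and including the first zero word.
    Assumes src contains at least one zero.
    """
    dst = []
    i = 0                      # plays the role of r5, the load pointer
    while True:
        word = src[i]          # ld4 r4 = [r5],4  (load, then post-increment)
        i += 1
        dst.append(word)       # st4 [r6] = r4,4  (store, then post-increment)
        if word == 0:          # cmp.ne p1,p0 = r4,r0 ; (p1) br L1
            break              # p1 is false when the word is zero: fall through
    return dst


print(copy_until_zero([7, 3, 0, 9]))
```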
A conceptual view of a pipelined iteration of this loop with II equal to one is shown
below:
stage 1:        ld4     r4 = [r5],4
stage 2:        ---                     // empty stage
stage 3:        st4     [r6] = r4,4
                cmp.ne.unc p1,p0 = r4,r0
        (p1)    br      L1
The following is a conceptual view of four overlapped source iterations assuming the
load and store are independent memory references. The store, compare, and branch
instructions in stage three are represented by the pseudo-instruction scb:
1       2       3       4       Cycle
----------------------------------------------------
ld4                             X
        ld4.s                   X+1
scb             ld4.s           X+2
        scb             ld4.s   X+3
                scb             X+4
                        scb     X+5
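The overlap shown in the table can be reproduced with a short Python sketch. The function name pipeline_schedule and the list STAGES are hypothetical names introduced here. The model assumes only what the conceptual view states: three stages, an II of one, and a new source iteration started every II cycles. It labels every load as ld4 and does not model the ld4.s forms shown for the later iterations.

```python
def pipeline_schedule(iterations, stage_ops, ii=1):
    """Return {cycle: [(iteration, op), ...]} for a software-pipelined loop.

    stage_ops[s] names the operation issued in stage s, or None for an
    empty stage. Iteration k (0-based) starts at cycle k * ii, and each
    stage lasts ii cycles.
    """
    schedule = {}
    for k in range(iterations):
        for s, op in enumerate(stage_ops):
            if op is None:
                continue                      # empty stage issues nothing
            cycle = (k + s) * ii              # offset from cycle X
            schedule.setdefault(cycle, []).append((k + 1, op))
    return schedule


# Stage 1 is the load, stage 2 is empty, stage 3 is store/compare/branch.
STAGES = ["ld4", None, "scb"]
table = pipeline_schedule(4, STAGES)

for cycle in sorted(table):
    ops = ", ".join(f"iteration {k}: {op}" for k, op in table[cycle])
    print(f"X+{cycle}: {ops}")
```

In cycles X+2 and X+3 the scb of one iteration issues alongside the load of the iteration two positions later, which is the steady-state overlap that an II of one provides.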