Volume 1, Part 2: Predication, Control Flow, and Instruction Stream
4 Predication, Control Flow, and Instruction Stream
4.1 Overview
This chapter is divided into three sections that describe optimizations related to
predication, control flow, and branch hints:
• The predication section describes if-conversion, predicate usage, and code
scheduling to reduce the effects of branching.
• The control flow optimization section describes optimizations that collapse and
converge control flow by using parallel compares, multiway branches, and multiple
register writers under predicate.
• The branch and prefetch hints section describes how hints are used to improve
branch and prefetch performance.
4.2 Predication
Predication allows the compiler to convert control dependencies into data
dependencies. This section first describes several sources of branch-related
performance cost, then summarizes the predication mechanism, and then describes a
series of optimizations and techniques based on predication.
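The conversion of a control dependency into a data dependency can be illustrated
with a minimal C sketch. This is an analogy only, not Itanium code: the architecture
qualifies instructions with predicate registers, whereas the sketch models the
predicate as a bit mask. The function names select_branch, select_predicated, and
check_same are hypothetical, and a C compiler is free to emit either form for either
function.

```c
#include <assert.h>
#include <stdint.h>

/* Branching form: which value is returned is control dependent on the compare. */
uint32_t select_branch(uint32_t a, uint32_t b, uint32_t x, uint32_t y)
{
    if (a < b)
        return x;
    return y;
}

/* If-converted form (illustrative model): the compare produces a predicate p,
 * and both candidate values are combined under p, so the result is data
 * dependent on p and no branch is needed in the source. */
uint32_t select_predicated(uint32_t a, uint32_t b, uint32_t x, uint32_t y)
{
    uint32_t p = (a < b);            /* predicate: 1 if true, 0 if false */
    uint32_t mask = (uint32_t)0 - p; /* all ones when p == 1, zero when p == 0 */
    return (x & mask) | (y & ~mask);
}

/* Asserts that both forms agree for one set of inputs. */
void check_same(uint32_t a, uint32_t b, uint32_t x, uint32_t y)
{
    assert(select_branch(a, b, x, y) == select_predicated(a, b, x, y));
}
```

Calling check_same with any inputs should pass, since the two functions compute the
same result; only the kind of dependency that carries the decision differs.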
4.2.1 Performance Costs of Branches
Branches can decrease application performance by consuming hardware resources for
prediction at execution time and by restricting instruction scheduling freedom during
compilation.
4.2.1.1 Prediction Resources
Branch prediction resources include branch target buffers, branch prediction tables, and
the logic used to control these resources. The number of branches that can accurately
be predicted is limited by the size of the buffers on the processor, and such buffers tend
to be small relative to the total number of branches executed in a program.
This limitation means that branch-intensive code may lose a large portion of its
execution time to contention for prediction resources. Furthermore, although the
size of the predictors is a primary factor in determining branch prediction
performance, different branches are best served by different types of predictors.
For example, some branches are best predicted statically, while others are more
suitably predicted dynamically. Of those predicted dynamically, some, such as loop
branches, are of greater importance than others.
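Contention for a small prediction table can be sketched with a toy model in C. The
model is an assumption made for illustration and does not describe any Itanium
predictor: it uses a table of textbook 2-bit saturating counters indexed by the low
bits of a branch address, and all names (predictor_t, predictor_step,
count_mispredictions) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 64

typedef struct {
    uint8_t counter[MAX_ENTRIES]; /* 2-bit saturating counters, values 0..3 */
    size_t entries;               /* table size in use, a power of two */
} predictor_t;

void predictor_init(predictor_t *p, size_t entries)
{
    assert(entries > 0 && entries <= MAX_ENTRIES &&
           (entries & (entries - 1)) == 0);
    p->entries = entries;
    for (size_t i = 0; i < MAX_ENTRIES; i++)
        p->counter[i] = 1;        /* start at "weakly not taken" */
}

/* Predicts the branch at addr, then trains on the real outcome.
 * Returns 1 on a misprediction and 0 on a correct prediction. */
int predictor_step(predictor_t *p, uint32_t addr, int taken)
{
    uint8_t *c = &p->counter[addr & (p->entries - 1)];
    int predicted_taken = (*c >= 2);
    if (taken) {
        if (*c < 3) (*c)++;
    } else {
        if (*c > 0) (*c)--;
    }
    return predicted_taken != taken;
}

/* Alternates two branches: the one at addr_a is always taken and the one
 * at addr_b is never taken. Returns the total number of mispredictions. */
unsigned count_mispredictions(size_t entries, uint32_t addr_a,
                              uint32_t addr_b, unsigned iterations)
{
    predictor_t p;
    unsigned misses = 0;
    predictor_init(&p, entries);
    for (unsigned i = 0; i < iterations; i++) {
        misses += (unsigned)predictor_step(&p, addr_a, 1);
        misses += (unsigned)predictor_step(&p, addr_b, 0);
    }
    return misses;
}
```

Under these assumptions, count_mispredictions(1, 0, 1, 100) returns 200, because
both branches share one counter and keep undoing each other's training, while
count_mispredictions(2, 0, 1, 100) returns 1, because each branch gets its own
entry. The gap shows, in miniature, how a table that is too small for the working
set of branches turns into mispredictions.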