Volume 1, Part 2: Memory Reference
3.5 Optimization of Memory References
Speculation can increase parallelism and help to hide latency by enabling more code
motion than is possible on traditional architectures. It can also widen the
applicability of traditional loop optimizations such as invariant code motion and common
subexpression elimination. The Itanium architecture also offers post-increment loads
and stores that improve instruction throughput without increasing code size.
Memory reference optimization should take several factors into account including:
• Difference between the execution costs of speculative and non-speculative code.
• Code size.
• Interference probabilities and properties of the ALAT (for data speculation).
The remainder of this chapter discusses these factors and optimizations relating to
memory accesses.
3.5.1 Speculation Considerations
The use of data speculation requires more attention than the use of control speculation.
In part this is due to the fact that one control speculative load cannot inadvertently
cause another control speculative load to fail. Such an effect is possible with data
speculative loads since the ALAT has limited capacity and the replacement policy of
ALAT entries is implementation dependent. For example, if an advanced load is issued
and there are no unused ALAT entries, the hardware may choose to invalidate an
existing entry to make room for a new one.
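The capacity effect described above can be illustrated with a toy model. The sketch below is not a description of any real ALAT: the class name ToyALAT, the two-entry capacity, the oldest-first replacement policy, and the exact-address matching are all assumptions made for illustration, since the real size, replacement policy, and matching algorithm are implementation dependent.

```python
from collections import OrderedDict


class ToyALAT:
    """Toy ALAT model: maps a target register name to a load address.

    Assumptions (not taken from the manual): oldest-first replacement
    and exact address matching.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def advanced_load(self, reg, addr):
        # Models ld.a: allocate an entry, evicting the oldest one
        # if no unused entry is available.
        self.entries.pop(reg, None)
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)
        self.entries[reg] = addr

    def store(self, addr):
        # A store removes every entry whose address matches.
        for reg in [r for r, a in self.entries.items() if a == addr]:
            del self.entries[reg]

    def check(self, reg):
        # Models the test made by ld.c or chk.a: the speculation
        # succeeded only if the entry is still present.
        return reg in self.entries


alat = ToyALAT(capacity=2)
alat.advanced_load("r10", 0x1000)
alat.advanced_load("r11", 0x2000)
alat.advanced_load("r12", 0x3000)  # table full: r10's entry is evicted

print(alat.check("r10"))  # False, although no store touched 0x1000
print(alat.check("r12"))  # True
```

In this model the check on r10 fails purely because a third advanced load needed an entry, which is the kind of interference that cannot occur between control speculative loads.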
Moreover, exceptions associated with control speculative calculations are uncommon in
correct code since they are related to events such as page faults and TLB misses.
However, excessive control speculation can be expensive as associated instructions fill
issue slots.
Although the static critical path of a program may be reduced by the use of data
speculation, the following factors determine its actual benefit and dynamic cost:
• The probability that an intervening store will interfere with an advanced load.
• The cost of recovering from a failed advanced load.
• The specific microarchitectural implementation of the ALAT: its size, associativity,
and matching algorithm.
Determining interference probabilities can be difficult, but dynamic memory profiling
can help to predict how often ambiguous loads and stores will conflict.
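As a rough sketch of what such a profile might compute, the function below estimates an interference rate from a recorded trace. The trace format, the function name interference_rate, and the identifiers "L1" and "S1" are hypothetical, and the model assumes exact address matching, whereas real conflicts also depend on access size and on the ALAT implementation.

```python
def interference_rate(trace, load_id):
    """Fraction of executions of one advanced load that are hit by a
    store to the same address before the matching check.

    trace: list of (op, ident, addr) tuples, op in {"ld.a", "st", "chk"}.
    """
    executions = 0
    conflicts = 0
    pending_addr = None  # address of the outstanding advanced load
    conflicted = False
    for op, ident, addr in trace:
        if op == "ld.a" and ident == load_id:
            pending_addr, conflicted = addr, False
        elif op == "st" and pending_addr is not None and addr == pending_addr:
            conflicted = True
        elif op == "chk" and ident == load_id and pending_addr is not None:
            executions += 1
            conflicts += int(conflicted)
            pending_addr = None
    return conflicts / executions if executions else 0.0


# Hypothetical trace: four executions of load "L1", one of which
# is followed by a store to the same address before its check.
trace = [
    ("ld.a", "L1", 0x100), ("st", "S1", 0x200), ("chk", "L1", None),
    ("ld.a", "L1", 0x108), ("st", "S1", 0x108), ("chk", "L1", None),
    ("ld.a", "L1", 0x110), ("st", "S1", 0x300), ("chk", "L1", None),
    ("ld.a", "L1", 0x118), ("st", "S1", 0x400), ("chk", "L1", None),
]
print(interference_rate(trace, "L1"))  # 0.25
```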
When using advanced loads, there should be case-by-case consideration as to whether
advancing only a load and using a ld.c might be preferable to advancing both a load
and its uses, which would require the use of the potentially more expensive chk.a.
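One way to frame this case-by-case decision is as an expected-value comparison. The sketch below is a simplified model, not a method given in the manual: the cycle counts in LD_C and CHK_A are hypothetical placeholders, and real figures would have to come from the target implementation and from profiling.

```python
def expected_gain(cycles_saved, p_fail, failure_penalty):
    """Expected cycles saved per execution of a speculated load."""
    return cycles_saved - p_fail * failure_penalty


# Hypothetical figures, in cycles.
# ld.c: only the load is advanced, so less latency is hidden,
#       but a failure costs only a reload.
# chk.a: the load and its uses are advanced, hiding more latency,
#        but a failure branches to recovery code.
LD_C = {"cycles_saved": 2, "failure_penalty": 3}
CHK_A = {"cycles_saved": 5, "failure_penalty": 40}


def better_strategy(p_fail):
    """Return the strategy with the higher expected gain."""
    gain_ldc = expected_gain(p_fail=p_fail, **LD_C)
    gain_chka = expected_gain(p_fail=p_fail, **CHK_A)
    return "chk.a" if gain_chka > gain_ldc else "ld.c"


print(better_strategy(0.01))  # chk.a
print(better_strategy(0.25))  # ld.c
```

With these placeholder numbers, advancing the uses pays off only while the failure probability stays below roughly 8%; above that, the cheaper ld.c form wins.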
Even when recovery code is not executed, its presence extends the lifetimes of
registers used in data and control speculation, thus increasing register pressure and
possibly the cost of register movement by the Register Stack Engine (RSE).
Considerations for recovery code placement are discussed elsewhere in this manual.