1:64
Volume 1, Part 1: Application Programming Model
4.4.5.1 Data Speculation Concepts
An ambiguous memory dependency is said to exist between a store (or any operation
that may update memory state) and a load when it cannot be statically determined
whether the load and store might access overlapping regions of memory. For
convenience, a store that cannot be statically disambiguated relative to a particular
load is said to be ambiguous relative to that load. In such cases, the compiler cannot
change the order in which the load and store instructions were originally specified in the
program. To overcome this scheduling limitation, a special kind of load instruction
called an advanced load can be scheduled to execute earlier than one or more stores
that are ambiguous relative to that load.
As with control speculation, the compiler can also speculate operations that are
dependent upon the advanced load and later insert a check instruction that will
determine whether the speculation was successful or not. For data speculation, the
check can be placed anywhere the original non-data speculative load could have been
scheduled.
Thus, a data-speculative sequence of instructions consists of an advanced load, zero or
more instructions dependent on the value of that load, and a check instruction. This
means that any sequence of stores followed by a load can be transformed into an
advanced load followed by a sequence of stores followed by a check. The decision to
perform such a transformation is highly dependent upon the likelihood and cost of
recovering from an unsuccessful data speculation.
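The effect of an ambiguous memory dependency can be illustrated with a small sketch. The Python model below is not Itanium code. It treats memory as a dictionary, and all function names are hypothetical. It compares the original store-then-load order with a load hoisted above the store, with and without a check. The check here simply compares the two addresses, which stands in for the real check instruction and assumes accesses either match exactly or do not overlap at all.

```python
# Hypothetical model: memory is a dict mapping an address to a value.

def original_order(mem, store_addr, store_val, load_addr):
    """Store first, then load: the order the program specifies."""
    mem[store_addr] = store_val
    return mem[load_addr]


def hoisted_without_check(mem, store_addr, store_val, load_addr):
    """Load moved above the store with no check: wrong if the addresses overlap."""
    loaded = mem[load_addr]
    mem[store_addr] = store_val
    return loaded


def hoisted_with_check(mem, store_addr, store_val, load_addr):
    """Load moved above the store, followed by a check that recovers on overlap."""
    loaded = mem[load_addr]        # plays the role of the advanced load
    mem[store_addr] = store_val    # the ambiguous store
    if store_addr == load_addr:    # stands in for the check (assumption: exact match only)
        loaded = mem[load_addr]    # recovery: reload the correct value
    return loaded
```

When the two addresses differ, all three functions return the same value. When they are equal, only the unchecked hoisted version returns the stale value, which is why the compiler must pair the early load with a check.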
4.4.5.2 Data Speculation and Instructions
Advanced loads are available in integer (ld.a), floating-point (ldf.a), and
floating-point pair (ldfp.a) forms. When an advanced load is executed, it allocates an
entry in a structure called the Advanced Load Address Table (ALAT). Later, when a
corresponding check instruction is executed, the presence of an entry indicates that the
data speculation succeeded; otherwise, the speculation failed and one of two kinds of
compiler-generated recovery is performed:
1. The check load instruction (ld.c, ldf.c, or ldfp.c) is used for recovery when
the only instruction scheduled before a store that is ambiguous relative to the
advanced load is the advanced load itself. The check load searches the ALAT for a
matching entry. If found, the speculation was successful; if a matching entry was
not found, the speculation was unsuccessful and the check load reloads the
correct value from memory.
Figure 4-2 shows this transformation.
2. The advanced load check (chk.a) is used when an advanced load and several
instructions that depend on the loaded value are scheduled before a store that is
ambiguous relative to the advanced load. The advanced load check works like the
Figure 4-2. Data Speculation Recovery Using ld.c

Before Data Speculation:

    // Other instructions
    st8        [r4] = r12
    ld8        r6 = [r8];;
    add        r5 = r6, r7;;
    st8        [r18] = r5

After Data Speculation:

    ld8.a      r6 = [r8];;    // Advanced load
    // Other instructions
    st8        [r4] = r12
    ld8.c.clr  r6 = [r8]      // Check load
    add        r5 = r6, r7;;
    st8        [r18] = r5
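The ld.c recovery sequence in Figure 4-2 can be traced with a toy model of the ALAT. The sketch below is illustrative only and is not a description of any processor implementation. The Machine class, its method names, and the reloads counter are hypothetical. The rule that a store removes an ALAT entry recorded for the same address is not stated in this section and is an assumption of the sketch, as is the simplification that entries are keyed by target register and matched on exact address equality.

```python
class Machine:
    """Toy model of memory plus an ALAT keyed by target register (illustrative only)."""

    def __init__(self, memory):
        self.mem = dict(memory)   # address -> value
        self.regs = {}            # register name -> value
        self.alat = {}            # register name -> address recorded by the advanced load
        self.reloads = 0          # number of failed checks that reloaded from memory

    def ld_a(self, reg, addr):
        """Advanced load: load the value and allocate an ALAT entry."""
        self.regs[reg] = self.mem[addr]
        self.alat[reg] = addr

    def st(self, addr, value):
        """Store: write memory and drop ALAT entries for the same address (assumption)."""
        self.mem[addr] = value
        self.alat = {r: a for r, a in self.alat.items() if a != addr}

    def ld_c_clr(self, reg, addr):
        """Check load: keep the speculated value if the entry survives, else reload."""
        if self.alat.get(reg) == addr:
            del self.alat[reg]               # .clr: remove the entry after a successful check
        else:
            self.regs[reg] = self.mem[addr]  # speculation failed: reload the correct value
            self.reloads += 1


def run_figure_4_2(r4, r8, r12=12, r7=100):
    """Trace the 'After Data Speculation' sequence; returns (r5, reloads)."""
    m = Machine({r4: 0, r8: 5})
    m.ld_a("r6", r8)              # ld8.a      r6 = [r8]
    m.st(r4, r12)                 # st8        [r4] = r12
    m.ld_c_clr("r6", r8)          # ld8.c.clr  r6 = [r8]
    r5 = m.regs["r6"] + r7        # add        r5 = r6, r7
    return r5, m.reloads          # the final st8 [r18] = r5 is omitted


print(run_figure_4_2(r4=0x100, r8=0x200))  # distinct addresses: check succeeds
print(run_figure_4_2(r4=0x200, r8=0x200))  # same address: check fails and reloads
```

With distinct addresses the check load finds its entry and the speculated value 5 is used. With equal addresses the store removes the entry, the check load reloads the stored value 12, and the dependent add sees the correct operand.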