Volume 2, Part 2: MP Coherence and Synchronization
2.2.1.7 Data Dependency Establishes Local Ordering
In the Itanium architecture, a dependency (e.g., a later operation reading the value
written by an earlier operation) can imply a local ordering relationship between the two
operations. This section focuses on dependencies through registers only; dependencies
and MP ordering are discussed in a separate section.
The execution shown in Table 2-6 illustrates how data dependency and memory
ordering interact in a simple “pointer chase.”
In this example, Processor #0 could be executing code that updates a shared object
with M1 and then publishes a pointer to the object with M2. Processor #1 then loads
the pointer and dereferences it to read the contents of the shared object. The outcome
r1 = x and r2 = 0 implies that Processor #1 observes the new value of the object
pointer, y, but the old value of the data field, x.
The ordering semantics require M1 → M2 but place no requirements on the relative
ordering of M3 and M4.
Thus, the memory semantics alone would allow the outcome r1 = x and r2 = 0 in the
absence of other constraints. Using an acquire load for M3 avoids this outcome, because
doing so forces M3 → M4. However, this use of acquire is non-intuitive given the RAW
dependency through register r1 between M3 and M4: M3 produces a value that M4
requires in order to execute, so how could the two go out of order? Further, using an
acquire in this case prevents any memory operation following M3 from moving above
M3, even if that operation is completely independent of M3.
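
The effect of the two constraints can be checked with a small enumeration. The sketch below is a toy model, not the architecture's formal memory model: it assumes the four operations become visible in a single total order, and the names `outcomes`, `release_only`, and `with_m3_before_m4` are hypothetical. It lists the (r1, r2) results reachable when only "M1 before M2" is required, and again when "M3 before M4" is also required.

```python
from itertools import permutations

X = "x"  # symbolic address of the shared object (illustrative)


def outcomes(required):
    """Return every (r1, r2) result over all orders that respect `required`.

    `required` is a set of (a, b) pairs meaning "a must come before b".
    """
    results = set()
    for order in permutations(["M1", "M2", "M3", "M4"]):
        pos = {op: i for i, op in enumerate(order)}
        if any(pos[a] > pos[b] for a, b in required):
            continue  # this order violates a required pair
        mem = {"x": 0, "y": None}  # initially: data is 0, pointer is null
        r1 = None
        x_seen_by_m4 = None
        for op in order:
            if op == "M1":
                mem["x"] = 1              # st     [x] = 1
            elif op == "M2":
                mem["y"] = X              # st.rel [y] = x
            elif op == "M3":
                r1 = mem["y"]             # ld     r1 = [y]
            else:
                x_seen_by_m4 = mem["x"]   # value of x at M4's position
        # M4 reads x only when the pointer it received is x.
        r2 = x_seen_by_m4 if r1 == X else None
        results.add((r1, r2))
    return results


release_only = outcomes({("M1", "M2")})
with_m3_before_m4 = outcomes({("M1", "M2"), ("M3", "M4")})

print((X, 0) in release_only)       # True: stale read is reachable
print((X, 0) in with_m3_before_m4)  # False: stale read is excluded
```

In this model the stale result comes from the order M4, M1, M2, M3, which is exactly the reordering of M3 and M4 that the memory semantics alone do not rule out.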
To avoid this potential confusion and performance issue, the Itanium architecture treats
data dependency and memory ordering in the same fashion on the local processor. That
is, if A » B and A produces a value that B consumes, then A → B on the local processor.
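
As an illustration of this rule, the sketch below derives the ordered pairs implied by register dataflow for a short instruction sequence. It is only a model under stated assumptions: `Insn` and `dependency_order` are hypothetical names, each instruction is described just by the registers it writes and reads, and only the most recent writer of a register counts as the producer.

```python
from typing import NamedTuple


class Insn(NamedTuple):
    name: str
    writes: frozenset  # registers this instruction produces
    reads: frozenset   # registers this instruction consumes


def dependency_order(program):
    """Return pairs (A, B): A precedes B in program order and B consumes
    a register value that A produced."""
    last_writer = {}
    edges = set()
    for insn in program:
        for reg in insn.reads:
            if reg in last_writer:
                edges.add((last_writer[reg], insn.name))
        for reg in insn.writes:
            last_writer[reg] = insn.name
    return edges


# Processor #1 in the pointer chase.
processor1 = [
    Insn("M3", frozenset({"r1"}), frozenset()),        # ld r1 = [y]
    Insn("M4", frozenset({"r2"}), frozenset({"r1"})),  # ld r2 = [r1]
]

print(dependency_order(processor1))  # {('M3', 'M4')}
```

The single pair (M3, M4) is the local ordering that an acquire on M3 would otherwise have to supply, without constraining operations that do not consume r1.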
This relationship is also transitive, as the execution in Table 2-7 illustrates.
The Processor #0 code is the same as in Table 2-6. Processor #1 now performs the
following operation: if the pointer value y is equal to x, load a value from x.
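
Transitivity can be sketched as a closure over the direct dependency pairs. In the predicate example, the load M3 feeds the compare C1 through r1, and C1 feeds the predicated load M4 through p1. The helper name `transitive_closure` and the literal edge set below are assumptions made for illustration only.

```python
def transitive_closure(edges):
    """Add (a, d) whenever (a, b) and (b, d) are present, until stable."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Direct register dependencies on Processor #1:
#   M3 -> C1 through r1, and C1 -> M4 through predicate p1.
direct = {("M3", "C1"), ("C1", "M4")}
ordered = transitive_closure(direct)

print(("M3", "M4") in ordered)  # True
```

The derived pair (M3, M4) is what rules out the stale read, even though M4 never reads r1 directly.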
Table 2-6. Memory Ordering and Data Dependency

Processor #0                     Processor #1
st      [x] = 1       // M1      ld      r1 = [y] ;;   // M3
st.rel  [y] = x       // M2      ld      r2 = [r1]     // M4

Outcome: r1 = x and r2 = 0 is not allowed

Table 2-7. Memory Ordering and Data Dependency Through a Predicate Register

Processor #0                     Processor #1
st      [x] = 1       // M1      ld      r1 = [y]              // M3
st.rel  [y] = x       // M2      cmp.eq  p1, p2 = r1, x ;;     // C1
                                 (p1) ld r2 = [x]              // M4

Outcome: r1 = x and r2 = 0 is not allowed