2:520
Volume 2, Part 2: MP Coherence and Synchronization
, the discussion in this section focuses on the outcome r1 = 1, r3 =
1, r2 = 0, and r4 = 0 because it is allowed if and only if store buffers can satisfy local
loads. The line of reasoning to show that the outcome r1 = 1, r3 = 1, r2 = 0, and r4 =
0 is not allowed in
is similar to the reasoning used to show that this outcome
is allowed in the
.
By the definition of the Itanium memory ordering semantics,
By allowing local and global visibility of operations M1 and M5 (similar to the discussion
in
), this assumption, together with the above constraints, implies
that,
Consider these constraints on the Processor #0 operations m1, M1, M2, M3, and M4.
Making m1 visible before M2, M3, and M4 correctly honors the data dependency
through memory on Processor #0. However, unless it constrains the global visibility of
M1 to occur before M2, M3, and M4, Processor #0 violates the Itanium ordering
semantics. Specifically, the memory fence M2 must always be made visible after the
store M1. Allowing M1 separate local and global visibility in this case violates this constraint
and thus is not allowed. Essentially, by allowing M1 to become locally visible early, M3
would see M1 before the fence semantics for M2 were met (namely, that M1 be visible
before M2 and thus M3). Without local and global visibility of M1 and M5, the ordering
constraints are as this example originally postulated.
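The effect of the fence can be explored with a small exhaustive search. The sketch below is a deliberately simplified toy model, not the Itanium ordering model: each processor executes in program order, stores enter a FIFO store buffer, a load may be satisfied from the local buffer when `forwarding` is true, and a memory fence may execute only once the local buffer is empty. The two instruction sequences are an assumption reconstructed from the discussion (Processor #0: store x, fence, load x into r1, load y into r2; Processor #1: store y, fence, load y into r3, load x into r4), and all function and variable names are hypothetical.

```python
def reachable_outcomes(programs, forwarding=True):
    """Return every final (r1, r2, r3, r4) reachable in the toy model."""
    seen, finals = set(), set()
    # State: program counters, per-processor store buffers, memory, registers.
    stack = [((0,) * len(programs), ((),) * len(programs), (), ())]
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        pcs, bufs, mem_items, reg_items = state
        mem, regs = dict(mem_items), dict(reg_items)
        progressed = False
        for p, prog in enumerate(programs):
            # Choice 1: drain the oldest buffered store (it becomes globally visible).
            if bufs[p]:
                (var, val), rest = bufs[p][0], bufs[p][1:]
                new_mem = dict(mem)
                new_mem[var] = val
                stack.append((pcs, bufs[:p] + (rest,) + bufs[p + 1:],
                              tuple(sorted(new_mem.items())), reg_items))
                progressed = True
            # Choice 2: execute the next instruction in program order.
            if pcs[p] < len(prog):
                op = prog[pcs[p]]
                new_pcs = pcs[:p] + (pcs[p] + 1,) + pcs[p + 1:]
                if op[0] == "st":
                    new_buf = bufs[p] + ((op[1], op[2]),)
                    stack.append((new_pcs, bufs[:p] + (new_buf,) + bufs[p + 1:],
                                  mem_items, reg_items))
                    progressed = True
                elif op[0] == "mf":
                    # The fence waits until all prior local stores are global.
                    if not bufs[p]:
                        stack.append((new_pcs, bufs, mem_items, reg_items))
                        progressed = True
                else:  # "ld"
                    _, reg, var = op
                    pending = [v for name, v in bufs[p] if name == var]
                    # Without forwarding, a load waits for a matching store to drain.
                    if forwarding or not pending:
                        val = pending[-1] if pending else mem.get(var, 0)
                        new_regs = dict(regs)
                        new_regs[reg] = val
                        stack.append((new_pcs, bufs, mem_items,
                                      tuple(sorted(new_regs.items()))))
                        progressed = True
        if not progressed:  # every instruction executed and every buffer drained
            finals.add(tuple(v for _, v in reg_items))
    return finals


FENCED = (
    (("st", "x", 1), ("mf",), ("ld", "r1", "x"), ("ld", "r2", "y")),
    (("st", "y", 1), ("mf",), ("ld", "r3", "y"), ("ld", "r4", "x")),
)
# Same sequences with the fences removed, for comparison.
UNFENCED = tuple(tuple(op for op in prog if op[0] != "mf") for prog in FENCED)

OUTCOME = (1, 0, 1, 0)  # r1 = 1, r2 = 0, r3 = 1, r4 = 0

print(OUTCOME in reachable_outcomes(FENCED))                      # False
print(OUTCOME in reachable_outcomes(UNFENCED))                    # True
print(OUTCOME in reachable_outcomes(UNFENCED, forwarding=False))  # False
```

In this model the outcome appears only when the fences are absent and local loads may be satisfied from the store buffer, which is consistent with the argument above: once the fence forces M1 to be globally visible before M3, no early local visibility remains to exploit.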
The code in
and these constraints together imply that
This contradicts the r1 = 1, r3 = 1, r2 = 0, and r4 = 0 outcome. The visibility of the
memory fence, M2, implies that all prior operations including the store to x, M1, are
globally visible. Thus, the load from x on Processor #1, M8, must observe the new
value of x and
but the outcome requires
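The contradiction can also be checked mechanically. The following sketch assumes the constraints described in this discussion: the program-order chains M1 through M4 and M5 through M8 enforced by the fences, plus the requirement that M4 be visible before M5 (needed for r2 = 0). It computes the transitive closure of these "visible before" pairs and then tests whether adding "M8 before M1" (needed for r4 = 0) creates a cycle. The helper names are hypothetical.

```python
from itertools import product


def transitive_closure(edges):
    """Return the transitive closure of a set of (before, after) pairs."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(tuple(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def chain(*ops):
    """Pairs stating that each operation is visible before the next one."""
    return {(ops[i], ops[i + 1]) for i in range(len(ops) - 1)}


constraints = chain("M1", "M2", "M3", "M4") | chain("M5", "M6", "M7", "M8")
constraints.add(("M4", "M5"))  # r2 = 0: the load of y precedes the store to y

# M1 before M8 follows from M1 before M4, M4 before M5, and M5 before M8.
m1_before_m8 = ("M1", "M8") in transitive_closure(constraints)

# r4 = 0 would additionally require M8 before M1.
with_outcome = transitive_closure(constraints | {("M8", "M1")})
has_cycle = any(a == b for a, b in with_outcome)

print(m1_before_m8)  # True
print(has_cycle)     # True
```

A cycle in the closure means no single visibility order can satisfy all of the constraints at once, which is the sense in which the outcome is contradicted.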
2.2.1.10 Semaphores Do Not Locally Bypass
and
discuss, loads and acquire loads may be
satisfied with values placed in local store buffers (or other logically-equivalent
structures) by stores or release stores before the stored data becomes visible to other
agents in the coherence domain. The Itanium architecture explicitly prohibits such local
bypass either to or from semaphore operations. That is, semaphore operations cannot
be satisfied in this way nor can the data they store be used to satisfy loads or acquire
loads in this way.
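The distinction can be illustrated with a toy model. In the sketch below, an ordinary load may be satisfied from the local store buffer, while an exchange never reads from or writes to that buffer: it operates on the shared memory directly. Draining the entire buffer before the exchange is a conservative simplification assumed here so that an older buffered store to the same address is not skipped; it is not a statement about how any implementation behaves. The class and method names are hypothetical.

```python
class ToyProcessor:
    """One processor with a private store buffer over a shared memory dict."""

    def __init__(self, memory):
        self.memory = memory      # globally visible state, shared by all processors
        self.store_buffer = []    # pending (address, value) pairs, oldest first

    def store(self, addr, value):
        # The store is locally visible at once, globally visible only after drain().
        self.store_buffer.append((addr, value))

    def drain(self):
        # Make every buffered store globally visible, oldest first.
        for addr, value in self.store_buffer:
            self.memory[addr] = value
        self.store_buffer.clear()

    def load(self, addr):
        # Local bypass: the newest matching buffered store satisfies the load.
        for buffered_addr, value in reversed(self.store_buffer):
            if buffered_addr == addr:
                return value
        return self.memory.get(addr, 0)

    def xchg(self, addr, new_value):
        # No bypass in either direction: read and write shared memory only.
        self.drain()  # assumed simplification, see the lead-in
        old = self.memory.get(addr, 0)
        self.memory[addr] = new_value
        return old


memory = {}
p0, p1 = ToyProcessor(memory), ToyProcessor(memory)

p0.store("x", 1)
local = p0.load("x")        # 1: satisfied from Processor #0's store buffer
remote = p1.load("x")       # 0: the store is not yet globally visible
old = p0.xchg("x", 2)       # 1: read from memory after the pending store drained
seen_by_p1 = p1.load("x")   # 2: the exchanged value is globally visible at once
```

Because the exchange writes its value straight to shared memory in this model, there is never a window in which Processor #0 can read the exchanged value while Processor #1 cannot.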
The execution in
illustrates a variation on the execution in
where
the acquire loads have been replaced with exchange semaphore operations (which also
have acquire semantics).
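Ordering constraints for the memory fence discussion that precedes Section 2.2.1.10 (read X → Y as "X is visible before Y"):

By the definition of the Itanium memory ordering semantics:
    M1 → M2 → M3 → M4
    M5 → M6 → M7 → M8

With local and global visibility of M1 and M5 allowed:
    m1 → M1
    m1 → M2 → M3 → M4
    m5 → M5
    m5 → M6 → M7 → M8

The code and these constraints together imply that:
    r2 = 0 implies M4 → M5
    M1 → M8 because M1 → M4 and M4 → M5 and M5 → M8

The load M8 must observe the new value of x (r4 = 1, M1 → M8), but the outcome requires M8 → M1.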