Volume 2, Part 2: MP Coherence and Synchronization
2.2.3 Understanding Other Ordering Models: Sequential Consistency and IA-32
To provide a point of reference, it is helpful to understand other memory ordering
models. These ordering models affect not only the programmer’s view of the system,
but also the overall system performance and design. Processors with relaxed memory
ordering models may achieve higher performance than those with strict ordering
models.
The most intuitive memory ordering model is “sequential consistency” (SC), which
Lamport formally defines in [L79]. In sequential consistency, all processors see the
memory references from a given processor in program order, and, in addition, all
processors see the same system-wide interleaving of memory references from each
processor.
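As an illustration of this definition, the following minimal sketch enumerates every system-wide interleaving of a two-processor "store buffering" test and runs each one against a single shared memory. The programs P0 and P1, the tuple encoding of operations, and the helper names are assumptions made for this example and do not come from the architecture specification.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves each one's program order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute one interleaving against a single shared memory, as SC requires."""
    memory = {"x": 0, "y": 0}
    registers = {}
    for op, target, source in schedule:
        if op == "store":
            memory[target] = source          # ("store", address, value)
        else:
            registers[target] = memory[source]  # ("load", register, address)
    return registers["r1"], registers["r2"]


# Hypothetical test programs: each processor stores to one location,
# then loads the location written by the other processor.
P0 = [("store", "x", 1), ("load", "r1", "y")]
P1 = [("store", "y", 1), ("load", "r2", "x")]

sc_outcomes = {run(schedule) for schedule in interleavings(P0, P1)}
print(sorted(sc_outcomes))  # [(0, 1), (1, 0), (1, 1)]
```

Under these assumptions the outcome r1 = 0 and r2 = 0 never appears: in any single interleaving that respects program order, at least one of the stores precedes the other processor's load.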
The SC model precludes many common optimizations made in modern microprocessors
to enhance performance. For example, in an SC system, a load may not pass a prior
store until that store becomes globally visible (because all memory operations must
become visible in program order). This requirement prevents the SC system from using
a store buffer to hide the latency of store traffic by allowing loads that hit the cache to
be serviced under a prior store that misses the cache.
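The effect of a store buffer can be sketched with a small state-space search. The model below is a deliberate simplification and an assumption of this example, not a description of any particular processor: each processor has a private FIFO store buffer, a store enters the buffer, a load reads memory unless the processor's own buffer holds a newer value for that address, and buffered stores drain to memory at arbitrary times. The function and variable names are hypothetical.

```python
def replace(items, index, value):
    """Return a copy of the tuple items with position index set to value."""
    return items[:index] + (value,) + items[index + 1:]


def explore(programs):
    """Return every final register state reachable with per-processor store buffers."""
    count = len(programs)
    start = ((0,) * count, ((),) * count, (("x", 0), ("y", 0)), ())
    stack, seen, outcomes = [start], set(), set()
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        pcs, buffers, memory, regs = state
        finished = True
        for cpu, program in enumerate(programs):
            if buffers[cpu]:
                # Drain the oldest buffered store; it becomes globally visible now.
                finished = False
                address, value = buffers[cpu][0]
                new_memory = dict(memory)
                new_memory[address] = value
                stack.append((pcs, replace(buffers, cpu, buffers[cpu][1:]),
                              tuple(sorted(new_memory.items())), regs))
            if pcs[cpu] < len(program):
                # Execute the next instruction of this processor in program order.
                finished = False
                op, a, b = program[pcs[cpu]]
                new_pcs = replace(pcs, cpu, pcs[cpu] + 1)
                if op == "store":
                    new_buffer = buffers[cpu] + ((a, b),)
                    stack.append((new_pcs, replace(buffers, cpu, new_buffer), memory, regs))
                else:
                    # The load completes even if older stores are still buffered.
                    pending = [v for address, v in buffers[cpu] if address == b]
                    value = pending[-1] if pending else dict(memory)[b]
                    stack.append((new_pcs, buffers, memory,
                                  tuple(sorted(regs + ((a, value),)))))
        if finished:
            outcomes.add(regs)
    return outcomes


# Hypothetical test programs: store to one location, then load the other.
P0 = [("store", "x", 1), ("load", "r1", "y")]
P1 = [("store", "y", 1), ("load", "r2", "x")]

buffered_outcomes = {(dict(r)["r1"], dict(r)["r2"]) for r in explore([P0, P1])}
print(sorted(buffered_outcomes))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

In this model the outcome r1 = 0 and r2 = 0 becomes reachable, because both loads can be serviced while both stores are still waiting in their buffers. An SC system must exclude that outcome, which is why it cannot let loads pass buffered stores in this way.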
To address such performance issues, many memory ordering models have been
developed that relax the constraints of sequential consistency. Adve categorizes these
memory models by noting how they relax the ordering requirements between reads
and writes and whether they allow writes to be read early [AG95]. The Itanium architecture
allows for relaxed ordering between reads and writes and also allows writes to be read
early under certain circumstances.
Aside from disallowing any relaxation of memory references, sequential consistency has
two other subtle differences from the Itanium memory ordering model. First, it requires
a total order of operations whereas the Itanium memory ordering model only requires a
total order for release stores and semaphores. Second, remote processors must always
honor data dependencies, since under sequential consistency the local processor does
not have the option of re-ordering such accesses, as can occur under the Itanium
memory ordering model.
The IA-32 memory ordering relaxes write-to-read ordering and allows a processor to
read its own writes before they are globally visible. Further, IA-32 allows each
processor in the coherence domain to interleave the reference streams from other
processors in the coherence domain in a different order. The per-processor orders must
meet some additional constraints to ensure they are consistent with each other
(enumerating and explaining these constraints is beyond the scope of this document).
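The first of these properties, a processor reading its own write before that write is globally visible, can be illustrated with a toy trace. The BufferedCPU class, its method names, and the fixed schedule below are assumptions made for this sketch. It models only store buffering with forwarding, not the full IA-32 ordering rules or the additional per-processor constraints mentioned above.

```python
class BufferedCPU:
    """Toy processor whose stores wait in a private FIFO before reaching shared memory."""

    def __init__(self, memory):
        self.memory = memory  # dictionary shared by all processors
        self.buffer = []      # pending (address, value) stores, oldest first

    def store(self, address, value):
        # The store is visible to this processor only until it drains.
        self.buffer.append((address, value))

    def load(self, address):
        # Forward the newest matching buffered store, if any; otherwise read memory.
        for buffered_address, value in reversed(self.buffer):
            if buffered_address == address:
                return value
        return self.memory[address]

    def drain(self):
        # Make all pending stores globally visible, oldest first.
        for address, value in self.buffer:
            self.memory[address] = value
        self.buffer.clear()


memory = {"x": 0, "y": 0}
p0, p1 = BufferedCPU(memory), BufferedCPU(memory)

p0.store("x", 1)
p1.store("y", 1)
p0_view = (p0.load("x"), p0.load("y"))  # P0 sees its own x = 1, but y is still 0
p1_view = (p1.load("x"), p1.load("y"))  # P1 sees its own y = 1, but x is still 0

p0.drain()
p1.drain()
final_view = (p1.load("x"), p0.load("y"))  # both stores are now globally visible

print(p0_view, p1_view, final_view)  # (1, 0) (0, 1) (1, 1)
```

Before the buffers drain, each processor observes its own store but not the other's, so the two processors disagree about which store appears to have happened first. No single interleaving of the kind sequential consistency requires can produce both views.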
For more information on the IA-32 ordering model see