IA-32 Intel® Architecture Optimization
If the region is not mapped as WC, the streaming store might update the
line in place in the cache, and a subsequent sfence would not result in
the data being written to system memory. A read of this memory location
by a non-coherent I/O device would then return incorrect or out-of-date
results. Explicitly mapping the region as WC in this case ensures that
any data read from this region will not be placed in the processor’s
caches. For a processor which solely implements approach (b), page 11,
above, a streaming store can be used in this non-coherent domain without
requiring the memory region to also be mapped as WB, since any cached
data will be flushed to memory by the streaming store.
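The store-then-fence ordering described above can be sketched in C with
compiler intrinsics. This is a minimal illustration, not the manual's own
code: the helper name stream_fill_and_publish and the ready flag are
hypothetical, and mapping the region as WC is done by the operating system
or a driver and is not shown here. The sketch assumes an x86 compiler that
provides <xmmintrin.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* _mm_set1_ps, _mm_stream_ps, _mm_sfence */

/* Hypothetical helper: fill a buffer with streaming stores, then fence
   before setting a "ready" flag that a consumer might poll. */
static void stream_fill_and_publish(float *dst, size_t n, float value,
                                    volatile int *ready)
{
    assert(((uintptr_t)dst & 15u) == 0);  /* 16-byte aligned destination */
    assert(n % 4 == 0);                   /* whole 16-byte chunks only */

    __m128 v = _mm_set1_ps(value);
    for (size_t i = 0; i < n; i += 4)
        _mm_stream_ps(dst + i, v);        /* weakly-ordered streaming store */

    _mm_sfence();  /* order the streaming stores ahead of the flag write */
    *ready = 1;
}
```

The fence only orders the stores. Whether the data actually reaches system
memory for a non-coherent device still depends on the memory type of the
region, as the paragraph above explains.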
Streaming Store Instruction Descriptions
The movntq/movntdq (non-temporal store of packed integer in an MMX
technology or Streaming SIMD Extensions register) instructions store
data from a register to memory. These instructions are implicitly
weakly-ordered, do not write-allocate, and so minimize cache pollution.
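As a rough illustration of a packed-integer streaming store, the sketch
below fills an array using the _mm_stream_si128 intrinsic, which compilers
typically translate to movntdq. The helper name fill_u32_nt is
hypothetical, and the code assumes an SSE2-capable x86 target and a
16-byte-aligned destination, which the 128-bit form requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2: _mm_set1_epi32, _mm_stream_si128 */

/* Hypothetical helper: fill n 32-bit integers with non-temporal stores. */
static void fill_u32_nt(uint32_t *dst, size_t n, uint32_t value)
{
    assert(((uintptr_t)dst & 15u) == 0);  /* 16-byte aligned destination */

    __m128i v = _mm_set1_epi32((int)value);
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_stream_si128((__m128i *)(dst + i), v);  /* no write-allocate */
    for (; i < n; i++)
        dst[i] = value;                   /* ordinary stores for the tail */

    _mm_sfence();  /* streaming stores are weakly ordered; fence at the end */
}
```

Streaming stores of this kind tend to pay off only for data that will not
be read again soon, since the point is to keep it out of the caches.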
The movntps (non-temporal store of packed single-precision
floating-point) instruction is similar to movntq. It stores data from a
Streaming SIMD Extensions register to memory in 16-byte granularity.
Unlike movntq, the memory address must be aligned to a 16-byte boundary
or a general protection exception will occur. The instruction is
implicitly weakly-ordered, does not write-allocate, and thus minimizes
cache pollution.
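Because a misaligned movntps address raises a general protection
exception, code that cannot guarantee alignment may want to check it
first. The sketch below is one possible way to do that with the
_mm_stream_ps intrinsic. The helper name copy_floats_nt and its fallback
policy are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* _mm_loadu_ps, _mm_stream_ps, _mm_sfence */

/* Hypothetical helper: copy n floats, using streaming stores only when
   dst is 16-byte aligned. Returns 1 if streaming stores were used,
   0 if the plain fallback copy handled everything. */
static int copy_floats_nt(float *dst, const float *src, size_t n)
{
    size_t i = 0;
    int aligned = ((uintptr_t)dst % 16u) == 0;

    if (aligned) {
        for (; i + 4 <= n; i += 4)
            _mm_stream_ps(dst + i, _mm_loadu_ps(src + i));
        _mm_sfence();  /* order the weakly-ordered stores */
    }
    for (; i < n; i++)  /* tail elements, or the whole copy if unaligned */
        dst[i] = src[i];

    return aligned;
}
```

The source pointer is read with an unaligned load so that only the
destination alignment matters for the streaming path.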
CAUTION. Failure to map the region as WC may allow the line to be
speculatively read into the processor caches, that is, via the wrong
path of a mispredicted branch.