Optimizing Cache Usage
• Reduce disturbance of frequently used cached (temporal) data,
  since streaming stores write around the processor caches.
Streaming stores allow cross-aliasing of memory types for a given
memory region. For instance, a region may be mapped as write-back
(WB) via the page attribute tables (PAT) or memory type range
registers (MTRRs) and yet be written using a streaming store.
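As a minimal sketch of this cross-aliasing case, the C fragment below writes an ordinary buffer with the SSE2 streaming-store intrinsic `_mm_stream_si128`. It assumes the operating system maps normal program memory as WB, which is typical but not guaranteed. The helper names `fill_streaming` and `streaming_fill_demo` are hypothetical and do not come from the manual. On targets without SSE2 the sketch falls back to plain stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  /* _mm_stream_si128, _mm_set1_epi32, _mm_sfence */
#define HAVE_SSE2 1
#endif

/* Hypothetical helper: fill `count` 32-bit words at `dst` with `value`
   using non-temporal (streaming) stores.
   Preconditions: `dst` is 16-byte aligned, `count` is a multiple of 4. */
static void fill_streaming(uint32_t *dst, size_t count, uint32_t value)
{
    assert(((uintptr_t)dst & 15u) == 0);
    assert(count % 4 == 0);
#ifdef HAVE_SSE2
    __m128i v = _mm_set1_epi32((int)value);
    for (size_t i = 0; i < count; i += 4)
        _mm_stream_si128((__m128i *)(dst + i), v);  /* non-temporal hint */
    /* Streaming stores are weakly ordered; fence before other agents
       or later code rely on the data being globally visible. */
    _mm_sfence();
#else
    /* Fallback for non-SSE2 targets: ordinary (temporal) stores. */
    for (size_t i = 0; i < count; ++i)
        dst[i] = value;
#endif
}

/* Hypothetical demo: touch a buffer with normal stores first (so its
   lines are likely cached), overwrite it with streaming stores, then
   read it back. Returns 1 if every word holds the new value. */
static int streaming_fill_demo(void)
{
    _Alignas(16) uint32_t buf[64];
    for (size_t i = 0; i < 64; ++i)
        buf[i] = 0;                       /* ordinary cached writes */
    fill_streaming(buf, 64, 0x12345678u); /* non-temporal overwrite */
    for (size_t i = 0; i < 64; ++i)
        if (buf[i] != 0x12345678u)
            return 0;
    return 1;
}
```

The read-back in `streaming_fill_demo` only checks that the program observes the new values. It does not reveal whether the processor updated the cached lines in place or evicted them, since that choice is implementation-specific.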
Memory Type and Non-temporal Stores
The memory type can take precedence over the non-temporal hint,
leading to the following considerations:
• If the programmer specifies a non-temporal store to
  strongly-ordered uncacheable memory, for example, the
  Uncacheable (UC) or Write-Protect (WP) memory types, then the
  store behaves like an uncacheable store; the non-temporal hint is
  ignored and the memory type for the region is retained.
• If the programmer specifies the weakly-ordered uncacheable
  memory type of Write-Combining (WC), then the non-temporal
  store and the region have the same semantics, and there is no
  conflict.
• If the programmer specifies a non-temporal store to cacheable
  memory, for example, Write-Back (WB) or Write-Through (WT)
  memory types, two cases may result:
1. If the data is present in the cache hierarchy, the instruction will
ensure consistency. A particular processor may choose different
ways to implement this. The following approaches are probable: (a)
updating data in-place in the cache hierarchy while preserving the
memory type semantics assigned to that region or (b) evicting the
data from the caches and writing the new non-temporal data to
memory (with WC semantics).
Note that the approaches (separate or combined) can be
different for future processors. The Pentium 4, Intel Core Solo
and Intel Core Duo processors implement the latter policy (of
Source: IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006.