IA-32 Intel® Architecture Optimization
and cross-modifying code (when more than one processor in a
multi-processor system is writing to a code page) should be avoided
when high performance is desired.
Software should avoid writing to a code page in the same 1 KB subpage
that is being executed, or fetching code in the same 2 KB subpage
that is currently being written. In addition, sharing a page containing
directly or speculatively executed code with another processor as a data
page can trigger an SMC condition that causes the entire pipeline of the
machine and the trace cache to be cleared. This is due to the
self-modifying code condition.
Dynamic code need not cause the SMC condition if the code written
fills up a data page before that page is accessed as code.
Dynamically-modified code (for example, from target fix-ups) is likely
to suffer from the SMC condition and should be avoided where possible.
Avoid the condition by introducing indirect branches and using data
tables on data (not code) pages via register-indirect calls.
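The table-based alternative to patching code can be sketched in C as follows. This is a minimal illustration, not code from the manual: the handlers, `dispatch_table`, `set_handler`, and `dispatch` are hypothetical names. The idea is that retargeting a call becomes a store to a table on a data page instead of a write to instruction bytes. Whether the compiler emits a register-indirect call for the table lookup is an assumption about typical code generation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handlers; the names are illustrative only. */
static int handle_add(int a, int b) { return a + b; }
static int handle_sub(int a, int b) { return a - b; }

typedef int (*handler_fn)(int, int);

enum { OP_ADD, OP_SUB, OP_COUNT };

/* A writable, non-const table: the toolchain is expected to place it
 * in a data section, away from the pages that hold executed code. */
static handler_fn dispatch_table[OP_COUNT] = { handle_add, handle_sub };

/* Retarget an operation by storing a pointer in the data table.
 * No instruction bytes are modified. */
static void set_handler(size_t op, handler_fn fn)
{
    assert(op < (size_t)OP_COUNT);
    dispatch_table[op] = fn;
}

/* Call through the table. The target is loaded from data at run time,
 * which compilers typically lower to an indirect call. */
static int dispatch(size_t op, int a, int b)
{
    assert(op < (size_t)OP_COUNT);
    return dispatch_table[op](a, b);
}
```

For example, calling `set_handler(OP_ADD, handle_sub)` changes what `dispatch(OP_ADD, ...)` runs while touching only the table.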
Write Combining
Write combining (WC) improves performance in two ways:
• On a write miss to the first-level cache, it allows multiple stores to
the same cache line to occur before that cache line is read for
ownership (RFO) from further out in the cache/memory hierarchy.
Then the rest of the line is read, and the bytes that have not been
written are filled in from the unmodified bytes of the returned line.
• Write combining allows multiple writes to be assembled and written
further out in the cache hierarchy as a unit. This saves port and bus
traffic. Saving traffic is particularly important for avoiding partial
writes to uncached memory.
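One way to arrange stores so they can combine is to finish all the writes to one cache line before touching the next. The sketch below stages a line of results in a local buffer and then emits it as one contiguous run of stores. It assumes a 64-byte line and a line-aligned destination; `LINE_BYTES` and `scale_lines` are hypothetical names, and whether staging actually helps depends on the processor and memory type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed cache-line size; not stated in the passage above. */
#define LINE_BYTES 64u

/* Multiply n_lines * LINE_BYTES bytes of src by k (modulo 256) into dst.
 * Each output line is computed into a local buffer first, then written
 * out in one contiguous block, so stores to a given destination line
 * are not interleaved with stores to other lines.
 * Assumption: dst is aligned to LINE_BYTES. */
static void scale_lines(uint8_t *dst, const uint8_t *src,
                        size_t n_lines, uint8_t k)
{
    uint8_t staged[LINE_BYTES];

    assert(dst != NULL && src != NULL);
    for (size_t line = 0; line < n_lines; ++line) {
        const uint8_t *in = src + line * LINE_BYTES;

        /* Compute one full line of results in the staging buffer. */
        for (size_t i = 0; i < LINE_BYTES; ++i)
            staged[i] = (uint8_t)(in[i] * k);

        /* Emit the whole line as back-to-back stores. */
        memcpy(dst + line * LINE_BYTES, staged, LINE_BYTES);
    }
}
```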
There are six write-combining buffers (on Pentium 4 and Intel Xeon
processors with CPUID signature of family encoding 15, model
encoding 3, there are 8 write-combining buffers). Two of these buffers
may be written out to higher cache levels and freed up for use on other
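The buffer counts quoted above can be turned into a small lookup on the CPUID signature. This is a sketch under stated assumptions: `wc_buffer_count` is a hypothetical helper, the signature is taken to be the EAX value returned by CPUID leaf 1, and only the base family (bits 11:8) and base model (bits 7:4) fields are examined. Processors outside family 15 are not covered by this passage, so the helper returns 0 for them.

```c
#include <assert.h>
#include <stdint.h>

/* Base family field of the CPUID leaf 1 signature (EAX bits 11:8). */
static unsigned cpuid_family(uint32_t sig) { return (sig >> 8) & 0xFu; }

/* Base model field of the CPUID leaf 1 signature (EAX bits 7:4). */
static unsigned cpuid_model(uint32_t sig) { return (sig >> 4) & 0xFu; }

/* Hypothetical helper: number of write-combining buffers according to
 * the passage above. Returns 8 for family 15, model 3; 6 for other
 * family 15 models; 0 (unknown) for anything else. Extended family and
 * model fields are ignored in this sketch. */
static unsigned wc_buffer_count(uint32_t sig)
{
    if (cpuid_family(sig) != 15u)
        return 0u;
    return (cpuid_model(sig) == 3u) ? 8u : 6u;
}
```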
IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966 013US, April 2006