IA-32 Intel® Architecture Optimization
Store-forwarding Restriction on Data Availability
The value to be stored must be available before the load operation can
be completed. If this restriction is violated, execution of the load is
delayed until the data is available. This delay ties up execution
resources unnecessarily and can lead to sizable but non-deterministic
delays. However, the overall impact of this problem is much smaller than
that of violating the size and alignment requirements.
The Pentium 4 and Intel Xeon processors predict when loads are both
dependent on and get their data forwarded from preceding stores. These
predictions can significantly improve performance. However, if a load is
scheduled too soon after the store it depends on or if the generation of
the data to be stored is delayed, there can be a significant penalty.
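The situation described above can be sketched in C as follows. This is a minimal illustration, not code from the manual: the function names and the slot parameter are hypothetical, and volatile is used only so that the compiler really emits the store and the load. In the first function the load follows the store immediately and the stored data comes from a divide, so the load has to wait for the quotient. In the second, the consumer uses the register copy and does not depend on the store at all.

```c
#include <stdint.h>

/* Hypothetical example: the quotient is written to *slot and read back
   at once. The load depends on the store, and the store data is produced
   by a long-latency divide. */
uint32_t scale_through_memory(volatile uint32_t *slot, uint32_t x, uint32_t d)
{
    *slot = x / d;      /* store: data arrives only when the divide finishes */
    return *slot + 1;   /* load: must wait for the store data */
}

/* Same result, but the quotient is kept in a local that the compiler can
   hold in a register. The store still happens, but nothing reloads it. */
uint32_t scale_in_register(volatile uint32_t *slot, uint32_t x, uint32_t d)
{
    uint32_t q = x / d;
    *slot = q;          /* store for later consumers */
    return q + 1;       /* uses the register copy, no dependent load */
}
```

Whether the first form is actually slower depends on the surrounding code and on the compiler, so it should be measured rather than assumed.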
There are several cases in which data is passed through memory and the
store may need to be separated from the load:
• spills, save and restore registers in a stack frame
• parameter passing
• global and volatile variables
• type conversion between integer and floating point
• when compilers do not analyze code that is inlined, forcing
  variables that are involved in the interface with inlined code to be in
  memory, creating more memory variables and preventing the
  elimination of redundant loads
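As an illustration of the global and volatile variable case in the list above, the following sketch compares a loop that updates a global on every iteration with one that accumulates in a local and stores once. The names g_total, sum_via_global and sum_via_local are assumptions made for this example; volatile stands in for any situation where the compiler is forced to keep the variable in memory.

```c
#include <stddef.h>

/* Hypothetical global. volatile forces every access to go to memory. */
volatile long g_total;

/* Each iteration loads g_total, adds to it, and stores it back, so every
   load depends on the store issued by the previous iteration. */
void sum_via_global(const int *v, size_t n)
{
    g_total = 0;
    for (size_t i = 0; i < n; i++)
        g_total = g_total + v[i];
}

/* The running sum lives in a local that can stay in a register.
   The global is written once, after the loop. */
void sum_via_local(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    g_total = total;
}
```

The second form is only valid when no other code needs to observe the intermediate values of the global, which is the usual reason a variable is declared volatile in the first place.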
Assembly/Compiler Coding Rule 22. (H impact, MH generality) Where it
is possible to do so without incurring other penalties, prioritize the allocation
of variables to registers, as in register allocation and for parameter passing to
minimize the likelihood and impact of store-forwarding problems. Try not to
store-forward data generated from a long latency instruction, e.g. mul, div.
Avoid store-forwarding data for variables with the shortest store-load distance.
Avoid store-forwarding data for variables with many and/or long dependence
chains, and especially avoid including a store forward on a loop-carried
dependence chain.
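The last point of the rule, a store forward on a loop-carried dependence chain, can be sketched with a running sum. This is an illustrative example under stated assumptions, not code from the manual: the function names are hypothetical, and an optimizing compiler may already perform this transformation on its own. In the first version each iteration reads a[i-1], which the previous iteration has just stored. In the second, the carried value is held in a local and the array is only written.

```c
#include <stddef.h>

/* Loop-carried dependence through memory: a[i-1] was stored by the
   previous iteration and is loaded again here. */
void prefix_sum_memory(int *a, const int *b, size_t n)
{
    if (n == 0)
        return;
    a[0] = b[0];
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + b[i];
}

/* Loop-carried dependence through a local: prev can stay in a register,
   so the stores to a[] are off the dependence chain. */
void prefix_sum_register(int *a, const int *b, size_t n)
{
    if (n == 0)
        return;
    int prev = b[0];
    a[0] = prev;
    for (size_t i = 1; i < n; i++) {
        prev += b[i];
        a[i] = prev;
    }
}
```

Both functions produce the same output array. The difference lies only in whether the value carried between iterations travels through memory or through a register.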
IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006