
Examples
Assembly Code with an Unpredictable Branch ............................. 2-17
Code Optimization to Eliminate Branches ..................................... 2-17
Eliminating Branch with CMOV Instruction.................................... 2-18
pause Instruction ............................................................... 2-19
Pentium 4 Processor Static Branch Prediction Algorithm.............. 2-20
Static Taken Prediction Example ................................................... 2-21
Static Not-Taken Prediction Example ............................................ 2-21
Indirect Branch With Two Favored Targets .................................... 2-25
A Peeling Technique to Reduce Indirect Branch Misprediction ..... 2-26
Loop Unrolling ............................................................................... 2-28
Code That Causes Cache Line Split ............................................. 2-31
Several Situations of Small Loads After Large Store .................... 2-35
A Non-forwarding Situation in Compiler Generated Code............. 2-36
A Non-forwarding Example of Large Load After Small Store ........ 2-36
Large and Small Load Stalls ......................................................... 2-37
An Example of Loop-carried Dependence Chain .......................... 2-39
Rearranging a Data Structure ....................................................... 2-39
Decomposing an Array .................................................................. 2-40
Dynamic Stack Alignment ............................................................. 2-43
Non-temporal Stores and 64-byte Bus Write Transactions............ 2-54
Non-temporal Stores and Partial Bus Write Transactions ............. 2-54
Algorithm to Avoid Changing the Rounding Mode......................... 2-66
Dependencies Caused by Referencing Partial Registers.............. 2-77
Recombining LOAD/OP Code into REG,MEM Form..................... 2-91
Spill Scheduling Example Code .................................................... 2-92
Identification of MMX Technology with cpuid................................... 3-3
Identification of SSE by the OS ....................................................... 3-4
Identification of SSE with cpuid ....................................................... 3-4
IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006