IA-32 Intel® Architecture Optimization
For planning considerations when using the new SIMD integer
instructions, refer to “Checking for Streaming SIMD Extensions 2
Support” in Chapter 3.
General Rules on SIMD Integer Code
The overall rules and suggestions are as follows:
• Do not intermix 64-bit SIMD integer instructions with x87
floating-point instructions. See the “Using SIMD Integer with x87
Floating-point” section in this chapter. Note that all of the SIMD
integer instructions can be intermixed with one another without
penalty.
• When writing SSE2 code that works with both integer and
floating-point data, use the subset of SIMD convert instructions or
load/store instructions to ensure that the input operands in XMM
registers contain data of the type the instruction expects.
Code sequences containing cross-typed usage will produce the same
result across different implementations, but will incur a significant
performance penalty. Using SSE or SSE2 instructions to operate on
type-mismatched SIMD data in an XMM register is strongly
discouraged.
• Use the optimization rules and guidelines described in Chapter 2
and Chapter 3 that apply to the Pentium 4, Intel Xeon, and
Pentium M processors.
• Take advantage of the hardware prefetcher where possible. Use the
prefetch instruction only when data access patterns are irregular and
the prefetch distance can be pre-determined. For details, refer to
Chapter 6, “Optimizing Cache Usage”.
• Emulate conditional moves by using masked compares and logical
operations instead of conditional branches.
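
Two of the rules above can be illustrated with a minimal sketch in C
using SSE2 intrinsics. This is not code from the manual: the function
names select_greater_i16 and scale_i32_to_f32 are hypothetical, and the
sketch assumes a compiler that provides <emmintrin.h> and array lengths
that are a multiple of the vector width. The first function emulates a
conditional move with a masked compare followed by AND / ANDNOT / OR.
The second uses a convert instruction (CVTDQ2PS) so that the
floating-point multiply operates on float-typed data rather than on
integer bits left in the XMM register.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE ones) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: r[i] = (a[i] > b[i]) ? a[i] : b[i], with no branch
   on the data. n must be a multiple of 8, since one XMM register holds
   eight signed 16-bit lanes. */
void select_greater_i16(const int16_t *a, const int16_t *b,
                        int16_t *r, size_t n)
{
    assert(n % 8 == 0);
    for (size_t i = 0; i < n; i += 8) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        /* Masked compare (PCMPGTW): lane is 0xFFFF where a > b, else 0. */
        __m128i mask = _mm_cmpgt_epi16(va, vb);
        /* Logical select: (mask AND a) OR ((NOT mask) AND b). */
        __m128i vr = _mm_or_si128(_mm_and_si128(mask, va),
                                  _mm_andnot_si128(mask, vb));
        _mm_storeu_si128((__m128i *)(r + i), vr);
    }
}

/* Hypothetical helper: out[i] = (float)in[i] * scale.
   n must be a multiple of 4 (four 32-bit lanes per XMM register). */
void scale_i32_to_f32(const int32_t *in, float scale,
                      float *out, size_t n)
{
    assert(n % 4 == 0);
    __m128 vscale = _mm_set1_ps(scale);
    for (size_t i = 0; i < n; i += 4) {
        __m128i vi = _mm_loadu_si128((const __m128i *)(in + i));
        /* Convert instruction (CVTDQ2PS): the register now holds
           float-typed data, matching what MULPS expects. */
        __m128 vf = _mm_cvtepi32_ps(vi);
        _mm_storeu_ps(out + i, _mm_mul_ps(vf, vscale));
    }
}
```

A caller would pass arrays whose lengths are multiples of 8 and 4
respectively, for example select_greater_i16(a, b, r, 8). Unaligned
loads and stores are used here only to keep the sketch free of
alignment assumptions. Aligned accesses are generally preferable when
the data layout allows them.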
IA-32 Intel® Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006