
Chapter 4: Optimizing for SIMD Integer Applications
Non-Interleaved Unpack
The unpack instructions perform an interleaved merge of the data elements of the destination and source operands into the destination register. The following example merges the two operands into the destination registers without interleaving. For example, take two adjacent elements of a packed-word data type in source1 and place this value in the low 32 bits of the result. Then take two adjacent elements of a packed-word data type in source2 and place this value in the high 32 bits of the result. One of the destination registers will have the combination illustrated in Figure 4-3.
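The merge described above can be modeled in portable C on plain 64-bit integers. This is only a sketch under stated assumptions, not the manual's own listing, which is not part of this excerpt. It assumes that one result takes the low word pair of each source and the other result takes the high word pair, which is how the PUNPCKLDQ and PUNPCKHDQ instructions combine doublewords. The names pack_words, merge_low_pairs, merge_high_pairs and check_non_interleaved_unpack are hypothetical helpers introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a 64-bit packed-word value, w0 in bits 0..15. */
static uint64_t pack_words(uint16_t w0, uint16_t w1, uint16_t w2, uint16_t w3)
{
    return (uint64_t)w0 | ((uint64_t)w1 << 16) |
           ((uint64_t)w2 << 32) | ((uint64_t)w3 << 48);
}

/* Low word pair of source1 -> low 32 bits of the result,
 * low word pair of source2 -> high 32 bits (models PUNPCKLDQ). */
static uint64_t merge_low_pairs(uint64_t source1, uint64_t source2)
{
    return (source1 & UINT64_C(0x00000000FFFFFFFF)) | (source2 << 32);
}

/* High word pair of source1 -> low 32 bits of the result,
 * high word pair of source2 -> high 32 bits (models PUNPCKHDQ). */
static uint64_t merge_high_pairs(uint64_t source1, uint64_t source2)
{
    return (source1 >> 32) | (source2 & UINT64_C(0xFFFFFFFF00000000));
}

/* Small self-check with made-up word values. */
static void check_non_interleaved_unpack(void)
{
    uint64_t a = pack_words(0x10, 0x11, 0x12, 0x13);
    uint64_t b = pack_words(0x20, 0x21, 0x22, 0x23);

    /* Adjacent words stay adjacent; nothing is interleaved. */
    assert(merge_low_pairs(a, b)  == pack_words(0x10, 0x11, 0x20, 0x21));
    assert(merge_high_pairs(a, b) == pack_words(0x12, 0x13, 0x22, 0x23));
}
```

In each result the two words from source1 remain next to each other in the low half and the two words from source2 remain next to each other in the high half, which is the property that distinguishes this merge from a word-level unpack.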
Example 4-5  Interleaved Pack without Saturation

; Input:
;    MM0    signed source value
;    MM1    signed source value
; Output:
;    MM0    the first and third words contain the
;           low 16-bits of the doublewords in MM0,
;           the second and fourth words contain the
;           low 16-bits of the doublewords in MM1

pslld  MM1, 16                 ; shift the 16 LSB from each of the
                               ; doubleword values to the 16 MSB
                               ; position
pand   MM0, {0,ffff,0,ffff}    ; mask to zero the 16 MSB
                               ; of each doubleword value
por    MM0, MM1                ; merge the two operands
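To make the data movement in Example 4-5 easier to follow, the three instructions can be mirrored in portable C on 64-bit integers. This is a rough model under stated assumptions rather than a replacement for the MMX code. It assumes the mask {0,ffff,0,ffff} lists words from most significant to least significant, which gives the constant 0x0000FFFF0000FFFF and matches the comment about zeroing the 16 MSB of each doubleword. The function names pslld16 and interleaved_pack_no_sat are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "pslld reg, 16": shift each 32-bit lane left by 16 independently. */
static uint64_t pslld16(uint64_t v)
{
    uint32_t lo = (uint32_t)v << 16;          /* low doubleword lane  */
    uint32_t hi = (uint32_t)(v >> 32) << 16;  /* high doubleword lane */
    return (uint64_t)lo | ((uint64_t)hi << 32);
}

/* Model of Example 4-5: keep the low 16 bits of every doubleword in
 * mm0 and mm1 and interleave them, with no saturation applied. */
static uint64_t interleaved_pack_no_sat(uint64_t mm0, uint64_t mm1)
{
    mm1 = pslld16(mm1);                   /* pslld MM1, 16 */
    mm0 &= UINT64_C(0x0000FFFF0000FFFF);  /* pand  MM0, {0,ffff,0,ffff} (assumed word order) */
    return mm0 | mm1;                     /* por   MM0, MM1 */
}

/* Small self-check with made-up doubleword values. */
static void check_interleaved_pack(void)
{
    uint64_t mm0 = UINT64_C(0x3333444411112222); /* doublewords 0x11112222, 0x33334444 */
    uint64_t mm1 = UINT64_C(0xCCCCDDDDAAAABBBB); /* doublewords 0xAAAABBBB, 0xCCCCDDDD */

    /* Words, low to high: 0x2222 (MM0), 0xBBBB (MM1), 0x4444 (MM0), 0xDDDD (MM1). */
    assert(interleaved_pack_no_sat(mm0, mm1) == UINT64_C(0xDDDD4444BBBB2222));
}
```

The upper 16 bits of each input doubleword are simply discarded, so values that do not fit in 16 bits are truncated rather than clamped.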
Source: IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006