IA-32 Intel® Architecture Optimization
Data Deswizzling
In the deswizzle operation, we want to arrange the SoA format back into AoS format so the xxxx, yyyy, zzzz are rearranged and stored in memory as xyz. To do this we can use the unpcklps/unpckhps instructions to regenerate the xyxy layout and then store each half (xy) into its corresponding memory location using movlps/movhps, followed by another movlps/movhps to store the z component.
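The same sequence can be sketched with SSE intrinsics instead of inline assembly. This is a minimal illustration, not the manual's code: the Vertex_soa and Vertex_aos layouts below are assumed (four floats per component in SoA, one x/y/z/w record per vertex in AoS), the function names deswizzle_sse and deswizzle_scalar are hypothetical, and z is paired with w in the same way x is paired with y, which is an assumption about how the z store is carried out. Unaligned loads are used so the sketch does not depend on 16-byte alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE: _mm_unpacklo_ps, _mm_storel_pi, ... */

/* Assumed layouts (hypothetical; the manual's definitions are not shown here). */
typedef struct { float x[4], y[4], z[4], w[4]; } Vertex_soa; /* 4 vertices, SoA */
typedef struct { float x, y, z, w; } Vertex_aos;             /* 1 vertex, AoS   */

/* Scalar reference: out[i] = (x[i], y[i], z[i], w[i]). */
static void deswizzle_scalar(const Vertex_soa *in, Vertex_aos out[4])
{
    for (int i = 0; i < 4; ++i) {
        out[i].x = in->x[i];
        out[i].y = in->y[i];
        out[i].z = in->z[i];
        out[i].w = in->w[i];
    }
}

/* SSE sketch: interleave with unpcklps/unpckhps, then store 8-byte halves
   with movlps/movhps (the intrinsics below map to those instructions). */
static void deswizzle_sse(const Vertex_soa *in, Vertex_aos out[4])
{
    assert(in != NULL && out != NULL);

    __m128 x = _mm_loadu_ps(in->x);        /* x1 x2 x3 x4 */
    __m128 y = _mm_loadu_ps(in->y);        /* y1 y2 y3 y4 */
    __m128 z = _mm_loadu_ps(in->z);        /* z1 z2 z3 z4 */
    __m128 w = _mm_loadu_ps(in->w);        /* w1 w2 w3 w4 */

    __m128 xy_lo = _mm_unpacklo_ps(x, y);  /* x1 y1 x2 y2 */
    __m128 xy_hi = _mm_unpackhi_ps(x, y);  /* x3 y3 x4 y4 */
    __m128 zw_lo = _mm_unpacklo_ps(z, w);  /* z1 w1 z2 w2 */
    __m128 zw_hi = _mm_unpackhi_ps(z, w);  /* z3 w3 z4 w4 */

    /* Low half goes to one vertex, high half to the next. */
    _mm_storel_pi((__m64 *)&out[0].x, xy_lo);  /* v1 = x1 y1 -- -- */
    _mm_storeh_pi((__m64 *)&out[1].x, xy_lo);  /* v2 = x2 y2 -- -- */
    _mm_storel_pi((__m64 *)&out[2].x, xy_hi);  /* v3 = x3 y3 -- -- */
    _mm_storeh_pi((__m64 *)&out[3].x, xy_hi);  /* v4 = x4 y4 -- -- */

    _mm_storel_pi((__m64 *)&out[0].z, zw_lo);  /* v1 = x1 y1 z1 w1 */
    _mm_storeh_pi((__m64 *)&out[1].z, zw_lo);  /* v2 = x2 y2 z2 w2 */
    _mm_storel_pi((__m64 *)&out[2].z, zw_hi);  /* v3 = x3 y3 z3 w3 */
    _mm_storeh_pi((__m64 *)&out[3].z, zw_hi);  /* v4 = x4 y4 z4 w4 */
}
```

Calling deswizzle_sse(&soa, aos) with aos declared as Vertex_aos aos[4] fills the four vertices in one pass; the scalar function is included only as a reference to check the result against.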
Example 5-5 illustrates the deswizzle function:
Example 5-5 Deswizzling Single-Precision SIMD Data
void deswizzle_asm(Vertex_soa *in, Vertex_aos *out)
{
    __asm {
        mov      ecx, in            // load structure addresses
        mov      edx, out
        movaps   xmm7, [ecx]        // load x1 x2 x3 x4 => xmm7
        movaps   xmm6, [ecx+16]     // load y1 y2 y3 y4 => xmm6
        movaps   xmm5, [ecx+32]     // load z1 z2 z3 z4 => xmm5
        movaps   xmm4, [ecx+48]     // load w1 w2 w3 w4 => xmm4
        // START THE DESWIZZLING HERE
        movaps   xmm0, xmm7         // xmm0= x1 x2 x3 x4
        unpcklps xmm7, xmm6         // xmm7= x1 y1 x2 y2
        movlps   [edx], xmm7        // v1 = x1 y1 -- --
        movhps   [edx+16], xmm7     // v2 = x2 y2 -- --
        unpckhps xmm0, xmm6         // xmm0= x3 y3 x4 y4
        movlps   [edx+32], xmm0     // v3 = x3 y3 -- --
        movhps   [edx+48], xmm0     // v4 = x4 y4 -- --
        movaps   xmm0, xmm5         // xmm0= z1 z2 z3 z4
continued