IA-32 Intel® Architecture Optimization
5-12
Example 5-4 shows the same data -swizzling algorithm encoded using
the Intel C++ Compiler’s intrinsics for SSE.
Example 5-4
Swizzling Data Using Intrinsics
//Intrinsics version of data swizzle
void swizzle_intrin (Vertex_aos *in, Vertex_soa *out, int stride)
{
__m128 x, y, z, w;
__m128 tmp;
x = _mm_loadl_pi(x,(__m64 *)(in));
x = _mm_loadh_pi(x,(__m64 *)( (char *)(in)));
y = _mm_loadl_pi(y,(__m64 *)(2*(char *)(in)));
y = _mm_loadh_pi(y,(__m64 *)(3*(char *)(in)));
tmp = _mm_shuffle_ps( x, y, _MM_SHUFFLE( 2, 0, 2, 0));
y = _mm_shuffle_ps( x, y, _MM_SHUFFLE( 3, 1, 3, 1));
x = tmp;
z = _mm_loadl_pi(z,(__m64 *)(8 + (char *)(in)));
z = _mm_loadh_pi(z,(__m64 *)(8+(char *)(in)));
w = _mm_loadl_pi(w,(__m64 *)(2*8+(char*)(in)));
w = _mm_loadh_pi(w,(__m64 *)(3*8+(char*)(in)));
tmp = _mm_shuffle_ps( z, w, _MM_SHUFFLE( 2, 0, 2, 0));
w = _mm_shuffle_ps( z, w, _MM_SHUFFLE( 3, 1, 3, 1));
z = tmp;
_mm_store_ps(&out->x[0], x);
_mm_store_ps(&out->y[0], y);
_mm_store_ps(&out->z[0], z);
_mm_store_ps(&out->w[0], w);
}
Summary of Contents for ARCHITECTURE IA-32
Page 1: ...IA 32 Intel Architecture Optimization Reference Manual Order Number 248966 013US April 2006...
Page 220: ...IA 32 Intel Architecture Optimization 3 40...
Page 434: ...IA 32 Intel Architecture Optimization 9 20...
Page 514: ...IA 32 Intel Architecture Optimization B 60...
Page 536: ...IA 32 Intel Architecture Optimization C 22...