Optimizing for SIMD Integer Applications
The subtraction operation presented above is an absolute difference, that
is, t = abs(x - y). The byte values are stored in temporary space, all
values are summed together, and the result is written into the lower
word of the destination register.
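The steps described above can be sketched as a scalar C model. This is an illustrative sketch only, assuming the 64-bit (eight-byte) form of the instruction; the function name psadbw_model is hypothetical and is not part of any Intel header or library.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar model of the 64-bit psadbw form.
 * Each pair of unsigned bytes yields t = abs(x - y); the eight
 * temporaries are summed, and the sum lands in the low word of the
 * result while the upper three words stay zero. */
static uint64_t psadbw_model(const uint8_t x[8], const uint8_t y[8])
{
    uint16_t sum = 0;  /* worst case is 8 * 255 = 2040, so 16 bits suffice */

    for (int i = 0; i < 8; i++) {
        int t = abs((int)x[i] - (int)y[i]);  /* absolute difference per byte */
        sum = (uint16_t)(sum + t);
    }
    return (uint64_t)sum;  /* bits 63..16 are zero */
}
```

Because the difference is absolute, swapping the two operands gives the same result.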
Packed Average (Byte/Word)
The pavgb and pavgw instructions add the unsigned data elements of the
source operand to the unsigned data elements of the destination register,
along with a carry-in. The results of the addition are then each
independently shifted to the right by one bit position. The high order
bits of each element are filled with the carry bits of the corresponding
sum.
The destination operand is an SIMD register. The source operand can
either be an SIMD register or a memory operand.
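The per-element arithmetic can be sketched in scalar C as follows. This is a minimal model, not the instruction itself: the names pavgb_lane, pavgw_lane, and pavgb_model are hypothetical, and the sketch assumes the carry-in is 1, so that each result is the average rounded up.

```c
#include <stdint.h>

/* Hypothetical model of one pavgb element: (a + b + 1) >> 1.
 * The sum is held in a wider type so the carry out of bit 7 is kept
 * and becomes the high-order bit of the result after the shift. */
static uint8_t pavgb_lane(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b + 1u;  /* up to 9 bits */
    return (uint8_t)(sum >> 1);
}

/* Same model for one pavgw element, with a 17-bit intermediate sum. */
static uint16_t pavgw_lane(uint16_t a, uint16_t b)
{
    uint32_t sum = (uint32_t)a + (uint32_t)b + 1u;
    return (uint16_t)(sum >> 1);
}

/* Applies the byte model to all eight elements of a 64-bit operand,
 * writing the averages back into dst as the instruction does. */
static void pavgb_model(uint8_t dst[8], const uint8_t src[8])
{
    for (int i = 0; i < 8; i++)
        dst[i] = pavgb_lane(dst[i], src[i]);
}
```

Keeping the intermediate sum in a wider type is what prevents overflow: averaging 255 and 255 still yields 255 rather than wrapping.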
Figure 4-9 PSADBW Instruction Example
MM/m64 (bits 63..0): X8 X7 X6 X5 X4 X3 X2 X1
MM (bits 63..0): Y8 Y7 Y6 Y5 Y4 Y3 Y2 Y1
Temp (bits 63..0): T8 T7 T6 T5 T4 T3 T2 T1, where Ti = abs(Xi - Yi)
MM result (bits 63..0): 0..0 | 0..0 | 0..0 | T1+T2+T3+T4+T5+T6+T7+T8
IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006