Optimizing for SIMD Integer Applications
Highly Efficient Clipping
For clipping signed words to an arbitrary range, the pmaxsw and pminsw instructions may be used. For clipping unsigned bytes to an arbitrary range, the pmaxub and pminub instructions may be used. Example 4-19 shows how to clip signed words to an arbitrary range; the code for clipping unsigned bytes is similar.
Example 4-19 Clipping to a Signed Range of Words [high, low]
; Input:
;   MM0  signed source operands
; Output:
;   MM0  signed words clipped to the signed
;        range [high, low]
    pminsw  MM0, packed_high
    pmaxsw  MM0, packed_low
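The effect of Example 4-19 on each word can be checked with a scalar model. The following is a minimal sketch in C, not code from this manual: the names min_s16, max_s16, and clip_words are hypothetical, and the loop processes one word per iteration where the MMX instructions process four words at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-lane model of pminsw: minimum of two signed 16-bit values. */
static int16_t min_s16(int16_t a, int16_t b) { return a < b ? a : b; }

/* Per-lane model of pmaxsw: maximum of two signed 16-bit values. */
static int16_t max_s16(int16_t a, int16_t b) { return a > b ? a : b; }

/* Clip n signed words in place to the range low..high (hypothetical helper). */
void clip_words(int16_t *v, size_t n, int16_t low, int16_t high)
{
    assert(low <= high);                 /* the range must not be empty */
    for (size_t i = 0; i < n; i++) {
        v[i] = min_s16(v[i], high);      /* pminsw MM0, packed_high */
        v[i] = max_s16(v[i], low);       /* pmaxsw MM0, packed_low  */
    }
}
```

The same structure would apply to unsigned bytes by substituting uint8_t and unsigned comparisons, which is what pminub and pmaxub perform per byte.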
Example 4-20 Clipping to an Arbitrary Signed Range [high, low]
; Input:
;   MM0  signed source operands
; Output:
;   MM0  signed operands clipped to the signed
;        range [high, low]
    paddw   MM0, packed_min                         ; add with no saturation
                                                    ; 0x8000 to convert to unsigned
    paddusw MM0, (packed_usmax - high_us)           ; in effect this clips to high
    psubusw MM0, (packed_usmax - high_us + low_us)  ; in effect this clips to low
    paddw   MM0, packed_low                         ; undo the previous two offsets
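The offset arithmetic in Example 4-20 can be traced one word at a time. The sketch below is a scalar C model written for illustration, not code from this manual. It assumes that packed_min holds 0x8000 in every word (as the comment in the example states), that packed_usmax holds 0xFFFF, and that high_us and low_us are the bounds after the same 0x8000 offset. The names add_us16, sub_us16, and clip_word_offset are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Per-lane model of paddusw: unsigned add that saturates at 0xFFFF. */
static uint16_t add_us16(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;
    return s > 0xFFFFu ? (uint16_t)0xFFFFu : (uint16_t)s;
}

/* Per-lane model of psubusw: unsigned subtract that saturates at 0. */
static uint16_t sub_us16(uint16_t a, uint16_t b)
{
    return a > b ? (uint16_t)(a - b) : (uint16_t)0;
}

/* Clip one signed word to low..high using the four-step offset sequence. */
int16_t clip_word_offset(int16_t x, int16_t low, int16_t high)
{
    assert(low <= high);

    /* Bounds shifted into unsigned order (assumed meaning of high_us, low_us). */
    uint16_t high_us = (uint16_t)((uint16_t)high + 0x8000u);
    uint16_t low_us  = (uint16_t)((uint16_t)low  + 0x8000u);

    /* paddw MM0, packed_min: wrapping add of 0x8000 converts to unsigned order. */
    uint16_t u = (uint16_t)((uint16_t)x + 0x8000u);

    /* paddusw: values above high_us saturate at 0xFFFF, which clips to high. */
    u = add_us16(u, (uint16_t)(0xFFFFu - high_us));

    /* psubusw: values below low_us saturate at 0, which clips to low. */
    u = sub_us16(u, (uint16_t)(0xFFFFu - high_us + low_us));

    /* paddw MM0, packed_low: u is now 0..(high - low), so adding low lands
       in low..high and the 16-bit wrapping add cannot overflow that range. */
    return (int16_t)((int32_t)u + low);
}
```

For example, with low = -3 and high = 10, an input of -100 becomes 0x7F9C after the first step, 0xFF91 after the saturating add, 0 after the saturating subtract, and -3 after the final add.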
IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006