Optimizing for SIMD Integer Applications
Using SIMD Integer with x87 Floating-point
All 64-bit SIMD integer instructions use the MMX registers, which
share register state with the x87 floating-point stack. Because of this
sharing, certain rules and considerations apply. Instructions that use
the MMX registers cannot be freely intermixed with x87 floating-point
instructions. Take care when switching between 64-bit
SIMD integer instructions and x87 floating-point instructions (see the
“Using the EMMS Instruction” section below).
The SIMD floating-point operations and 128-bit SIMD integer
operations can be freely intermixed with either x87 floating-point
operations or 64-bit SIMD integer operations. The SIMD floating-point
operations and 128-bit SIMD integer operations use registers that are
unrelated to the x87 FP / MMX registers. The emms instruction is not
needed to transition to or from SIMD floating-point operations or
128-bit SIMD operations.
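As a minimal sketch of the point above, the following C fragment performs a 128-bit SIMD integer add and then continues directly with floating-point arithmetic, with no emms in between. The function name add8_then_average is hypothetical, and the sketch assumes an x86 target with SSE2 and a GCC- or Clang-style compiler, where long double arithmetic is typically carried out on the x87 unit.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_add_epi16, ... */

/* Hypothetical helper (not from the manual): add eight 16-bit lanes in an
   XMM register, then average the results in floating point. */
long double add8_then_average(const int16_t a[8], const int16_t b[8])
{
    assert(a && b);

    int16_t sum[8];
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)sum, _mm_add_epi16(va, vb)); /* PADDW on XMM */

    /* No _mm_empty() here: the XMM registers do not alias the x87/MMX
       register state, so the x87 tag word is left untouched. */
    long double total = 0.0L;  /* assumption: long double maps to x87 */
    for (int i = 0; i < 8; ++i)
        total += sum[i];
    return total / 8.0L;
}
```

For inputs {1, ..., 8} and {10, 20, ..., 80}, the lane sums total 396, so the function returns 49.5.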
Using the EMMS Instruction
When generating 64-bit SIMD integer code, keep in mind that the eight
MMX registers are aliased onto the x87 floating-point registers.
Switching from MMX instructions to x87 floating-point instructions
incurs a finite delay, so it is best to minimize switching between
these instruction types. When you do need to switch, the emms
instruction provides an efficient means to clear the x87 stack so that
subsequent x87 code can operate on it properly.
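The switch described above might look like the following sketch in C, using the _mm_empty() intrinsic, which emits emms. The function name add4_then_average is hypothetical. The sketch assumes an x86 target and a GCC- or Clang-style compiler that provides mmintrin.h; some recent compilers implement these intrinsics with SSE registers on 64-bit targets, in which case the emms is harmless but still emitted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <mmintrin.h>   /* MMX intrinsics: __m64, _mm_add_pi16, _mm_empty */

/* Hypothetical helper (not from the manual): add four 16-bit lanes with a
   64-bit SIMD integer instruction, then average them in floating point. */
long double add4_then_average(const int16_t a[4], const int16_t b[4])
{
    assert(a && b);

    __m64 va, vb, vsum;
    int16_t sum[4];

    memcpy(&va, a, sizeof va);        /* load 4 x int16 into a 64-bit vector */
    memcpy(&vb, b, sizeof vb);
    vsum = _mm_add_pi16(va, vb);      /* PADDW on MMX registers */
    memcpy(sum, &vsum, sizeof sum);   /* store the lanes back to memory */

    _mm_empty();                      /* EMMS: mark all x87 registers empty */

    /* From here on, x87 code can use the register stack again.
       Assumption: long double arithmetic is done on the x87 unit. */
    long double total = 0.0L;
    for (int i = 0; i < 4; ++i)
        total += sum[i];
    return total / 4.0L;
}
```

Placing a single _mm_empty() at the end of the MMX work, rather than one after every MMX operation, keeps the number of transitions low, in line with the advice to minimize switching. For inputs {1, 2, 3, 4} and {10, 20, 30, 40} the function returns 27.5.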
As soon as any instruction makes reference to an MMX register, all
valid bits in the x87 floating-point tag word are set, which implies that
all x87 registers contain valid values. In order for software to operate
correctly, the x87 floating-point stack should be emptied when starting a
series of x87 floating-point calculations after operating on the MMX
registers.
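The effect on the tag word can be illustrated with a toy software model. This is only a sketch: it does not touch the real FPU state, and the type X87Model and all function names are hypothetical. It relies on the x87 tag encoding, in which each register has a two-bit tag with 00 meaning valid and 11 meaning empty.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the 16-bit x87 tag word (two bits per register).
   Illustration only; it does not read or write the real FPU state. */
typedef struct { uint16_t tag_word; } X87Model;

enum { TAG_ALL_VALID = 0x0000, TAG_ALL_EMPTY = 0xFFFF };

/* State after FPU initialization: every register is tagged empty. */
void model_init(X87Model *m)       { assert(m); m->tag_word = TAG_ALL_EMPTY; }

/* Any MMX register reference marks all eight registers as valid. */
void model_mmx_access(X87Model *m) { assert(m); m->tag_word = TAG_ALL_VALID; }

/* EMMS marks all eight registers as empty again. */
void model_emms(X87Model *m)       { assert(m); m->tag_word = TAG_ALL_EMPTY; }

/* Number of registers tagged empty, i.e. available for x87 loads. */
int model_free_slots(const X87Model *m)
{
    assert(m);
    int n = 0;
    for (int r = 0; r < 8; ++r)
        if (((m->tag_word >> (2 * r)) & 3u) == 3u)
            ++n;
    return n;
}
```

In this model, an MMX access drops the number of free slots from eight to zero, so a subsequent x87 load would find no empty register to push into; calling model_emms() restores all eight.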
IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006