IA-32 Instruction Latency and Throughput
Table C-2 Streaming SIMD Extension 2 128-bit Integer Instructions
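| Instruction | Latency¹ (CPUID 0F3n) | Latency¹ (CPUID 0F2n) | Latency¹ (CPUID 0x69n) | Throughput (CPUID 0F3n) | Throughput (CPUID 0F2n) | Throughput (CPUID 0x69n) | Execution Unit² (CPUID 0F2n) |
|---|---|---|---|---|---|---|---|
| CVTDQ2PS³ xmm, xmm | 5 | 5 | | 2 | 2 | | FP_ADD |
| CVTPS2DQ³ xmm, xmm | 5 | 5 | 3+1 | 2 | 2 | 2 | FP_ADD |
| CVTTPS2DQ³ xmm, xmm | 5 | 5 | 3+1 | 2 | 2 | 2 | FP_ADD |
| MOVD xmm, r32 | 6 | 6 | 1 | 2 | 2 | 2 | MMX_MISC, MMX_SHFT |
| MOVD r32, xmm | 10 | 10 | 1+1 | 1 | 1 | 2 | FP_MOVE, FP_MISC |
| MOVDQA xmm, xmm | 6 | 6 | 1 | 1 | 1 | 1 | FP_MOVE |
| MOVDQU xmm, xmm | 6 | 6 | 1 | 1 | 1 | 1 | FP_MOVE |
| MOVDQ2Q mm, xmm | 8 | 8 | 1 | 2 | 2 | 1 | FP_MOVE, MMX_ALU |
| MOVQ2DQ xmm, mm | 8 | 8 | 1 | 2 | 2 | 1 | FP_MOVE, MMX_SHFT |
| MOVQ xmm, xmm | 2 | 2 | 1 | 2 | 2 | 1 | MMX_SHFT |
| PACKSSWB/PACKSSDW/PACKUSWB xmm, xmm | 4 | 4 | 2+1 | 2 | 2 | 2 | MMX_SHFT |
| PADDB/PADDW/PADDD xmm, xmm | 2 | 2 | 1 | 2 | 2 | 1 | MMX_ALU |
| PADDSB/PADDSW/PADDUSB/PADDUSW xmm, xmm | 2 | 2 | 1 | 2 | 2 | 1 | MMX_ALU |
| PADDQ mm, mm | 2 | 2 | 2 | 1 | 1 | 1 | FP_MISC |
| PSUBQ mm, mm | 2 | 2 | 2+1 | 1 | 1 | 2 | FP_MISC |
| PADDQ/PSUBQ³ xmm, xmm | 6 | 6 | 2+1 | 2 | 2 | 2 | FP_MISC |
| PAND xmm, xmm | 2 | 2 | 1 | 2 | 2 | 1 | MMX_ALU |
| PANDN xmm, xmm | 2 | 2 | 1 | 2 | 2 | 1 | MMX_ALU |
| PAVGB/PAVGW xmm, xmm | 2 | 2 | | 2 | 2 | | MMX_ALU |
| PCMPEQB/PCMPEQD/PCMPEQW xmm, xmm | 2 | 2 | 1 | 2 | 2 | 1 | MMX_ALU |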
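As a rough illustration of how the figures above might be used, the following Python sketch transcribes four rows of Table C-2 for the CPUID 0F2n columns and derives two back-of-the-envelope estimates. The dictionary name, the function names, and the tuple layout are hypothetical and not part of the manual. The sketch also assumes that latency is the number of clock cycles until a result is available and that throughput is the number of clock cycles to wait before the same instruction can be issued again; this section does not define either term, so treat both readings as assumptions. Real code is affected by port contention, memory operands, and out-of-order execution, so the numbers are only a first approximation.

```python
# Hypothetical lookup table: (latency, throughput) in clock cycles,
# transcribed from the CPUID 0F2n columns of Table C-2.
TABLE_C2_0F2N = {
    "MOVD xmm, r32": (6, 2),
    "MOVDQA xmm, xmm": (6, 1),
    "PADDB/PADDW/PADDD xmm, xmm": (2, 2),
    "PAND xmm, xmm": (2, 2),
}


def dependent_chain_latency(instructions):
    """Sum the listed latencies for a chain in which every instruction
    consumes the result of the previous one (assumed worst case)."""
    return sum(TABLE_C2_0F2N[name][0] for name in instructions)


def back_to_back_issue_cycles(instruction, count):
    """Estimate the cycles needed to issue `count` independent copies of
    one instruction, assuming throughput is the wait between issues."""
    return TABLE_C2_0F2N[instruction][1] * count


chain = ["MOVD xmm, r32", "PADDB/PADDW/PADDD xmm, xmm", "PAND xmm, xmm"]
print(dependent_chain_latency(chain))  # 6 + 2 + 2 = 10
print(back_to_back_issue_cycles("PADDB/PADDW/PADDD xmm, xmm", 4))  # 2 * 4 = 8
```

Only rows with a value in every column were used here, so the estimates do not depend on how the empty cells of the table are interpreted.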
continued
Source: IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006.