General Optimization Guidelines
Alignment of code is less of an issue for the Pentium 4 processor.
Alignment of branch targets to maximize bandwidth of fetching cached
instructions is an issue only when not executing out of the trace cache.
Alignment of code can be an issue for the Pentium M processor, and
alignment of branch targets will improve decoder throughput.
Example 2-11 Code That Causes Cache Line Split
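        mov     esi, 029e70feh          ; source: 2 bytes before line boundary 029e7100h
        mov     edi, 05be5260h          ; destination: DWORD-aligned
Blockmove:
        mov     eax, DWORD PTR [esi]    ; splits a cache line when [esi] straddles a boundary
        mov     ebx, DWORD PTR [esi+4]
        mov     DWORD PTR [edi], eax
        mov     DWORD PTR [edi+4], ebx
        add     esi, 8                  ; advance both pointers by two DWORDs
        add     edi, 8
        sub     edx, 1                  ; edx = remaining iterations (not initialized in this excerpt)
        jnz     Blockmove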
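Figure 2-1 Cache Line Split in Accessing Elements in an Array
[Figure: DWORD array elements starting at address 029e70feh, laid out across
three cache lines. Index 0 begins in line 029e70c0h and continues into line
029e7100h, which also holds Index 1 through Index 15 and the start of Index 16.
Index 16 continues into line 029e7140h, followed by Index 17 through Index 31
and Index 32. Labeled addresses: 029e70feh and 029e70c1h.]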
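The split pattern in Example 2-11 and Figure 2-1 can be checked with simple address arithmetic. The following is a minimal sketch in C, not code from the manual: the helper names `crosses_cache_line` and `count_split_dwords` are hypothetical, and the 64-byte line size is an assumption inferred from the line addresses 029e70c0h, 029e7100h and 029e7140h shown in the figure.

```c
#include <stdint.h>

/* Assumption: 64-byte cache lines, inferred from the spacing of the
   line addresses in Figure 2-1 (029e70c0h, 029e7100h, 029e7140h). */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: returns 1 if an access of `size` bytes starting
   at `addr` touches two cache lines, 0 if it stays within one line. */
int crosses_cache_line(uint32_t addr, uint32_t size)
{
    uint32_t first_line = addr / CACHE_LINE_SIZE;
    uint32_t last_line  = (addr + size - 1u) / CACHE_LINE_SIZE;
    return first_line != last_line;
}

/* Hypothetical helper: counts how many of `count` consecutive DWORD
   (4-byte) elements starting at `base` straddle a line boundary. */
unsigned count_split_dwords(uint32_t base, unsigned count)
{
    unsigned splits = 0;
    for (unsigned i = 0; i < count; ++i) {
        if (crosses_cache_line(base + 4u * i, 4u)) {
            ++splits;
        }
    }
    return splits;
}
```

Under these assumptions, the source pointer 029e70feh from Example 2-11 gives a split at element indices 0, 16 and 32, which matches the elements drawn across line boundaries in Figure 2-1. The destination pointer 05be5260h is DWORD-aligned, so none of its 4-byte stores cross a line.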
IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006