IA-32 Intel® Architecture Optimization
Assembly/Compiler Coding Rule 16. (H impact, H generality) Align data
on natural operand size address boundaries. If the data will be accessed with
vector instruction loads and stores, align the data on 16-byte boundaries.
For best performance, align data as follows:
• Align 8-bit data at any address.
• Align 16-bit data to be contained within an aligned four byte word.
• Align 32-bit data so that its base address is a multiple of four.
• Align 64-bit data so that its base address is a multiple of eight.
• Align 80-bit data so that its base address is a multiple of sixteen.
• Align 128-bit data so that its base address is a multiple of sixteen.
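As an illustration of the list above, the following is a minimal C11 sketch of how these alignments might be requested from a compiler. The array names and the is_aligned helper are hypothetical and chosen for this example only; alignas comes from the standard <stdalign.h> header.

```c
#include <assert.h>    /* assert, for checking the results */
#include <stdalign.h>  /* alignas (C11) */
#include <stdint.h>    /* fixed-width integer types, uintptr_t */

/* Hypothetical buffers; the names are illustrative only. */
alignas(4)  uint32_t counters[16];    /* 32-bit data on a 4-byte boundary     */
alignas(8)  uint64_t timestamps[16];  /* 64-bit data on an 8-byte boundary    */
alignas(16) float    vec4[4];         /* 128-bit operand on a 16-byte boundary */

/* Returns 1 if p is a multiple of boundary, which must be a power of two. */
static inline int is_aligned(const void *p, uintptr_t boundary)
{
    return ((uintptr_t)p & (boundary - 1)) == 0;
}
```

A call such as is_aligned(vec4, 16) can be placed in a debug assertion to confirm that a buffer passed to vector loads and stores meets the 16-byte requirement.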
A 64-byte or greater data structure or array should be aligned so that its
base address is a multiple of 64. Sorting data in decreasing size order is
one heuristic for assisting with natural alignment. As long as 16-byte
boundaries (and cache lines) are never crossed, natural alignment is not
strictly necessary, though it is an easy way to enforce this.
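The size-ordering heuristic can be sketched in C as follows. The structure and member names are hypothetical, and the exact padding depends on the compiler and ABI, so the sketch makes no claim about specific sizes. The lookup_table array shows one way to place a large array on a 64-byte boundary.

```c
#include <assert.h>    /* assert, for checking the results */
#include <stdalign.h>  /* alignas (C11) */
#include <stddef.h>    /* offsetof */
#include <stdint.h>    /* fixed-width integer types */

/* Members in arbitrary order: the compiler may insert padding
   before total and before count to keep them naturally aligned. */
struct unsorted {
    uint8_t  flag;
    uint64_t total;
    uint16_t id;
    uint32_t count;
};

/* The same members sorted in decreasing size order: each member
   already falls on a multiple of its own size. */
struct sorted {
    uint64_t total;
    uint32_t count;
    uint16_t id;
    uint8_t  flag;
};

/* A 64-byte or larger array placed on a 64-byte boundary. */
alignas(64) uint8_t lookup_table[256];
```

With common x86 compilers, the sorted layout is typically no larger than the unsorted one, and often smaller; comparing sizeof(struct sorted) with sizeof(struct unsorted) on the target compiler shows the actual effect.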
Example 2-11 shows the type of code that can cause a cache line split.
The code loads the addresses of two dword arrays. The address 029e70feh
is not 4-byte aligned, so a 4-byte access at this address will get 2 bytes
from the cache line this address is contained in, and 2 bytes from the
cache line that starts at 029e7100h. On processors with 64-byte cache
lines, a similar cache line split will occur every 8 iterations. Figure 2-1
illustrates the situation of accessing a data element that spans a
cache line boundary.
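The arithmetic behind the split can be checked with a short C sketch. This is not Example 2-11 itself. The function names are hypothetical, the 64-byte line size is taken from the text, and the loop shape in count_splits (two dword loads per iteration, advancing 8 bytes each time) is an assumption made here to reproduce the "every 8 iterations" figure.

```c
#include <assert.h>  /* assert, for checking the results */
#include <stdint.h>  /* uint32_t */

#define CACHE_LINE 64u  /* assumed cache line size in bytes */

/* Returns 1 if an access of size bytes at addr spans two cache lines. */
static inline int splits_cache_line(uint32_t addr, uint32_t size)
{
    uint32_t offset = addr & (CACHE_LINE - 1);  /* position within the line */
    return offset + size > CACHE_LINE;
}

/* Number of bytes the access takes from the line that contains addr. */
static inline uint32_t bytes_in_first_line(uint32_t addr, uint32_t size)
{
    uint32_t room = CACHE_LINE - (addr & (CACHE_LINE - 1));
    return size < room ? size : room;
}

/* Counts split accesses for an assumed loop that performs two dword
   loads per iteration (at p and p + 4) and advances p by 8 bytes. */
static inline unsigned count_splits(uint32_t base, unsigned iterations)
{
    unsigned splits = 0;
    for (unsigned i = 0; i < iterations; i++) {
        uint32_t p = base + 8u * i;
        splits += (unsigned)splits_cache_line(p, 4);
        splits += (unsigned)splits_cache_line(p + 4, 4);
    }
    return splits;
}
```

For the address in the text, 029e70feh lies at offset 62 within its 64-byte line, so bytes_in_first_line(0x029e70fe, 4) yields 2 and the remaining 2 bytes come from the line at 029e7100h. Under the assumed 8-byte stride, count_splits(0x029e70fe, 64) yields 8, which is one split every 8 iterations.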
IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006