Optimizing Cache Usage
The prefetch instruction is implementation-specific; applications need
to be tuned to each implementation to maximize performance.
The prefetch instructions merely provide a hint to the hardware, and
they will not generate exceptions or faults except for a few special cases
(see the “Prefetch and Load Instructions” section). However, excessive
use of prefetch instructions may waste memory bandwidth and result in
a performance penalty due to resource constraints.
Nevertheless, the prefetch instructions can lessen the overhead of
memory transactions by preventing cache pollution and by using the
caches and memory efficiently. This is particularly important for
applications that share critical system resources, such as the memory
bus. See an example in the “Video Encoder” section.
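
One way to limit the excessive prefetching warned about above is to issue
a single hint per cache line rather than one per element. The following
is a minimal sketch, not code from this manual. It assumes a 64-byte
cache line and a GCC- or Clang-style compiler that provides
__builtin_prefetch; the names ASSUMED_LINE_BYTES, LINES_AHEAD, and
scale_in_place are hypothetical. Both the line size and a useful
look-ahead distance are implementation-specific and need tuning.

```c
#include <stddef.h>

/* Assumption: GCC/Clang builtin; other compilers get a no-op. */
#if defined(__GNUC__)
#define PREFETCH_HINT(p) __builtin_prefetch((p), 0, 3)
#else
#define PREFETCH_HINT(p) ((void)(p))
#endif

/* Hypothetical tuning values; real ones depend on the implementation. */
enum { ASSUMED_LINE_BYTES = 64, LINES_AHEAD = 4 };

/* Multiplies every element by k, issuing at most one prefetch hint per
 * assumed cache line. Returns the number of hints issued. */
size_t scale_in_place(float *x, size_t n, float k)
{
    const size_t per_line = ASSUMED_LINE_BYTES / sizeof(float);
    size_t hints = 0;

    for (size_t base = 0; base < n; base += per_line) {
        /* Hint only for lines that lie inside the array. */
        size_t ahead = base + (size_t)LINES_AHEAD * per_line;
        if (ahead < n) {
            PREFETCH_HINT(&x[ahead]);
            ++hints;
        }

        /* Process the elements of the current line. */
        size_t end = (base + per_line < n) ? base + per_line : n;
        for (size_t i = base; i < end; ++i)
            x[i] *= k;
    }
    return hints;
}
```

The returned count is there only to make the hint rate visible while
tuning; a production loop would probably drop it.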
The prefetch instructions are mainly designed to improve application
performance by hiding memory latency in the background. If segments
of an application access data in a predictable manner, for example, using
arrays with known strides, then they are good candidates for using
prefetch to improve performance.
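
The known-stride case can be sketched as follows. This is an illustrative
example under stated assumptions, not the manual's own code. It uses the
_mm_prefetch intrinsic with the _MM_HINT_T0 hint from <xmmintrin.h> when
SSE is available and falls back to __builtin_prefetch otherwise. The
names PREFETCH_AHEAD and sum_strided are hypothetical, and the look-ahead
distance of 16 iterations is an arbitrary starting point that must be
tuned for each implementation.

```c
#include <stddef.h>

/* Assumption: SSE intrinsic where available, compiler builtin otherwise. */
#if defined(__SSE__)
#include <xmmintrin.h>
#define PREFETCH_HINT(p) _mm_prefetch((const char *)(p), _MM_HINT_T0)
#elif defined(__GNUC__)
#define PREFETCH_HINT(p) __builtin_prefetch((p), 0, 3)
#else
#define PREFETCH_HINT(p) ((void)(p))
#endif

/* Hypothetical tuning knob: how many iterations ahead to prefetch. */
enum { PREFETCH_AHEAD = 16 };

/* Sums data[0], data[stride], data[2*stride], ... below index n. */
long long sum_strided(const int *data, size_t n, size_t stride)
{
    long long total = 0;

    if (stride == 0)
        return 0;   /* a zero stride would never advance */

    for (size_t i = 0; i < n; i += stride) {
        /* The address needed PREFETCH_AHEAD iterations from now is known
         * in advance, so the hint can be issued early. */
        size_t ahead = i + (size_t)PREFETCH_AHEAD * stride;
        if (ahead < n)
            PREFETCH_HINT(&data[ahead]);

        total += data[i];
    }
    return total;
}
```

The bounds check keeps the hinted address inside the array. Because the
result does not depend on the hints, the loop can be compared with and
without them when measuring.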
Use the prefetch instructions in:
• predictable memory access patterns
• time-consuming innermost loops
• locations where the execution pipeline may stall if data is not
  available
NOTE. Using the prefetch instructions is recommended only if data
does not fit in cache.
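
The note above could be applied as a simple size check before choosing a
prefetching code path. The sketch below is only an illustration: the
cache capacity ASSUMED_CACHE_BYTES is a made-up placeholder, the function
name worth_prefetching is hypothetical, and a real program would obtain
the capacity for the target implementation rather than hard-code it.

```c
#include <stddef.h>

/* Hypothetical cache capacity in bytes; implementation-specific. */
#define ASSUMED_CACHE_BYTES ((size_t)512 * 1024)

/* Returns 1 if the working set is larger than the assumed cache,
 * meaning prefetch hints may be worth trying; returns 0 otherwise. */
int worth_prefetching(size_t element_count, size_t element_size)
{
    if (element_size == 0)
        return 0;   /* empty elements occupy no cache */

    /* Avoid overflow: a product too large for size_t certainly
     * exceeds any cache. */
    if (element_count > (size_t)-1 / element_size)
        return 1;

    return element_count * element_size > ASSUMED_CACHE_BYTES;
}
```

For example, with the placeholder capacity, 1024 four-byte elements fit
in cache and the check returns 0, while 1048576 four-byte elements do
not and the check returns 1.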
Source: IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006.