IA-32 Intel® Architecture Optimization
• avoids the need to access off-chip caches, which can increase the
  realized bandwidth compared to a normal load-miss, which returns
  data to all cache levels
Situations that are less likely to benefit from software prefetch are:
• for cases that are already bandwidth bound, prefetching tends to
  increase bandwidth demands
• prefetching too far ahead can cause cached data to be evicted from
  the caches before it is used in execution
• not prefetching far enough ahead can reduce the ability to overlap
  memory and execution latencies
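The trade-off between prefetching too far ahead and not far enough can be sketched in C. This is a minimal illustration, not code from the manual: the names PREFETCH_T0, PREFETCH_DISTANCE, ELEMS_PER_LINE and sum_with_prefetch are hypothetical, the 64-byte line size is an assumption, and the distance of 64 elements is a placeholder that would have to be tuned by measurement on the target processor.

```c
#include <stddef.h>

/* Hypothetical wrapper: use the SSE prefetch intrinsic where available,
 * otherwise fall back to the GCC/Clang builtin, otherwise do nothing. */
#if defined(__SSE__) || defined(_M_IX86) || defined(_M_X64)
#include <xmmintrin.h>
#define PREFETCH_T0(p) _mm_prefetch((const char *)(p), _MM_HINT_T0)
#elif defined(__GNUC__)
#define PREFETCH_T0(p) __builtin_prefetch((p), 0, 3)
#else
#define PREFETCH_T0(p) ((void)0)
#endif

/* Assumed tuning values, not taken from the manual. */
#define ELEMS_PER_LINE    16  /* floats per cache line, assuming 64-byte lines */
#define PREFETCH_DISTANCE 64  /* how many elements ahead to request */

/* Sum an array, hinting once per (assumed) cache line for data that
 * will be needed PREFETCH_DISTANCE elements later. */
float sum_with_prefetch(const float *a, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        /* One hint per line keeps the number of prefetches low; the
         * bounds check avoids forming an address past the array. */
        if ((i % ELEMS_PER_LINE) == 0 && i + PREFETCH_DISTANCE < n)
            PREFETCH_T0(&a[i + PREFETCH_DISTANCE]);
        sum += a[i];
    }
    return sum;
}
```

A larger PREFETCH_DISTANCE risks the eviction described above, while a smaller one leaves less time to hide memory latency behind the additions. The prefetch is only a hint, so the result is the same with or without it.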
Software prefetches are treated by the processor as a hint to initiate a
request to fetch data from the memory system. They consume resources in
the processor, and using too many prefetches can limit their
effectiveness. Examples of this include prefetching data in a loop for a
reference outside the loop, and prefetching in a basic block that is
frequently executed but seldom precedes the reference that the prefetch
targets.
See also: Chapter 6, “Optimizing Cache Usage.”
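The first misuse named above, prefetching inside a loop for a reference that occurs outside it, might look like the following sketch. The functions sum_scaled_wasteful and sum_scaled_hoisted and the PREFETCH_T0 wrapper are hypothetical names introduced for illustration; both functions compute the same value and differ only in how many prefetch hints they issue.

```c
#include <stddef.h>

/* Hypothetical wrapper: SSE intrinsic, GCC/Clang builtin, or no-op. */
#if defined(__SSE__) || defined(_M_IX86) || defined(_M_X64)
#include <xmmintrin.h>
#define PREFETCH_T0(p) _mm_prefetch((const char *)(p), _MM_HINT_T0)
#elif defined(__GNUC__)
#define PREFETCH_T0(p) __builtin_prefetch((p), 0, 3)
#else
#define PREFETCH_T0(p) ((void)0)
#endif

/* Wasteful: *scale is read only after the loop, yet a hint for it is
 * issued on every iteration and consumes processor resources each time. */
long sum_scaled_wasteful(const int *v, size_t n, const long *scale)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        PREFETCH_T0(scale);
        sum += v[i];
    }
    return sum * *scale;
}

/* Hoisted: a single hint before the loop; the loop body then gives the
 * fetch time to complete before *scale is used. */
long sum_scaled_hoisted(const int *v, size_t n, const long *scale)
{
    long sum = 0;
    PREFETCH_T0(scale);
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum * *scale;
}
```

Whether the hoisted hint helps at all depends on the loop length and on whether the line is already cached; the sketch only shows where the redundant hints come from.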
Automatic hardware prefetch is a feature in the Pentium 4 processor.
It brings cache lines into the unified second-level cache based on prior
reference patterns. See also: Chapter 6, “Optimizing Cache Usage.”
Pros and Cons of Software and Hardware Prefetching. Software
prefetching has the following characteristics:
• handles irregular access patterns, which would not trigger the
  hardware prefetcher
• handles prefetching of short arrays and avoids the hardware
  prefetching start-up delay before initiating the fetches
• must be added to new code, so it does not benefit existing
  applications
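As an illustration of the first characteristic, the sketch below prefetches through an index array, a pattern whose addresses follow no regular stride. It is a sketch under stated assumptions rather than the manual's code: gather_sum, LOOKAHEAD and PREFETCH_T0 are hypothetical names, and the lookahead of 4 entries is an arbitrary placeholder.

```c
#include <stddef.h>

/* Hypothetical wrapper: SSE intrinsic, GCC/Clang builtin, or no-op. */
#if defined(__SSE__) || defined(_M_IX86) || defined(_M_X64)
#include <xmmintrin.h>
#define PREFETCH_T0(p) _mm_prefetch((const char *)(p), _MM_HINT_T0)
#elif defined(__GNUC__)
#define PREFETCH_T0(p) __builtin_prefetch((p), 0, 3)
#else
#define PREFETCH_T0(p) ((void)0)
#endif

#define LOOKAHEAD 4  /* assumed lookahead in index entries; tune by measurement */

/* Sum table[idx[i]] for i in [0, n). The index array is read
 * sequentially, but the table accesses it drives are irregular, so the
 * code looks up a future index and hints for that table entry itself. */
double gather_sum(const double *table, const unsigned *idx, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + LOOKAHEAD < n)
            PREFETCH_T0(&table[idx[i + LOOKAHEAD]]);
        sum += table[idx[i]];
    }
    return sum;
}
```

The software hint works here because the program knows the next addresses in advance from idx, information that a pattern-based hardware prefetcher cannot infer from earlier references.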
IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006