Hardware Prefetch ......................................................................................................... 6-19
Example of Effective Latency Reduction with H/W Prefetch .......................................... 6-20
Example of Latency Hiding with S/W Prefetch Instruction ............................................ 6-22
Software Prefetching Usage Checklist ........................................................................... 6-24
Software Prefetch Scheduling Distance ......................................................................... 6-25
Software Prefetch Concatenation ................................................................................... 6-26
Minimize Number of Software Prefetches ...................................................................... 6-29
Mix Software Prefetch with Computation Instructions .................................................... 6-32
Software Prefetch and Cache Blocking Techniques ....................................................... 6-34
Hardware Prefetching and Cache Blocking Techniques ................................................ 6-39
Single-pass versus Multi-pass Execution ....................................................................... 6-41
Non-temporal Stores and Software Write-Combining ..................................................... 6-43
Cache Management ....................................................................................................... 6-44
Video Encoder .......................................................................................................... 6-45
Video Decoder .......................................................................................................... 6-45
Conclusions from Video Encoder and Decoder Implementation .............................. 6-46
Optimizing Memory Copy Routines .......................................................................... 6-46
TLB Priming .............................................................................................................. 6-47
Using the 8-byte Streaming Stores and Software Prefetch ....................................... 6-48
Using 16-byte Streaming Stores and Hardware Prefetch ......................................... 6-50
Performance Comparisons of Memory Copy Routines ............................................ 6-52
Cache Sharing Using Deterministic Cache Parameters ................................................. 6-55
Cache Sharing in Single-core or Multi-core .................................................................... 6-55
Determine Prefetch Stride Using Deterministic Cache Parameters ............................... 6-56
Multi-Core and Hyper-Threading Technology
Multithreading ................................................................................................................... 7-2
Multitasking Environment ................................................................................................. 7-4
Summary of Contents for ARCHITECTURE IA-32
Page 1: IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006