Using Performance Monitoring Events
The first two metrics use performance counters, and thus can be used to
cause an interrupt upon overflow for sampling. They may also be useful
in cases where it is easier for a tool to read a performance counter
than the time-stamp counter. The time-stamp counter is accessed
via an instruction, RDTSC.
For applications with a significant amount of I/O, there may be two
ratios of interest:
• Non-halted CPI: non-halted clockticks/instructions retired
measures the CPI for the phases where the CPU was being used.
This ratio can be measured on a per-logical-processor basis when
Hyper-Threading Technology is enabled.
• Nominal CPI: time-stamp counter ticks/instructions retired
measures the CPI over the entire duration of the program, including
those periods when the machine is halted while waiting for I/O.
The distinction between these two CPI ratios is important for processors that
support Hyper-Threading Technology. Non-halted CPI should use the
“Non-halted clockticks” performance metric as the numerator. Nominal
CPI can use “Non-sleep clockticks” in the numerator. “Non-sleep
clockticks” is the same as the “clockticks” metric in previous editions of
this manual.
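As an illustration only, the two ratios defined above can be computed from
raw counts as in the following C sketch. The cpi_sample structure and the
function names are hypothetical and are not part of any Intel tool or API.
The sketch assumes the counts were already collected over the same
measurement interval.

```c
#include <stdint.h>

/* Hypothetical container for raw counts gathered over one interval. */
typedef struct {
    uint64_t non_halted_clockticks; /* read from a performance counter   */
    uint64_t timestamp_ticks;       /* difference of two RDTSC readings  */
    uint64_t instructions_retired;  /* read from a performance counter   */
} cpi_sample;

/* Non-halted CPI: clockticks per instruction while the CPU was in use. */
double non_halted_cpi(const cpi_sample *s)
{
    if (s->instructions_retired == 0)
        return 0.0; /* no instructions retired: ratio is undefined, report 0 */
    return (double)s->non_halted_clockticks / (double)s->instructions_retired;
}

/* Nominal CPI: clockticks per instruction over the whole interval,
 * including periods when the machine was halted waiting for I/O. */
double nominal_cpi(const cpi_sample *s)
{
    if (s->instructions_retired == 0)
        return 0.0; /* same convention as above */
    return (double)s->timestamp_ticks / (double)s->instructions_retired;
}
```

For an I/O-bound program the nominal CPI is expected to be noticeably
larger than the non-halted CPI, because the time-stamp counter keeps
advancing while the processor is halted.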
Non-Halted Clockticks
Non-halted clockticks can be obtained by programming the appropriate
ESCR and CCCR following the recipe listed in the general metrics
category in Table B-1. Additionally, the desired
T0_OS/T0_USR/T1_OS/T1_USR bits may be specified to restrict counting
to a specific logical processor and/or to kernel or user mode.
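The following C sketch shows one way a tool might compose the
T0_OS/T0_USR/T1_OS/T1_USR qualifier bits into an ESCR value. The bit
positions used here (bits 0 through 3) are an assumption and should be
verified against the ESCR layout in the IA-32 Software Developer's Manual.
The function name escr_qualify is hypothetical, and the sketch does not
write the MSR or set the event select and event mask fields that the
recipe in Table B-1 requires.

```c
#include <stdint.h>

/* Assumed ESCR qualifier bit positions; verify against the SDM. */
#define ESCR_T1_USR (1u << 0) /* logical processor 1, user mode   */
#define ESCR_T1_OS  (1u << 1) /* logical processor 1, kernel mode */
#define ESCR_T0_USR (1u << 2) /* logical processor 0, user mode   */
#define ESCR_T0_OS  (1u << 3) /* logical processor 0, kernel mode */

/* Return escr_base with the four qualifier bits replaced so that events
 * are counted only for logical processor lp (0 or 1), in kernel mode
 * (count_os != 0) and/or user mode (count_usr != 0). */
uint32_t escr_qualify(uint32_t escr_base, int lp, int count_os, int count_usr)
{
    uint32_t os_bit  = (lp == 0) ? ESCR_T0_OS  : ESCR_T1_OS;
    uint32_t usr_bit = (lp == 0) ? ESCR_T0_USR : ESCR_T1_USR;
    uint32_t value   = escr_base;

    /* Clear any qualifier bits already present in the base value. */
    value &= ~(ESCR_T0_OS | ESCR_T0_USR | ESCR_T1_OS | ESCR_T1_USR);

    if (count_os)
        value |= os_bit;
    if (count_usr)
        value |= usr_bit;
    return value;
}
```

For example, escr_qualify(base, 0, 1, 1) selects both kernel and user
mode on logical processor 0, while escr_qualify(base, 1, 0, 1) selects
user mode only on logical processor 1.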
Source: IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006.