About AMD CodeAnalyst Performance Analyzer
The AMD CodeAnalyst Performance
Analyzer is a suite of powerful tools that analyzes software performance on AMD
microprocessors. These tools are designed to support Microsoft® Windows XP®,
Windows 2003 and Vista® distributions on x86 and AMD64 architectures. Although
most users will choose the graphical user interface, the profiler is also
offered as a command-line utility for use in batch files.
- System-Wide Profiling : CodeAnalyst is designed to
profile the performance of binary modules, including user mode application
modules and kernel mode driver modules. Timer-Based Profiling and
Event-Based Profiling collect data from multiple processors in a
multi-processor system.
- Timer-Based Profiling (TBP)
- The application to be optimized is run at full speed
on the system that is running CodeAnalyst. EIP (instruction pointer)
samples are collected at predetermined intervals and can be used to
identify possible bottlenecks, execution penalties, or optimization
opportunities.
- On APIC-enabled systems, the finest time resolution is
0.1 ms; on non-APIC systems, it is 1.0 ms.
- Event-Based Profiling (EBP) : CodeAnalyst EBP is designed
to profile hardware performance events on AMD Athlon™,
AMD Athlon™ XP, AMD Opteron™,
AMD Athlon™ 64, and AMD "Barcelona" (AMD Family 10h) processors. Using
event multiplexing, CodeAnalyst EBP is able
to profile more than 4 events simultaneously.
- Instruction-Based Sampling (IBS) : Instruction-Based Sampling is a new performance
measurement technique supported by AMD "Barcelona" (AMD Family 10h) processors.
IBS has these advantages:
- IBS precisely associates hardware event information
with the instructions that cause the events. A data cache miss, for
example, is associated with the AMD64 instruction performing the memory
read or write operation that caused the miss.
- IBS collects a wide range of hardware event
information in a single measurement run.
- IBS collects new information such as retire delay and
data cache miss latency.
- Call Stack Sampling (CSS) : Combined with TBP or EBP, Call Stack Sampling
collects data on the caller-callee relationships at
the hotspots.
- Pipeline Simulation : Used during the second stage of an optimization effort
to find the causes of bottlenecks. During simulation, application execution
is first traced, and then simulated on a selected target processor. The
detailed data on the execution of each instruction takes into account the
previous instructions executed and the state of the processor caches.
Simulation only supports single processor execution.
Pipeline Simulation supports the simulation of 32-bit code on:
- AMD Athlon™ XP processor
- AMD Opteron™ processor
- AMD Athlon™ 64 processor
Pipeline Simulation also supports the simulation of 64-bit code on:
- AMD Opteron™ processor
- AMD Athlon™ 64 processor
- Thread Profile : CodeAnalyst thread profiling views show the thread
chart and non-local memory access.
- Post Process : CodeAnalyst shows the sample distribution even without module
debug information.
- Interpret performance measurements rather than display
raw performance data
- Flexible view configuration and management
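
The event multiplexing mentioned under Event-Based Profiling can be illustrated with a small sketch. It is not CodeAnalyst's implementation, which is not described here. It assumes the common approach in which events are rotated through a limited set of hardware counters (4 is assumed, based on the "more than 4 events" wording) and each raw count is scaled by the fraction of intervals in which the event was actually counted. The event names, `EventReading`, `rotate_groups`, and `scale_counts` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EventReading:
    """One multiplexed event (all names here are hypothetical)."""
    name: str
    raw_count: int         # occurrences counted while the event held a hardware counter
    intervals_active: int  # number of sampling intervals in which it was counted


def rotate_groups(events: List[str], num_counters: int = 4) -> List[List[str]]:
    """Split the requested events into groups that each fit the available counters."""
    if num_counters < 1:
        raise ValueError("num_counters must be at least 1")
    return [events[i:i + num_counters] for i in range(0, len(events), num_counters)]


def scale_counts(readings: List[EventReading],
                 total_intervals: int) -> Dict[str, Optional[float]]:
    """Estimate whole-run counts from the fraction of intervals each event was live."""
    estimates: Dict[str, Optional[float]] = {}
    for r in readings:
        if r.intervals_active == 0:
            estimates[r.name] = None  # never measured, so no estimate is possible
        else:
            estimates[r.name] = r.raw_count * total_intervals / r.intervals_active
    return estimates


# Six hypothetical events on an assumed 4-counter processor need two groups.
events = ["cycles", "retired_instr", "dc_access", "dc_miss", "ic_miss", "branch_mispredict"]
groups = rotate_groups(events, num_counters=4)

# Assume the two groups alternated evenly over 10 intervals (5 each).
readings = [EventReading("dc_miss", 1200, 5), EventReading("ic_miss", 300, 5)]
estimates = scale_counts(readings, total_intervals=10)
```

Scaled values of this kind are estimates: they are reasonable when event rates stay roughly uniform over the run and less so when the workload changes phase between rotations.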