Releases: intel/PerfSpect
v3.12.0
v3.12.0 is a feature and maintenance release
What's New
- Added network interface coalesce settings to report via ethtool -c by @Copilot in #518
- Enhanced NIC table field descriptions for better clarity in report by @harp-intel in #522
- Added Card/Port column to NIC table for physical card mapping in report by @Copilot in #524
- Added virtual function detection and annotation to NIC table in report by @Copilot in #525
- Added recognition of Diamond Rapids (DMR) CPU, refactor to handle multiple Intel families by @harp-intel in #526
- Updated processwatch to the latest version and adjusted instruction mix telemetry to show all instruction categories by @harp-intel in #538
- Introduced PDU telemetry as a hidden telemetry option, enabled if PERFSPECT_PDU_HOST, PERFSPECT_PDU_USER, PERFSPECT_PDU_PASSWORD, and PERFSPECT_PDU_OUTLET environment variables are set by @harp-intel in #538
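The environment-variable gating described for PDU telemetry can be illustrated with a minimal Python sketch. This is not PerfSpect's implementation: the helper name `pdu_telemetry_enabled` is hypothetical, and treating an empty value as "not set" is an assumption. Only the four variable names come from the release note.

```python
import os

# Variable names are taken from the release note; everything else is illustrative.
PDU_ENV_VARS = (
    "PERFSPECT_PDU_HOST",
    "PERFSPECT_PDU_USER",
    "PERFSPECT_PDU_PASSWORD",
    "PERFSPECT_PDU_OUTLET",
)


def pdu_telemetry_enabled(env=None):
    """Return True only if every PDU variable has a non-empty value.

    Hypothetical helper; treating empty strings as unset is an assumption.
    """
    env = os.environ if env is None else env
    return all(env.get(name) for name in PDU_ENV_VARS)
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to exercise without touching the real environment.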
What's Changed
- Gaudi telemetry is now optional; it is enabled if the PERFSPECT_GAUDI_HLSMI_PATH environment variable is set by @harp-intel in #538
- Instruction Mix telemetry category filter feature removed by @harp-intel in #538
What's Fixed
- fix: kernel utilization metrics on EC2 AL2023 w/ 6.1 kernel by @harp-intel in #515
- fix: make component loader event group formation deterministic for post-processing by @harp-intel in #517
- fix parsing of dmesg line to retrieve # of ARM counters by @harp-intel in #529
- fix frequency benchmark on some ICX systems by @harp-intel in #532
- fix lscpu parsing for older versions of lscpu by @harp-intel in #533
- fix: handle empty model names in NIC summary output by @harp-intel in #539
- fix: don't check for PMUs in use if noroot flag given by @harp-intel in #541
- fix: ignore metrics that use ref-cycles when ref-cycles not supported by @harp-intel in #542
- fix: pad core frequencies to length of frequency buckets by @harp-intel in #548
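The ref-cycles fix (#542) in the list above amounts to dropping metrics whose formulas depend on an event the platform does not support. A rough Python sketch of that idea follows. The bracketed `[event]` formula notation, the metric names, and the function name `usable_metrics` are all assumptions made for illustration, not PerfSpect's actual data model.

```python
import re


def usable_metrics(metrics, supported_events):
    """Keep only metrics whose formulas reference supported events.

    metrics: dict mapping metric name -> formula string, where events are
             written as [event-name] (hypothetical notation).
    supported_events: set of event names available on the target.
    """
    kept = {}
    for name, formula in metrics.items():
        events = set(re.findall(r"\[([^\]]+)\]", formula))
        # A metric is usable only if every referenced event is supported.
        if events <= supported_events:
            kept[name] = formula
    return kept
```

With `supported_events` lacking `ref-cycles`, any metric whose formula mentions `[ref-cycles]` is silently skipped rather than causing an error later in collection.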
Full Changelog: v3.11.0...v3.12.0
v3.11.0
v3.11.0 is a feature and maintenance release
What's New
- add support for reporting metrics on GCP's Axion systems and AWS's Graviton systems
- add support for reporting metrics on EC2 m8a (Turin)
- add support for field descriptions in both HTML and Excel reports. Field descriptions are displayed as tooltips in HTML tables and as cell comments in Excel exports. Descriptions added for cache sizes and CPU frequencies.
- improve cache sizes reporting. L1 and L2 are reported per Core and L3 is reported per Instance and System Total
- improve metrics HTML report by showing all TMA metrics on TMAM tab
What's Fixed
- fix regression where power and c6 residency were not reported in metrics
- fix telemetry power stats not being reported on some systems due to turbostat output formatting differences
- fix error in kernel utilization percentage metric formula that resulted in an elevated value for the metric
Full Changelog: v3.10.0...v3.11.0
v3.10.0
v3.10.0 is a feature and maintenance release
What's New
- the 'All Metrics' tab in the metrics command's HTML report now includes definitions for every metric, highlights metrics that exceed a threshold, and provides context and/or a tip when a metric is highlighted.
- the report command's Gaudi table now includes the Gaudi microarchitecture (Gaudi 1/2/3)
- The metrics command now produces a system-level HTML summary report when data is collected with granularity set to socket or cpu and when scope is set to cgroup. This is in addition to the HTML summary report already produced at system granularity.
What's Fixed
- report command's JSON output format now presents an empty data set as empty list '[]' instead of a record with empty values
- metrics command fixed on RHEL-9
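The JSON change above matters to scripts that consume the report output. The sketch below contrasts the two shapes in Python. The function name `render_table`, the `legacy` switch, and the field names are hypothetical, and the exact layout of the old "record with empty values" is an assumption based on the wording of the release note.

```python
import json


def render_table(field_names, rows, legacy=False):
    """Serialize table rows to JSON.

    With legacy=True an empty table is emitted as a single record whose
    fields are all empty strings (assumed shape of the old behavior).
    Otherwise an empty table is emitted as an empty list.
    """
    if not rows and legacy:
        rows = [{name: "" for name in field_names}]
    return json.dumps(rows)
```

A consumer can then test for "no data" with `if not json.loads(text):` instead of inspecting each field for empty strings.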
Full Changelog: v3.9.1...v3.10.0
v3.9.1
v3.9.1 is a maintenance release, bug fixes only.
Issues Addressed:
#460 - some telemetry categories not reported if system is configured for 12 hour time format
#463 - perf: Argument list too long
#466 - metrics with --cpus option sometimes errors
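Issue #460 suggests that timestamps were being parsed with a 24-hour layout only. The release note does not say which tool's output was affected, so the following is only a sketch of the general remedy: try both layouts. The function name `parse_clock` and the two format strings are assumptions, and `%p` relies on the locale's AM/PM strings (English in the default C locale).

```python
from datetime import datetime

# Accept both 24-hour ("13:05:09") and 12-hour ("01:05:09 PM") clock strings.
TIME_FORMATS = ("%H:%M:%S", "%I:%M:%S %p")


def parse_clock(text):
    """Parse a clock string in either layout and return a datetime.time."""
    cleaned = text.strip()
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).time()
        except ValueError:
            continue  # Not this layout; try the next one.
    raise ValueError(f"unrecognized time format: {text!r}")
```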
Full Changelog: v3.9.0...v3.9.1
v3.9.0
What's Changed
New Features, Changes, and Enhancements
- added support for collecting metrics on a specific set of cpus with the new --cpus flag
- metrics multi-unit, e.g., cgroup, summary CSV reformatted for easier parsing
- add flag for instruction mix frequency (--instrmix-frequency) in telemetry command and lower default setting to decrease default overhead
- support for cri-containerd in metrics command cgroup scope
- update memory benchmarks for more accuracy on systems with larger L3 cache
- simplify time format, add system type, and use AMD specific labels for HT (SMT) and Turbo (Boost) in brief report system summary field
- add version to system, base board, and chassis in report command's host and system-summary tables
- PerfSpect now includes additional tools used for data collection on remote ARM targets. Extends data collected by report. Enables the telemetry and flame commands. Note: the metrics command is not currently supported.
- PerfSpect can now also be built to run directly on an ARM target (remote collection no longer required).
Fixes
- updated the event groups in the metrics command for GNR when the topdown fixed-purpose counter is not available
- core temperature and frequency now shown in telemetry when no uncore access available
- report L3 size for AMD Turin correctly and report cache sizes in MB and per socket, consistently
- fix hyperthreading enabled/disabled reporting error in report when more than half of cores are off-lined
- CXL devices now correctly listed in report
- accelerator table insights in report corrected
- address race condition setting uncore frequencies in config command
- fix check for enough available storage space in report command storage benchmark
Full Changelog: v3.8.0...v3.9.0
v3.8.0
What's Changed
Version 3.8.0 is a feature and maintenance release
New Features, Changes, and Enhancements
- metrics command now supports Intel Granite Rapids processors on Google Cloud (C4 instances)
- metrics command's TMA metrics for Granite Rapids updated
- Network IRQs table format improved to avoid one long line of data by adding separators that will allow wrapping
- metrics command no longer errors and exits when the PMU is determined to be in use, warning is generated instead
- Intel Clearwater Forest now recognized and identified by report and config commands
- Intel Granite Rapids D now recognized and identified by report and config commands
- Intel Arrow Lake CPUs now recognized and identified by report command
- AWS Graviton 4, ARM Neoverse-V2 CPUs now recognized and identified by report command
Fixes
- power and temperature benchmarks in report command now work on additional architectures by fixing turbostat output parsing
- NIC table fixed in report command
- race condition in config command fixed when setting multiple configuration options at the same time
- memory benchmark in report command fixed when output format changed in newer MLC release
- frequency benchmark in report command fixed when number of cores per die differs per die
Full Changelog: v3.7.0...v3.8.0
v3.7.0
What's Changed
Version 3.7.0 is a feature and maintenance release.
To install, download and extract the pre-built package (perfspect.tgz) from the Assets listed below.
New Features and Enhancements
- the metrics HTML report now supports comparing two sets of metrics
- metrics command can optionally expose a Prometheus compatible metrics endpoint using --prometheus-server and --prometheus-server-addr
- flame command can now target multiple PIDs using --pids
- flame command can now control the depth of the call stack using --max-depth
- eliminated the requirement to have Perl installed on the target for the flame command
- config command can now enable/disable c6 and c1-demotion
- config command can now configure LLC size on SRF and GNR
- config command can now enable/disable LLC prefetcher on SRF
- telemetry command now reports CPU temperature, IPC and C6 residency
- report command now includes vendor and model ID in the NIC table
- logs can now be directed to stdout using --log-stdout; useful when combined with the metrics prometheus server feature
- metrics command "PMU in use" error and exit changed to a warning
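For readers who want to consume the Prometheus-compatible endpoint mentioned above without a full Prometheus server, the text exposition format is simple enough to parse by hand. The sketch below is a deliberately minimal Python parser. The function name `parse_exposition` is hypothetical, the sample metric names in the usage note are invented, and the parser assumes samples carry no trailing timestamp.

```python
def parse_exposition(text):
    """Parse Prometheus text-format samples into {series: value}.

    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    Assumes each sample line is "<series> <value>" with no timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated token on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples
```

For example, feeding it the two lines `cpu_utilization{socket="0"} 12.5` and `ipc 1.8` (invented names) yields a dictionary keyed by `cpu_utilization{socket="0"}` and `ipc`.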
Fixes
- address problems found with collecting metrics for cgroups
- fix memory benchmark chart X-axis label from MB/s to GB/s
- fix index out of range error in renderXlsxTableMultiTarget
- fix determination of availability of fixed counters
Full Changelog: v3.6.1...v3.7.0
v3.6.1
Version 3.6.1 fixes a bug found in 3.6.0 when parsing non-padded HEX values for CPU frequencies.
To install, download and extract the pre-built package (perfspect.tgz) from the Assets listed below.
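The class of bug fixed in 3.6.1 is easy to reproduce. The release note does not describe the encoding, so the sketch below assumes, purely for illustration, a hex string that packs one value per byte. Slicing the string into two-character pairs breaks when leading zeros are missing ("a1b" splits into "a1" and "b"), whereas converting to an integer first and extracting bytes by shifting does not depend on padding. The function name `bytes_from_hex` is hypothetical.

```python
def bytes_from_hex(text, count):
    """Return `count` byte values from a hex string, least significant first.

    Works for padded ("0a1b") and non-padded ("a1b") input alike, because
    the string is converted to an integer before any byte is extracted.
    """
    value = int(text, 16)  # base 16 also accepts an optional "0x" prefix
    return [(value >> (8 * i)) & 0xFF for i in range(count)]
```

Both `bytes_from_hex("a1b", 2)` and `bytes_from_hex("0a1b", 2)` produce the same two bytes, 0x1b followed by 0x0a.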
Full Changelog: v3.6.0...v3.6.1
v3.6.0
Version 3.6.0 is a feature and maintenance release.
To install, download and extract the pre-built package (perfspect.tgz) from the Assets listed below.
New Features & Enhancements
- The CPU frequency table from the report command now includes frequencies for SSE, AVX2, AVX512, and AMX, when supported by architecture
- Flamegraphs can now be limited to a specific process (PID)
- Prefetchers can be enabled/disabled with the config command
- A brief system configuration summary table has been added to the metrics, flame, lock, and telemetry reports
- Added preliminary support for the Intel Clearwater Forest CPU architecture
- The lock command can now retrieve a binary perf package that can be used for analysis off the target
- Added support for metrics, including per-transaction metrics, on EC2 m7a (AMD Genoa) and AMD Turin
Fixes
- The config command can now set the max core frequency on SRF and GNR
- The targets.yaml file no longer requires a value for the target name field
Breaking Changes
- Some flags for the config command have been renamed for consistency and readability. See perfspect config -h.
Full Changelog: v3.5.0...v3.6.0
v3.5.2
v3.5.2 is a bug-fix release (Note: v3.5.1 was a bad build/release and has since been deleted)
Two issues were found in 3.5.0 and are now fixed in 3.5.2.
- perfspect will exit with a panic when an incorrect command line argument is presented
- perfspect will exit with an error when falsely identifying the temp directory as being located on a file system mounted with 'noexec'
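A 'noexec' check of the kind mentioned in the second issue has to pick the correct mount entry for a directory, and careless prefix matching is one way such a check can go wrong. The release note does not say what caused the false identification, so the following Python sketch only shows one careful approach. It parses text in the /proc/mounts layout and selects the longest mount point that contains the path on a whole-component basis. The function names are hypothetical, and escaped characters in mount points (such as \040 for a space) are not handled.

```python
import posixpath


def mount_options_for(path, mounts_text):
    """Return the mount options of the file system containing `path`.

    mounts_text uses the /proc/mounts layout:
        device mountpoint fstype options dump pass
    """
    path = posixpath.normpath(path)
    best_point, best_opts = "", []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        point, opts = fields[1], fields[3].split(",")
        # Match whole path components so /tmp does not match /tmpfiles.
        inside = path == point or path.startswith(point.rstrip("/") + "/")
        # Prefer the longest mount point; later entries win ties.
        if inside and len(point) >= len(best_point):
            best_point, best_opts = point, opts
    return best_opts


def is_noexec(path, mounts_text):
    """Return True if `path` lives on a file system mounted with noexec."""
    return "noexec" in mount_options_for(path, mounts_text)
```

On a Linux host the second argument would typically be the contents of /proc/mounts; passing it in as a string keeps the logic testable on any system.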
Full Changelog: v3.5.0...v3.5.2