Releases: intel/PerfSpect
v3.5.1
v3.5.1 is a bug-fix release
Two issues were found in 3.5.0 and are now fixed in 3.5.1.
- perfspect will exit with a panic when an incorrect command line argument is presented
- perfspect will exit with an error when falsely identifying the temp directory as being located on a file system mounted with 'noexec'
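The release notes do not say what caused the false 'noexec' detection. As a general illustration only (not PerfSpect's actual implementation), the sketch below shows one way to decide whether a directory sits on a 'noexec' mount from text in `/proc/mounts` format. The function names `mount_options_for` and `is_noexec` are hypothetical. The point it illustrates is that matching should be done on whole path components and should pick the longest matching mount point; a plain string-prefix check would wrongly treat `/tmpfiles` as being under `/tmp`.

```python
import os


def mount_options_for(path, mounts_text):
    """Return the mount options of the filesystem that contains `path`.

    `mounts_text` uses the /proc/mounts layout:
    device mountpoint fstype options dump pass
    (Escapes such as \\040 for spaces in mount points are not handled here.)
    """
    path = os.path.normpath(path)
    best_point, best_opts = None, set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = os.path.normpath(fields[1])
        # Compare whole path components so "/tmp" does not match "/tmpfiles".
        if mount_point == "/":
            is_parent = True
        else:
            is_parent = path == mount_point or path.startswith(mount_point + "/")
        # Prefer the longest mount point; on ties, the later entry wins
        # because a later mount shadows an earlier one at the same location.
        if is_parent and (best_point is None or len(mount_point) >= len(best_point)):
            best_point, best_opts = mount_point, set(fields[3].split(","))
    return best_opts


def is_noexec(path, mounts_text):
    """True if the filesystem containing `path` is mounted with 'noexec'."""
    return "noexec" in mount_options_for(path, mounts_text)
```

On a Linux host, the contents of `/proc/mounts` could be read and passed as `mounts_text`; if the check reports 'noexec' for the default temp directory, the `--tempdir` flag described under v3.5.0 is the documented override.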
Full Changelog: v3.5.0...v3.5.1
v3.5.0
Version 3.5.0 is a feature and maintenance release with the following additions/fixes.
Breaking Change
- The --targettemp flag has been removed. Use the --tempdir flag to override the directory where collection scripts are executed.
New Features & Enhancements
- The --txnrate flag used with the metrics command now augments the metrics list with transaction-oriented metrics rather than replacing existing metrics.
- The --syslog flag redirects log output to the local syslog daemon. This is useful when running PerfSpect for long durations and/or running as a CRON job.
- Improved shutdown when PerfSpect receives SIGINT (ctrl-c).
- Added GNR prefetcher settings to report.
- Added clustering mode (SNC, UMA) for GNR and SRF to report.
- Added CPU frequency chart to telemetry report.
- Added table of network-related kernel parameters to report.
- Added TME (total memory encryption) on/off to report.
- Added TMA level 1 over time chart to the metrics HTML report.
- Added configured DIMM speed and DIMM rank to DIMM table.
Fixes
- Addressed incorrect measured CPU frequency chart on GNR when SNC is disabled.
- Addressed missing NIC information in report.
- Addressed error when /tmp is on a file system mounted with 'noexec' (use --tempdir to override).
- Addressed incorrect memory channels listed for SRF-AP.
Full Changelog: v3.4.0...v3.5.0
v3.4.0
Version 3.4.0 is a feature and maintenance release with the following additions/fixes.
New Features & Enhancements
- Gaudi device stats are now included in the 'telemetry' command report.
- 'metrics' command event data can now be re-processed so that a previously unknown transaction rate (--txnrate) can be applied.
- The 'telemetry' command now accepts a duration value of zero (--duration 0) to run until interrupted by SIGINT (ctrl-c).
- The 'telemetry' command HTML report now includes time stamps on the x-axis of charts.
- The 'config' command now allows setting the compute and I/O die frequencies independently (SRF and GNR).
- The branch misprediction metric was added to the 'metrics' report.
- The 'report' command now includes the Speed Select Technology frequency table when it is enabled.
- Added insight entry to the 'report' command to warn when ELC is configured in latency-optimized mode and EPB is non-zero.
- The 'report' and 'config' commands now determine which EPB configuration value (OS or BIOS) is active and report and/or change the appropriate entry.
- 'report' command tables that are not relevant to a given CPU architecture are no longer included in the output.
Fixes
- L3 per core reported by the 'report' command was inaccurate on some CPU architectures.
- On multi-socket systems where a socket has been disabled via BIOS, the microarchitecture may be reported incorrectly.
What's Changed
- enable post-processing of pre-collected metric events by @harp-intel in #192
- enable indefinite duration for telemetry collection by @harp-intel in #203
- show timestamps in metrics summary and telemetry charts by @harp-intel in #205
- refactor html report generation to reduce duplication by @harp-intel in #206
- add branch mispredict ratio metric by @harp-intel in #207
- use remote target's perf for metrics collection if it is installed and new enough by @harp-intel in #208
- Highlight notes, tips, and warnings in README by @harp-intel in #209
- Bump github.com/spf13/cobra from 1.8.1 to 1.9.1 by @dependabot in #210
- add speed select turbo frequency tables by @harp-intel in #211
- fix report for l3 size per core when l3 instances are used by multipl… by @harp-intel in #213
- Get and set compute and I/O die max/min frequencies independently by @harp-intel in #216
- add example output images to README by @harp-intel in #218
- refactor scripts to use templating by @harp-intel in #222
- use alternate EPB value when configured to do so by @harp-intel in #223
- report tables associated with CPU models by @harp-intel in #225
- fix GNR_X* microarchitecture detection by @harp-intel in #227
- collect and report Gaudi device telemetry by @harp-intel in #217
Full Changelog: v3.3.1...v3.4.0
v3.3.1
Maintenance/Bug Fix Release
What's Changed
- Bump golang.org/x/term from 0.28.0 to 0.29.0 by @dependabot in #193
- Bump golang.org/x/text from 0.21.0 to 0.22.0 by @dependabot in #194
- fix regression in metrics command causing error on some AWS m5 and m6 instance types by @harp-intel in #196
- address race condition in metrics event processing by @harp-intel in #198
- process last set of events consistently by @harp-intel in #199
- send sigkill to child processes when receive sigint by @harp-intel in #201
Full Changelog: v3.3.0...v3.3.1
v3.3.0
Features/Enhancements:
- add instruction mix reporting to telemetry command by @harp-intel in #179
- add storage performance benchmark to 'report' command by @harp-intel in #161
- add 2nd level TMA metrics to chart by @harp-intel in #165
- add metadata tab to metrics summary HTML report by @harp-intel in #171
Maintenance/Bug Fixes:
- accept metric list without including "metric_" prefix by @harp-intel in #155
- add support for customized "no data found" message for any table by @harp-intel in #156
- recognize L3 cache size in GiB by @harp-intel in #158
- add cpu def for Turin zen 5c by @harp-intel in #160
- add benchmark descriptions by @harp-intel in #163
- add pid to ssh control master file name so that it doesn't get reused… by @harp-intel in #169
- if instructions event cannot be collected then assume target is not s… by @harp-intel in #170
- remove parse error from report, add to log by @harp-intel in #175
- provide sudo password to script when needed by @harp-intel in #177
- change flame and lock help to show 'all' is default format option by @harp-intel in #184
- fix bug where higher granularity metrics are not properly printed by @harp-intel in #182
- don't use bc in script as it is not available, by default, on some Linux OS distributions by @harp-intel in #189
- Bump github.com/spf13/pflag from 1.0.5 to 1.0.6 by @dependabot in #190
- fix metrics socket and cpu granularity metric calculations by @harp-intel in #191
Full Changelog: v3.2.0...v3.3.0
v3.2.0
What's Changed
Features/Enhancements:
- support for GCP C4 instances by @harp-intel in #134
- add AMD Turin CPU identifier by @harp-intel in #127
- make sure PMUs are not in use when running the metrics command by @harp-intel in #144
- Enable interruption of PerfSpect with SIGINT (ctrl-c) when collecting data over SSH by @harp-intel in #145
- limit config flags to specific uarchs by @harp-intel in #149
Maintenance/Bug Fixes:
- consider pkg control value when getting and setting EPP by @harp-intel in #142
- clean up benchmark summary table when no min latency collected by @harp-intel in #129
- perfspect report flag 'all' is true by default by @harp-intel in #128
- assume events not supported on failure when loading metadata for metrics command by @harp-intel in #139
- Bump golang.org/x/term from 0.27.0 to 0.28.0 by @dependabot in #140
- fix metrics HTML title field by @harp-intel in #147
- filter out infinite values in metrics summary by @harp-intel in #152
- Update README.md by @HarpPDX in #136
Full Changelog: v3.1.0...v3.2.0
v3.1.0
Version 3.1.0 is a feature and maintenance release with the following additions/fixes.
New Features
- The 'lock' command was added to support the analysis of lock contention on high core count servers.
- L3 cache size reported in system summary table of perfspect 'report'
Fixes
- Add timeout to perfspect 'report' data collection commands that run too long
- L3 cache size was reported as total from system. Changed to cache size per processor.
- The 'metrics' command was occasionally erroneously detecting the lack of support for some events.
What's Changed
- Bump github.com/deckarep/golang-set/v2 from 2.6.0 to 2.7.0 by @dependabot in #108
- don't prepend sudo to command if user is superuser by @harp-intel in #110
- Bump golang.org/x/term from 0.26.0 to 0.27.0 by @dependabot in #111
- Bump golang.org/x/text from 0.20.0 to 0.21.0 by @dependabot in #112
- build perf with support for bpf by @harp-intel in #117
- Add support for kernel lock analysis by @TianyouLi in #114
- timeout on commands by @harp-intel in #115
- report l3 size per socket by @harp-intel in #120
- Bump golang.org/x/crypto from 0.28.0 to 0.31.0 by @dependabot in #121
New Contributors
- @TianyouLi made their first contribution in #114
Full Changelog: v3.0.1...v3.1.0
v3.0.1
Version 3.0.1 is a maintenance release that addresses a few issues/bugs:
- improve error handling and messaging when a target system does not support metrics collection
- stop metrics collection and produce output files when SIGINT is sent only to the perfspect process
- abbreviate uncore event names to shorten the perf stat command line arguments on target systems with a high core/uncore device count in order to fit within the bash argument size limits
- re-enable NMI watchdog after metrics collection on remote targets
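To illustrate why shorter event names help on systems with many core/uncore devices, here is a minimal sketch that estimates how many bytes an argument list occupies and compares that against a size limit. It is not taken from PerfSpect; the helper names (`argv_bytes`, `fits`, `system_arg_limit`) and the fallback value are assumptions made for this example. The real limit enforced by the kernel and shell also accounts for the environment and per-argument caps, so treat the result as a rough estimate.

```python
import os


def argv_bytes(args):
    """Bytes taken by the argument strings; each one carries a trailing NUL."""
    return sum(len(a.encode("utf-8")) + 1 for a in args)


def fits(args, limit):
    """True if the argument strings fit within `limit` bytes."""
    return argv_bytes(args) <= limit


def system_arg_limit(default=131072):
    """Best-effort lookup of the system argument-size limit.

    Falls back to `default` (an assumed value) where SC_ARG_MAX
    cannot be queried.
    """
    try:
        return os.sysconf("SC_ARG_MAX")
    except (ValueError, OSError, AttributeError):
        return default
```

For example, `fits(["perf", "stat", "-e", event_list], system_arg_limit())` gives a quick check before launching a command, where `event_list` stands for whatever comma-separated event string would be passed.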
What's Changed
- fix build badge in readme by @harp-intel in #84
- pin build process dependencies by @harp-intel in #86
- address issues raised by golangci-lint by @harp-intel in #88
- add no root requirements for metrics to readme by @harp-intel in #91
- Bump golang.org/x/term from 0.25.0 to 0.26.0 by @dependabot in #90
- Bump golang.org/x/text from 0.19.0 to 0.20.0 by @dependabot in #89
- document --noroot option for metrics by @harp-intel in #92
- Improve error handling when cannot connect to target(s) and when target doesn't meet requirements for metrics command by @harp-intel in #95
- send SIGINT and SIGTERM to children when received by @harp-intel in #97
- show flame and telemetry duration in status by @harp-intel in #98
- build tarball without version in name by @harp-intel in #100
- pin systat version by @harp-intel in #106
- abbreviate uncore event names to shorten the length of the arguments on the perf command line by @harp-intel in #102
- re-enable NMI watchdog when disabled for remote targets by @harp-intel in #104
Full Changelog: v3.0.0...v3.0.1
v3.0.0
PerfSpect 3.0 is a new design with many new features. Please see the README for a description of the features and for hints that will help with migration from earlier releases of PerfSpect.
Summary of changes:
- PerfSpect 3.0 produces user-friendly metrics in a single invocation. The post-processing step required in earlier releases has been eliminated.
- PerfSpect 3.0 optionally produces "live" metrics.
- PerfSpect 3.0 now includes features previously only available in Intel System Health Inspector, AKA svr-info, including system configuration reports, benchmarks, telemetry, and flamegraphs.
- PerfSpect 3.0 can operate on the local host and/or 1 or more remote hosts.
Please see the README for more details. And, for a complete list of available features, explore the application's help system, e.g., perfspect metrics -h.
Please post Questions, Bug Reports, and Feature Requests using GitHub Issues, here: https://github.com/intel/PerfSpect/issues.
v1.5.0
This release adds basic metrics support for Intel "Granite Rapids" processors. Full support, including TMA metrics, will come in a follow-up release.