Improve benchmarking for better view into perf improvements and regressions #52
TL;DR - What are you trying to accomplish?
This PR is one step towards improving the benchmarking tools available for evaluating potential performance improvements or catching regressions.
Details - How are you making this change? What are the effects of this change?
This PR uses iai-callgrind to perform analysis on CPU-level instructions, cache hits, I/O and CPU cycle estimation. Docs can be found here. This is configured to run in CI in a way that compares incoming changes against baseline performance. This level of analysis is more reliably consistent than others (e.g. wall-clock or stack-tracing benchmarks) and should be appropriate to run as part of our CI chain to help flag any potential issues with new additions to the crate. Wall-clock benchmarking through criterion-rs and stack-trace benchmarking through pprof or samply are both still valuable and coverage will be enhanced through those tools, but they are better left for use in a local environment.
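To make the baseline comparison concrete, here is a minimal sketch of how a change in instruction counts could be classified. It is only an illustration of the idea: it does not reflect how iai-callgrind computes or reports differences, nor the CI configuration in this PR. The names `percent_change`, `classify`, `Verdict`, and the threshold parameter are all hypothetical.

```rust
/// Hypothetical outcome of comparing one benchmark against its baseline.
#[derive(Debug, PartialEq)]
enum Verdict {
    Regression,
    Improvement,
    Unchanged,
}

/// Percent change of `current` relative to `baseline`.
/// Returns `None` when the baseline is zero, since no ratio is defined.
fn percent_change(baseline: u64, current: u64) -> Option<f64> {
    if baseline == 0 {
        return None;
    }
    Some((current as f64 - baseline as f64) / baseline as f64 * 100.0)
}

/// Classify a change using a symmetric threshold in percent (an assumed
/// policy, chosen here only for illustration).
fn classify(baseline: u64, current: u64, threshold_pct: f64) -> Verdict {
    match percent_change(baseline, current) {
        Some(p) if p > threshold_pct => Verdict::Regression,
        Some(p) if p < -threshold_pct => Verdict::Improvement,
        _ => Verdict::Unchanged,
    }
}
```

Because instruction counts vary far less between runs than wall-clock timings, a check of this shape can use a tight threshold without producing many false alarms, which is the property that makes this style of benchmark a reasonable fit for CI.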