
test: do not run criterion benchmarks in no_block_pr #5273


Open: wants to merge 3 commits into main


Manciukic
Contributor

Changes

With this change, we disable the criterion benchmarks in the PR - Optional step. The performance tests will still catch performance regressions, so we lose little test coverage. We are not deleting the benchmarks outright because they remain useful for deep dives and for verifying changes to the parts of the codebase they cover.

Reason

These tests have been failing consistently in our CI for as long as I can remember. Although the false positive rate of each individual test is not high, we run many of them across a multitude of combinations, so at least one fails in every CI run.
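
To illustrate why a low per-test false positive rate still adds up, here is a minimal sketch of the arithmetic. It assumes the test runs fail independently of each other, which is a simplification. The function name and the rates and run counts below are hypothetical values chosen for illustration, not measurements from our CI.

```python
def prob_at_least_one_failure(false_positive_rate: float, num_runs: int) -> float:
    """Probability that at least one of `num_runs` independent runs fails spuriously.

    Assumes every run has the same false positive rate and that runs are
    independent, so P(no failure) = (1 - p) ** n.
    """
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    if num_runs < 0:
        raise ValueError("num_runs must be non-negative")
    return 1.0 - (1.0 - false_positive_rate) ** num_runs


if __name__ == "__main__":
    # Hypothetical per-run false positive rates and numbers of
    # benchmark/platform combinations; not taken from real CI data.
    for p in (0.005, 0.01, 0.02):
        for n in (50, 200, 500):
            chance = prob_at_least_one_failure(p, n)
            print(f"p={p:.3f} n={n:4d} -> P(at least one failure) = {chance:.3f}")
```

Under these assumptions, a 1% false positive rate over 200 runs already gives roughly an 87% chance that at least one run fails, which is consistent with the pattern described above.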

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

Signed-off-by: Riccardo Mancini <[email protected]>

codecov bot commented Jun 19, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.90%. Comparing base (d5f3513) to head (bfa6545).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5273      +/-   ##
==========================================
+ Coverage   82.84%   82.90%   +0.05%     
==========================================
  Files         250      250              
  Lines       26967    26967              
==========================================
+ Hits        22342    22356      +14     
+ Misses       4625     4611      -14     
Flag                  Coverage Δ
5.10-c5n.metal        83.33% <ø> (+<0.01%) ⬆️
5.10-m5n.metal        83.33% <ø> (+<0.01%) ⬆️
5.10-m6a.metal        82.54% <ø> (-0.01%) ⬇️
5.10-m6g.metal        79.16% <ø> (ø)
5.10-m6i.metal        83.33% <ø> (+<0.01%) ⬆️
5.10-m7a.metal-48xl   82.54% <ø> (?)
5.10-m7g.metal        79.16% <ø> (ø)
5.10-m7i.metal-24xl   83.29% <ø> (?)
5.10-m7i.metal-48xl   83.28% <ø> (?)
5.10-m8g.metal-24xl   79.16% <ø> (?)
5.10-m8g.metal-48xl   79.16% <ø> (?)
6.1-c5n.metal         83.38% <ø> (+<0.01%) ⬆️
6.1-m5n.metal         83.38% <ø> (+<0.01%) ⬆️
6.1-m6a.metal         82.60% <ø> (+<0.01%) ⬆️
6.1-m6g.metal         79.16% <ø> (-0.01%) ⬇️
6.1-m6i.metal         83.36% <ø> (-0.01%) ⬇️
6.1-m7a.metal-48xl    82.58% <ø> (?)
6.1-m7g.metal         79.16% <ø> (ø)
6.1-m7i.metal-24xl    83.40% <ø> (?)
6.1-m7i.metal-48xl    83.40% <ø> (?)
6.1-m8g.metal-24xl    79.16% <ø> (?)
6.1-m8g.metal-48xl    79.16% <ø> (?)


@Manciukic Manciukic marked this pull request as ready for review June 19, 2025 13:18
@Manciukic Manciukic added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label Jun 19, 2025
Contributor

@kalyazin kalyazin left a comment


Unusually green

Contributor

@roypat roypat left a comment


I still think we should just delete the Python test altogether.

Labels
Status: Awaiting review Indicates that a pull request is ready to be reviewed

3 participants