fix(dotnet): stop duplicating failures on failing test runs (#2501) #2502

Merged
aeppling merged 2 commits into rtk-ai:develop from Tailoo:fix/2501-dotnet-test-failure-dedup
Jun 22, 2026

@Tailoo Tailoo commented Jun 18, 2026

Contributor

Summary

Fixes #2501. On failing dotnet test runs, the orchestrator prepended the full raw stdout ahead of the filtered summary. The Failed Tests: section already reproduces each failure (name + message + clipped stack) parsed from TRX/console, so every failure was printed twice, inflating output by 65% versus raw, with the overhead scaling linearly with failure count.

Fix

Gate the raw-stdout prepend behind a new test_needs_raw_fallback:

  • Skip the prepend when the structured Failed Tests: section already carries detail → dedup.
  • Keep it when the filter is blind: no failures parsed (build failure / crash where nothing reaches failed_tests), or a parsed failure has empty detail (e.g. a self-closing <UnitTestResult> with no <ErrorInfo>). Nothing is lost.

build / restore / passthrough keep prior behavior (needs_raw_fallback = true).
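
The gate described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the merged code: the `TestSummary` and `FailedTest` types and the `details` field are hypothetical stand-ins, since the real RTK structs are not shown in this PR, and the sketch covers only the two "blind" cases listed here.

```rust
// Hypothetical types for illustration; the real RTK structs may differ.
#[derive(Debug, Default)]
pub struct FailedTest {
    pub name: String,
    /// Message and clipped stack lines; empty when nothing was parsed.
    pub details: Vec<String>,
}

#[derive(Debug, Default)]
pub struct TestSummary {
    pub failed_tests: Vec<FailedTest>,
}

/// Returns true when the raw stdout should still be prepended.
pub fn test_needs_raw_fallback(summary: &TestSummary) -> bool {
    // Blind case 1: nothing reached failed_tests (build failure, crash).
    summary.failed_tests.is_empty()
        // Blind case 2: a parsed failure has no usable detail lines.
        || summary
            .failed_tests
            .iter()
            .any(|t| t.details.iter().all(|line| line.trim().is_empty()))
}
```

With this shape, a summary whose every failure carries at least one non-blank detail line returns `false`, so the prepend is skipped and the failure is shown once.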

Why this shape

The bug lived in run_dotnet_with_binlog — the one dotnet layer with no test coverage (it spawns real dotnet). Consistent with the existing dotnet test pattern, the decision is extracted into two pure, unit-tested fns rather than adding a net-new orchestration/e2e harness (no dotnet filter has one):

  • test_needs_raw_fallback(&summary) -> bool
  • compose_failure_output(success, needs_raw_fallback, stdout, stderr, filtered) -> String
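
For orientation, here is one way `compose_failure_output` could look given the signature above. Only the parameter list comes from this PR; the body is an assumption. In particular, the treatment of `stderr` (appended after stdout when non-empty) and the blank-line separator are guesses made for the sketch.

```rust
/// Sketch only: builds the final text for a dotnet run.
/// Assumption: stderr is appended after stdout when the raw fallback applies.
pub fn compose_failure_output(
    success: bool,
    needs_raw_fallback: bool,
    stdout: &str,
    stderr: &str,
    filtered: &str,
) -> String {
    // Successful runs, and failures already covered by the structured
    // section, get the filtered summary alone.
    if success || !needs_raw_fallback {
        return filtered.to_string();
    }

    // Otherwise prepend whatever raw output exists ahead of the summary.
    let mut out = String::new();
    for raw in [stdout, stderr] {
        let raw = raw.trim();
        if !raw.is_empty() {
            out.push_str(raw);
            out.push_str("\n\n");
        }
    }
    out.push_str(filtered);
    out
}
```

Keeping this function free of process spawning is what lets the prepend decision be unit-tested without running dotnet.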

Regression analysis (traced through code)

| Scenario | failed_tests | Behavior | Result |
| --- | --- | --- | --- |
| Normal failing test (detail parsed) | populated, detail present | drop raw prepend | deduped, detail preserved |
| Failure with empty details | populated, detail empty | keep raw fallback | raw message not lost |
| Compile/build failure, no tests ran | empty | keep raw fallback | build errors preserved |
| Crash / no TRX | empty | keep raw fallback | raw preserved |
| counts_unavailable | empty | keep raw fallback | raw preserved |

Warnings: / Errors: sections come from the build summary, independent of the prepend → still shown once.

Tests

7 new unit tests (all green), covering the dedup decision and every fallback branch. cargo fmt + cargo clippy --all-targets clean, cargo test --all passes.

🤖 Generated with Claude Code

On failing `dotnet test` runs the orchestrator prepended the full raw
stdout ahead of the filtered summary. The `Failed Tests:` section already
reproduces each failure (name + message + clipped stack) parsed from
TRX/console, so every failure was printed twice — inflating output +65%
vs raw, scaling linearly with failure count (issue rtk-ai#2501).

Gate the raw-stdout prepend behind `test_needs_raw_fallback`: skip it when
the structured section carries detail, keep it when the filter is blind
(no failures parsed, or a parsed failure has empty detail — e.g. a
self-closing <UnitTestResult> with no <ErrorInfo>, or a build failure /
crash where nothing reaches failed_tests).

Extracts two pure, unit-tested fns (`test_needs_raw_fallback`,
`compose_failure_output`) so the orchestration decision is covered without
running dotnet.

Closes rtk-ai#2501

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

CLAassistant commented Jun 18, 2026


CLA assistant check
All committers have signed the CLA.

@aeppling aeppling self-assigned this Jun 19, 2026
@aeppling

Contributor

Hey @Tailoo, thanks for addressing this issue.

There is another concurrent PR, #2511; I'm going with this one for the cleaner factoring.

Blocker before merge: test_needs_raw_fallback drops the raw fallback whenever every listed failure carries detail, but never checks the list is complete against summary.failed.

Fix: keep the raw fallback when the structured section is numerically incomplete by adding a failed_tests.len() < summary.failed clause to the condition.

test_needs_raw_fallback dropped the raw stdout prepend whenever every
parsed failure carried detail, but never checked the list was complete
against summary.failed. A run reporting 5 failures but parsing only 3
detailed blocks would silently lose the 2 unparsed failures.

Add a `failed_tests.len() < summary.failed` clause so the raw fallback is
kept when the structured section is numerically incomplete. Add a
regression test and fix an inconsistent existing fixture (failed: 5 with a
single parsed entry) that the new clause correctly flagged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
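
The completeness clause from this commit can be illustrated with a small sketch. The `failed_tests.len() < summary.failed` comparison is taken from the commit message; the `TestSummary` and `FailedTest` types and the `details` field are hypothetical and may not match the real RTK definitions.

```rust
// Hypothetical types for illustration; field names other than
// `failed` and `failed_tests` are assumptions.
#[derive(Debug, Default)]
pub struct FailedTest {
    pub name: String,
    pub details: Vec<String>,
}

#[derive(Debug, Default)]
pub struct TestSummary {
    /// Failure count reported by the run.
    pub failed: usize,
    /// Failures that were actually parsed into structured blocks.
    pub failed_tests: Vec<FailedTest>,
}

pub fn test_needs_raw_fallback(summary: &TestSummary) -> bool {
    // No failures parsed at all.
    summary.failed_tests.is_empty()
        // Fewer parsed blocks than reported failures: the list is incomplete.
        || summary.failed_tests.len() < summary.failed
        // At least one parsed failure carries no detail.
        || summary.failed_tests.iter().any(|t| t.details.is_empty())
}
```

Under these assumptions, a run reporting 5 failures with only 3 detailed blocks keeps the raw fallback, while 3 reported and 3 detailed drops it.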
@aeppling

Contributor

Hey, this looks good to me, thanks for contributing to RTK :)

@aeppling aeppling merged commit 6946bf9 into rtk-ai:develop Jun 22, 2026
11 checks passed
@aeppling aeppling mentioned this pull request Jun 22, 2026
@Tailoo Tailoo deleted the fix/2501-dotnet-test-failure-dedup branch June 22, 2026 19:05

Development

Successfully merging this pull request may close these issues.

dotnet test filter duplicates failure output on failing runs

3 participants