Fix cDAC x-plat dump tests: pre-zip payloads and exclude osx legs #127871

Merged: max-charlamb merged 3 commits into dotnet:main from max-charlamb:fix/cdac-xplat-dump-payload-archive on May 7, 2026

@max-charlamb max-charlamb commented May 6, 2026

Note

This PR was authored with assistance from GitHub Copilot.

Problem

The cDAC x-plat dump tests (CdacXPlatDumpTest stages) hit two distinct failure modes when running with cdacDumpTestMode=xplat:

  1. Helix SDK 2 GiB MemoryStream cap. <PayloadDirectory> (DirectoryPayload.UploadAsync) zips the source directory into a MemoryStream before uploading. MemoryStream's backing array is capped at int.MaxValue bytes (~2 GiB), so per-platform dump payloads that approach that size fail with IOException: Stream was too long.

  2. Helix host disk pressure on osx source dumps. Even with the SDK cap worked around, x-plat tests download every source platform's dump artifacts onto each host. The osx_arm64 / osx_x64 payloads are large enough that the combined working set exceeds available disk, and the affected work items abort with exit code -3 (Crash).
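
To make the first failure mode concrete, here is a small Python sketch of the size check the cap implies. It is an illustration only, not Helix SDK code: `INT32_MAX`, `directory_size`, and `exceeds_in_memory_cap` are hypothetical names, and the uncompressed directory size is used as a stand-in for the zipped size on the assumption that dumps barely compress.

```python
import os

# int.MaxValue in .NET: 2,147,483,647 bytes, just under 2 GiB.
INT32_MAX = 2**31 - 1


def directory_size(path):
    """Sum the sizes of all files under path, in bytes."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def exceeds_in_memory_cap(total_bytes, cap=INT32_MAX):
    """Return True when a payload of total_bytes cannot fit in an int-indexed buffer."""
    return total_bytes > cap
```

A payload of exactly `2**31` bytes is already one byte past the cap, which is why payloads that merely approach 2 GiB are at risk.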

Fixes #127859.

Fix

1. Pre-zip per-platform dumps with ZipDirectory + <PayloadArchive>. ZipDirectory calls ZipFile.CreateFromDirectory, which writes the archive directly to a FileStream, so the 2 GiB cap does not apply. The Helix SDK's ArchivePayload (selected by <PayloadArchive>) uses File.OpenRead and streams the existing zip to blob storage without any in-memory buffering. CompressionLevel="Fastest" keeps the local zip step cheap; dump files don't compress meaningfully anyway.

This is the same pattern already used in src/tests/Common/helixpublishwitharcade.proj.

2. Drop osx from the x-plat dump set (TODO). Adds a separate cdacXPlatDumpPlatforms parameter to eng/pipelines/runtime-diagnostics.yml that excludes osx_arm64 / osx_x64, used by the three x-plat stages (CdacXPlatDumpGen, CdacXPlatDumpTests host platforms + artifact downloads, and the SourcePlatforms env var). Single-leg mode (cdacDumpTestMode=single-leg) still uses the full cdacDumpPlatforms list, so osx coverage there is preserved. Re-enable osx in the x-plat flow once the dump set shrinks or the Helix queues provide more disk.
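
The effect of the second fix can be pictured with a short Python sketch. In the PR itself cdacXPlatDumpPlatforms is a separate pipeline parameter rather than a computed filter, so this only models the intended outcome. `select_dump_platforms` is a hypothetical helper, and `linux_x64` / `windows_x64` in the usage note are placeholder entries, since the full platform list is not shown here.

```python
def select_dump_platforms(all_platforms, mode):
    """Return the source platforms whose dumps a given test mode consumes.

    mode is "xplat" or "single-leg", mirroring cdacDumpTestMode.
    """
    if mode == "xplat":
        # x-plat mode skips the osx legs until the disk-pressure issue is resolved.
        return [p for p in all_platforms if not p.startswith("osx_")]
    # single-leg mode keeps the full list, so osx coverage is preserved.
    return list(all_platforms)
```

With `["linux_x64", "osx_arm64", "osx_x64", "windows_x64"]` as input, x-plat mode would keep only the two non-osx placeholders, while single-leg mode would return all four.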

Validation

  • Verified locally that ZipDirectory + target-batching produces one .zip per source platform with Overwrite="true".
  • Re-validating by re-running runtime-diagnostics with cdacDumpTestMode=xplat against this PR.

The Helix SDK's <PayloadDirectory> code path (DirectoryPayload.UploadAsync)
zips the directory into a MemoryStream before upload. MemoryStream's backing
array is capped at int.MaxValue (~2 GiB), so per-platform dump payloads that
approach that size fail in CdacXPlatDumpTest with:

    System.IO.IOException: Stream was too long.
       at System.IO.MemoryStream.set_Capacity(Int32 value)

Switch the cDAC xplat dump tests to pre-zip each per-platform dump directory
with the MSBuild ZipDirectory task and ship the resulting .zip via
<PayloadArchive>. ArchivePayload uses File.OpenRead and streams directly to
blob storage, with no in-memory buffering. CompressionLevel=Fastest keeps
the local zip step cheap; dumps don't compress meaningfully anyway.

This follows the same pattern already used in
src/tests/Common/helixpublishwitharcade.proj.
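
As a rough analogue of the approach in this commit, the Python sketch below writes the archive straight to a file and then reads it back in fixed-size chunks, so the whole zip is never held in memory. It is not the MSBuild task or the Helix SDK implementation: `zip_directory_to_file`, `stream_file`, and `demo` are hypothetical names, and `compresslevel=1` is assumed to be a reasonable stand-in for CompressionLevel=Fastest.

```python
import os
import tempfile
import zipfile


def zip_directory_to_file(source_dir, dest_zip):
    """Write an archive of source_dir directly to dest_zip on disk."""
    # compresslevel=1 is the fastest deflate setting.
    with zipfile.ZipFile(dest_zip, "w", zipfile.ZIP_DEFLATED, compresslevel=1) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store entries relative to the source directory.
                archive.write(full_path, os.path.relpath(full_path, source_dir))


def stream_file(path, chunk_size=1024 * 1024):
    """Yield the file in chunks so only chunk_size bytes are in memory at once."""
    with open(path, "rb") as stream:
        while chunk := stream.read(chunk_size):
            yield chunk


def demo():
    """Zip a tiny fake dump directory and stream the archive back."""
    with tempfile.TemporaryDirectory() as work:
        source = os.path.join(work, "dumps")
        os.makedirs(source)
        with open(os.path.join(source, "sample.dmp"), "wb") as f:
            f.write(b"\0" * 4096)
        # Keep the archive outside the source directory so it is not zipped into itself.
        archive_path = os.path.join(work, "dumps.zip")
        zip_directory_to_file(source, archive_path)
        streamed = sum(len(chunk) for chunk in stream_file(archive_path, chunk_size=64))
        return streamed, os.path.getsize(archive_path)
```

Calling `demo()` returns the number of bytes streamed and the archive's size on disk; the two should match, since every byte of the file is read exactly once.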

Fixes dotnet#127859

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dotnet-policy-service
Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag
See info in area-owners.md if you want to be subscribed.

Copilot AI left a comment

Pull request overview

This PR updates the cDAC x-plat dump Helix submission project to avoid intermittent Helix SDK failures caused by the SDK’s <PayloadDirectory> path building a ZIP in a MemoryStream (which is capped at ~2 GiB). It pre-creates per-platform ZIPs on disk and ships them via <PayloadArchive> so upload can stream from a FileStream.

Changes:

  • Pre-zip each per-platform dump directory using MSBuild’s ZipDirectory task.
  • Switch Helix work items from <PayloadDirectory> to <PayloadArchive> pointing at the prebuilt ZIP.
  • Update inline documentation to explain the rationale and the 2 GiB cap being avoided.

The x-plat CdacXPlatDumpTests stage downloads every source platform's dump
artifacts onto each host, then runs one work item per source platform. The
osx_arm64 / osx_x64 dump payloads are large enough that the combined working
set exceeds Helix host disk space and the affected work items abort with
exit code -3 (Crash).

Add a separate cdacXPlatDumpPlatforms parameter that omits osx and use it for
the three x-plat stages (CdacXPlatDumpGen platforms, CdacXPlatDumpTests
platforms, the artifact-download/extract loops, and the SourcePlatforms env
var). Single-leg mode keeps the original cdacDumpPlatforms list, so osx
coverage is unaffected there.

osx coverage for the x-plat flow is left as a TODO referencing dotnet#127859 -
re-enable once the dump set shrinks or the Helix queues provide more disk.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@max-charlamb max-charlamb changed the title from "Pre-zip xplat dump payloads to bypass Helix SDK 2 GB MemoryStream cap" to "Fix cDAC x-plat dump tests: pre-zip payloads and exclude osx legs" on May 7, 2026
Copilot AI review requested due to automatic review settings May 7, 2026 03:14
Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@max-charlamb
/ba-g cDAC pipeline only change

@max-charlamb max-charlamb disabled auto-merge May 7, 2026 03:25
@max-charlamb max-charlamb merged commit ba1cb48 into dotnet:main May 7, 2026
23 of 154 checks passed
@max-charlamb max-charlamb deleted the fix/cdac-xplat-dump-payload-archive branch May 7, 2026 03:26
Development

Successfully merging this pull request may close these issues.

[ci-scan] Test failure: CdacXPlatDumpTest IOException: Stream was too long in runtime-diagnostics (pipeline 309)
