ci(distroless): require explicit tag input + validate Cargo.toml matches #133

Open
BimaPangestu28 wants to merge 1 commit into main from fix/distroless-publish-version-guard

Conversation

@BimaPangestu28
Member

Summary

Harden the publish-distroless workflow so a stale build can never own the :latest tag.

Why

Post-mortem from a 1.5-hour debug session: a Fargate deploy pinned ghcr.io/greenticai/greentic-start-distroless@sha256:57c381e… on the assumption that it was v0.5.18. Extracting the binary and running --version showed it was actually greentic-start 0.4.49 (built April 10), a build in which the warmup auto-adopt code did not yet exist. Meanwhile, :v0.5.18 resolved to a different digest containing the correct binary.

Root cause: workflow_dispatch had no inputs and no validation. Anyone (or any automation) could trigger the workflow on an old branch, the metadata-action would emit type=raw,value=latest regardless, and the resulting build would silently take ownership of :latest. The deployer's DEFAULT_GHCR_OPERATOR_IMAGE constant was effectively pointing at whatever was last manually dispatched, not whatever was last released.
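
The mismatch described above was only found by running --version by hand. As a rough illustration of automating that comparison, here is a minimal Python sketch. It is not part of this PR: the function names are hypothetical, and it assumes the binary prints output shaped like `greentic-start 0.4.49`, as seen during the debug session.

```python
import re


def reported_version(version_output: str) -> str:
    """Extract the version from output shaped like 'greentic-start 0.4.49'."""
    # Assumption: one name token, whitespace, then MAJOR.MINOR.PATCH with an
    # optional suffix. Anything else is rejected rather than guessed at.
    m = re.fullmatch(r"\S+\s+(\d+\.\d+\.\d+\S*)", version_output.strip(), re.ASCII)
    if m is None:
        raise ValueError(f"unrecognised --version output: {version_output!r}")
    return m.group(1)


def image_matches_tag(version_output: str, tag: str) -> bool:
    """True when the binary's reported version equals the tag minus its 'v'."""
    return reported_version(version_output) == tag.removeprefix("v")
```

With the values from the incident, `image_matches_tag("greentic-start 0.4.49", "v0.5.18")` is false, which is the signal that would have flagged the stale digest before it was deployed.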

Changes

  1. workflow_dispatch now requires an inputs.tag that must match ^v\d+\.\d+\.\d+([+-].+)?$. The workflow checks out that exact tag rather than github.ref (which on dispatch is the branch the dispatch ran from, not the release tag).
  2. New "Validate Cargo.toml version matches tag" step: if the version in Cargo.toml differs from the tag with its leading v stripped, the workflow fails before pushing. This stops mismatched builds from ever owning :latest.
  3. concurrency.group now keys on the tag (and the dispatch input) so two dispatches for different tags don't fight; cancel-in-progress is set to false so an in-flight build can't be silently overwritten by a later dispatch.
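
The logic behind changes 1 and 2 can be sketched in Python as follows. This is an illustration under stated assumptions, not the workflow's actual step: the function names are hypothetical, and the Cargo.toml reader is a deliberately simple line scan that assumes a plain `version = "x.y.z"` line under `[package]` rather than a full TOML parse. Only the tag pattern is taken from the PR description.

```python
import re

# Pattern copied from the PR description; re.ASCII restricts \d to 0-9.
TAG_RE = re.compile(r"^v\d+\.\d+\.\d+([+-].+)?$", re.ASCII)


def tag_to_version(tag: str) -> str:
    """Return the tag without its leading 'v', or raise if it is malformed."""
    if TAG_RE.fullmatch(tag) is None:
        raise ValueError(f"tag {tag!r} does not match {TAG_RE.pattern}")
    return tag[1:]


def package_version(cargo_toml: str) -> str:
    """Read `version` from the [package] table with a simple line scan."""
    in_package = False
    for raw in cargo_toml.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Track whether the current table is [package].
            in_package = line == "[package]"
            continue
        m = re.match(r'version\s*=\s*"([^"]+)"', line)
        if in_package and m:
            return m.group(1)
    raise ValueError("no [package] version found")


def check(tag: str, cargo_toml: str) -> None:
    """Raise ValueError unless the tag and the Cargo.toml version agree."""
    expected = tag_to_version(tag)
    actual = package_version(cargo_toml)
    if expected != actual:
        raise ValueError(f"Cargo.toml version {actual} != tag version {expected}")
```

Calling `check("v0.4.49", ...)` against a Cargo.toml at 0.5.18 raises, which corresponds to the workflow failing before any push. A malformed tag such as `latest` is rejected earlier, by `tag_to_version`.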

Test plan

  • YAML parses cleanly (python3 -c "import yaml; yaml.safe_load(...)")
  • Smoke: push a v0.5.18 tag (existing) — should succeed unchanged because Cargo.toml is on 0.5.18
  • Smoke: workflow_dispatch with tag=v0.5.18 from main with Cargo.toml at 0.5.x → should succeed
  • Smoke: workflow_dispatch with tag=v0.4.49 from main with Cargo.toml at 0.5.x → should fail at the validation step (this is the bug we're fixing)

The third smoke test reproduces the exact failure mode that triggered this PR.

Bug post-mortem: an unguarded `workflow_dispatch` allowed `:latest` to
be reassigned to whatever HEAD was checked out, even when that HEAD's
Cargo.toml version did not match a published v* tag. We hit this when
a deployer pin resolved `:latest` to an old v0.4.49 build that had
silently displaced v0.5.x as `:latest`, surfacing as a "warmup
auto-adopt is broken on Fargate" red-herring debug session.

Changes:
- workflow_dispatch now requires an `inputs.tag` matching `v<MAJOR>.<MINOR>.<PATCH>`,
  and the workflow checks out that exact tag (no more "build from
  whatever HEAD happens to be").
- New "Validate Cargo.toml version matches tag" step refuses to push
  if Cargo.toml version != tag version. Stops mismatched builds from
  ever owning the `:latest` tag.
- concurrency group now keys on tag so two dispatches for different
  tags don't fight each other; cancel-in-progress disabled to prevent
  silent overwrites of an in-flight build.