ci: unified tag-triggered release pipeline with SLSA provenance on the Release (Scorecard Signed-Releases → 10)#91

Merged: project-navi-bot merged 10 commits into main from ci/unified-release-provenance on May 28, 2026

@Fieldnote-Echo Fieldnote-Echo commented May 28, 2026

Owner

Summary

Surfaced by @Signal-Ridge-SysAdmin's external review of the published wheel: the bindings declared NumPy parameters with strict dtypes, so the dtypes NumPy produces by default raised opaque TypeErrors. That reported papercut generalised into two coherent boundary contracts and a unified, tag-triggered release pipeline that automates the build/attest/attach steps v0.2.0 had to do by hand (which is how its attestations got dropped → Scorecard Signed-Releases = 0):

vectors:        float-only, C-contiguous, finite after f32 coercion   (as_f32_1d / as_f32_2d)
candidate IDs:  integer labels, range-safe-coerced to u32             (as_u32_ids_1d)
release flow:   build → attest → SLSA provenance → stage on DRAFT → gated publishes →
                 byte-identity check (pre+post) → un-draft only after BOTH publishes succeed

The split is deliberate: vector values define ordinal structure (bool/narrow-int must be rejected — they'd rank-transform to garbage); candidate IDs are labels (so int64 must be accepted — rejecting it is hostile). And the release flow refuses to make a public GitHub Release for a version the registries refused.

Commit 1 — candidate IDs (as_u32_ids_1d)

int64 candidate arrays (NumPy's default for np.arange, np.where()[0], fancy-indexing, np.argpartition) used to bounce off with TypeError: 'ndarray' object cannot be cast as 'ndarray'. Now: accept any range-safe integer dtype (negatives, >= 2**32, non-integer → fail-loud ValueError/TypeError); uint32 still borrowed zero-copy.

Commit 2 — embeddings (as_f32_1d / as_f32_2d)

Every f32 embedding param was strict, so float64 (NumPy's default) and float16 raised TypeError. ordvec's premise is float vector in → rank/sign transform; f32 is the internal working dtype, not a caller contract.

  • Coerce float16/float32/float64 → float32 (rank/sign transforms are order-only; f64→f32 rounding is monotonic; the asym LUT scores against f32-quantised docs).
  • Reject bool + integer dtypes (silent degenerate-rank artefact); reject complex/object/string and wrong ndim.
  • Reject non-C-contiguous BEFORE coercion (a transposed f64 is never silently laundered into a contiguous f32 — the copy decision stays with the caller).
  • All-finite check AFTER coercion (catches f64 > f32::MAX, which rounds to +inf).
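
A small illustration of the coercion argument in the first bullet, using only the Python standard library. `to_f32` and `ranks` are hypothetical helpers, not ordvec functions, and `struct` stands in for the f64→f32 cast the bindings perform. The sketch checks that rounding to f32 leaves the sort order and the signs of a sample vector unchanged, and it shows that "monotonic" here is non-strict: two f64 values closer than one f32 ulp collapse into a tie.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (f64) to the nearest f32, then widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def ranks(values):
    """Indices that would sort `values` ascending (stable on ties)."""
    return sorted(range(len(values)), key=lambda i: values[i])


f64_vec = [0.3, -1.25, 2.0000001, 0.1, 7.5e-3]
f32_vec = [to_f32(v) for v in f64_vec]

# Rounding never reverses an ordering: a <= b implies f32(a) <= f32(b).
assert ranks(f64_vec) == ranks(f32_vec)

# Signs survive as well, which is all a sign transform consumes.
assert [v < 0 for v in f64_vec] == [v < 0 for v in f32_vec]

# Non-strict monotonicity: distinct f64 inputs can become equal in f32.
a, b = 1.0, 1.0 + 1e-12
assert a < b and to_f32(a) == to_f32(b)
```

Values beyond the f32 range are outside this sketch: `struct.pack` raises `OverflowError` for them, whereas the PR describes the coerced value becoming +inf and being caught by the finite check.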

Commits 3+ — unified release pipeline (release.yml)

Replaces changelog.yml + release-crate.yml + release-python.yml with one tag-triggered workflow whose job graph forces the assemble-then-publish coordination the old split-workflow design left to a manual step.

Provenance / attestation, soup to nuts (all genuine, nothing faked):

| Layer | Tool (pinned) | Lands | Buys |
|---|---|---|---|
| SLSA provenance | slsa-github-generator@v2.1.0, upload-assets: false (a workflow-artifact `*.intoto.jsonl` is the SOLE provenance file, attached via the single release-assets writer) | multiple.intoto.jsonl on the Release | Scorecard provenance = 10 + SLSA L3 |
| GitHub attestation | actions/attest-build-provenance@v4.1.0 | attestation store + `*.sigstore.json` on Release | gh attestation verify, signing-probe backup |
| PyPI | pypa/gh-action-pypi-publish@v1.14.0 (attestations default ON) | PyPI Integrity API | PEP 740 |
| crates.io | rust-lang/crates-io-auth-action@v1.0.4 (Trusted Publishing OIDC) | | no stored token |

Job graph (on tag-push of a strict vX.Y.Z):

tag vX.Y.Z
  guard (strict SemVer)
  require-ci-green   (ci.yml + python.yml + fuzz.yml + codeql.yml green on main for this SHA)
  notes              (git-cliff → DRAFT GitHub Release with notes)
  build-crate · build-wheels (4 legs) · build-sdist
  attest             → GitHub attestation store + ordvec-<v>.sigstore.json (artifact)
  combine-hashes → provenance (slsa-github-generator, upload-assets:false → multiple.intoto.jsonl artifact)
  release-assets-draft  (stages .crate + .whl + .tar.gz + .sigstore.json + .intoto.jsonl on DRAFT; DOES NOT un-draft) [automated]
  ├─ publish-crate   environment: crates-io   (Required reviewer) ← GATED
  │     • download attested `dist-crate`
  │     • cargo package -p ordvec --locked → sha256 == attested?  (pre-publish fail-closed BEFORE OIDC token mint)
  │     • crates-io-auth-action (OIDC) → cargo publish -p ordvec --locked
  │     • curl https://crates.io/api/v1/crates/ordvec/<v>/download → sha256 == attested?  (post-publish empirical proof; fail-closed)
  └─ publish-pypi    environment: pypi         (Required reviewer) ← GATED
        • Trusted Publishing (OIDC); PEP 740 attestations on by default
  publish-github-release  needs:[publish-crate, publish-pypi]  → `gh release edit <tag> --draft=false`
        (SOLE un-draft point — if either publish fails, Release stays DRAFT)
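
The fail-closed property of this graph can be modelled in a few lines of Python. This is only a sketch of default `needs:` behaviour (a job is skipped when any job it needs did not succeed), with the upstream guard, build, attest and provenance jobs collapsed into `release-assets-draft`. The `run` and `release_is_public` helpers are hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter

# job -> jobs it needs, transcribed from the tail of the job graph above.
NEEDS = {
    "release-assets-draft": set(),
    "publish-crate": {"release-assets-draft"},
    "publish-pypi": {"release-assets-draft"},
    "publish-github-release": {"publish-crate", "publish-pypi"},
}


def run(failing=frozenset()):
    """Return {job: 'success' | 'failure' | 'skipped'} for one pipeline run."""
    status = {}
    # static_order() yields every job after all the jobs it needs.
    for job in TopologicalSorter(NEEDS).static_order():
        if any(status[dep] != "success" for dep in NEEDS[job]):
            status[job] = "skipped"
        elif job in failing:
            status[job] = "failure"
        else:
            status[job] = "success"
    return status


def release_is_public(status):
    """The Release leaves DRAFT only if the sole un-draft job succeeded."""
    return status["publish-github-release"] == "success"


assert release_is_public(run())
assert not release_is_public(run({"publish-crate"}))
assert not release_is_public(run({"publish-pypi"}))
```

Note what the model does not prevent: if `publish-crate` fails, `publish-pypi` can still succeed, so the registries themselves may end up half-published. The graph only guarantees that the GitHub Release stays DRAFT in that case.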

What this PR fixes about how v0.2.0 broke

  1. Manual asset attach dropped attestations → release-assets-draft is a single coordinated writer with explicit needs: edges; no human in the asset-attach loop.
  2. Scorecard Signed-Releases = 0 (no .intoto.jsonl on the Release) → SLSA generator's signed multiple.intoto.jsonl is attached automatically. Scorecard floor-averages 5 releases, so the score lifts at the next stable tag.
  3. Public GitHub Release could exist for a version the registries refused → un-draft is in its own job downstream of both publishes; failed publish ⇒ Release stays DRAFT.
  4. cargo publish could upload a different .crate than the SLSA-attested one → publish-crate proves byte-identity twice: a pre-publish cargo package sha256-compare (fast-fail before OIDC token mint), and a post-publish curl crates.io/.../download sha256-compare (empirical proof the bytes crates.io actually serves equal the attested bytes; mismatch fails the job → no un-draft).
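
The byte-identity check in item 4 reduces to comparing two SHA-256 digests and failing the job on mismatch. The workflow itself does this in shell steps; the Python below is only a sketch of the same comparison, and `sha256_file`, `require_identical` and `ByteIdentityError` are hypothetical names introduced for illustration.

```python
import hashlib


class ByteIdentityError(RuntimeError):
    """Raised when a candidate artifact differs from the attested one."""


def sha256_file(path, chunk_size=1 << 20):
    """Hex SHA-256 of a file, read in chunks so large artifacts stay cheap."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def require_identical(attested_path, candidate_path, stage):
    """Return the shared digest, or raise so the calling job fails closed."""
    attested = sha256_file(attested_path)
    candidate = sha256_file(candidate_path)
    if attested != candidate:
        raise ByteIdentityError(
            f"{stage}: sha256 mismatch (attested={attested}, candidate={candidate})"
        )
    return attested
```

In the flow described above the same comparison would run twice: once against the locally repackaged .crate before the OIDC token is minted, and once against the file downloaded from crates.io after publishing.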

Full OIDC — no stored tokens

Already the case and carried over: crates.io via rust-lang/crates-io-auth-action (ephemeral token), PyPI via Trusted Publishing. There is no CARGO_REGISTRY_TOKEN secret anywhere.

Anti-regression — tests/release_signed_release_invariants.sh

Structural lint over release.yml, wired into ci.yml's release-guard (runs every push/PR). Asserts the whole signed-release graph: release-assets-draft needs attest+provenance+CI, uploads every required suffix, does NOT un-draft; SLSA generator tag-pinned with upload-assets: false and a *.intoto.jsonl provenance-name; attest grants id-token+attestations:write; both publishes grant id-token:write and need the draft assets; publish-crate actually performs the pre-publish AND post-publish byte-identity checks; publish-github-release needs both publishes and is the only un-draft point.

A future PR that silently weakens any of these edges fails CI here, not at the next release.

⚠️ REQUIRED before the first real tag (all fail CLOSED until done — zero risk if missed)

| # | What | Where |
|---|---|---|
| 1 | Trusted Publisher → workflow = release.yml | crates.io publisher config |
| 2 | Trusted Publisher → workflow = release.yml | PyPI publisher config |
| 3 | Env "Deployment branches and tags" → tag pattern `v[0-9]*.[0-9]*.[0-9]*` | GitHub Environments crates-io and pypi (the old main-branch-only setting from the dispatch model would now deadlock publishing, because the workflow runs on refs/tags/..., never on refs/heads/main) |
| 4 | Fork dry-run + Scorecard Signed-Releases check against the fork | locally / on a fork (the fork physically cannot publish: OIDC is bound to Fieldnote-Echo/ordvec) |

Test plan

  • actionlint clean; YAML parses; zizmor clean (No findings to report).
  • tests/release_publish_invariants.sh OK (SBOM cleanup ordering preserved).
  • tests/release_signed_release_invariants.sh OK (signed-release graph pinned).
  • pytest ordvec-python/tests → 491 passed (the dtype-boundary commits' coverage).
  • Two code-reviewer passes (one per dtype-boundary commit): APPROVE, 0 findings.
  • Fork dry-run validates the SLSA-generator-artifact → release-assets-draft handoff end-to-end and confirms Signed-Releases hits 10 against the fork's Release.

Notes

  • The SLSA reusable workflow is tag-pinned (@v2.1.0), not SHA-pinned — mandatory for its self-verification trust model; carries # zizmor: ignore[unpinned-uses].
  • Trigger is tag-only (v[0-9]*.[0-9]*.[0-9]*); the guard job enforces strict SemVer.
  • Follow-ups deliberately scoped out: post-release verification job for PyPI side (downloading from the Integrity API), harden-runner block mode (audit → block once an allowlist is observed), strict "tag SHA == main HEAD" (today require-ci-green enforces "SHA must be on main with green push-event CI"), and promoting the optional tag ruleset to mandatory.
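
Why the trigger pattern alone is not enough, and the guard job is needed, can be seen by comparing the two. In this sketch `fnmatch` only approximates GitHub's tag filter syntax, and `STRICT` is an assumed regex for "stable vX.Y.Z" (no leading zeros, no pre-release or build suffix). The actual guard in release.yml is not shown in this PR description and may differ.

```python
import re
from fnmatch import fnmatchcase

# Trigger / environment tag pattern quoted in this PR.
TAG_GLOB = "v[0-9]*.[0-9]*.[0-9]*"

# ASSUMPTION: one plausible strict-SemVer guard, not the workflow's own regex.
STRICT = re.compile(r"v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def triggers(tag: str) -> bool:
    """Would the (approximated) glob start the workflow for this tag?"""
    return fnmatchcase(tag, TAG_GLOB)


def passes_guard(tag: str) -> bool:
    """Would the assumed strict-SemVer guard let the run continue?"""
    return STRICT.fullmatch(tag) is not None


for tag in ["v1.2.3", "v1.2.3-rc.1", "v01.2.3", "v1.2.3.4", "v1.2"]:
    print(f"{tag:12} triggers={triggers(tag)!s:5} guard={passes_guard(tag)}")
```

Under these assumptions, tags such as `v1.2.3-rc.1`, `v01.2.3` and `v1.2.3.4` match the glob but are stopped by the guard, while `v1.2` never triggers at all.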

Credit

Reported by @Signal-Ridge-SysAdmin, who reviewed the published wheel and caught the candidate-ID dtype papercut (and the broader "should this keep the floats at all?" question that motivated the embedding contract). Release-pipeline rigour (byte-identity, fail-closed un-draft, anti-regression invariants) shaped by grumpy's adversarial review.

🤖 Generated with Claude Code

Fieldnote-Echo and others added 4 commits May 27, 2026 15:01
RankQuant.search_asymmetric_subset and Bitmap.body_overlap_scores_subset
declared their candidate/doc-id arrays as PyReadonlyArray1<u32>, which
rust-numpy matches strictly. The int64 arrays NumPy produces by default
(np.arange, np.where()[0], np.array([...]), fancy indexing, np.argpartition)
therefore raised an opaque "ndarray cannot be cast as ndarray" TypeError.
ordvec's own top_m_candidates* emit uint32, so the happy path worked and the
suite never exercised a user-built candidate set.

Accept any integer dtype and convert to the core's u32 with checked bounds:
negatives and values >= 2^32 raise a clear ValueError rather than silently
wrapping (np.asarray(-1, uint32) -> 4294967295, 2**32 -> 0, which would score
the wrong document). Already-uint32 contiguous arrays are still borrowed
zero-copy; other dtypes are copied once. The >= n bound check and the
body_overlap sorted-ids policy are unchanged.
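
The difference between wrapping and checked conversion can be shown without NumPy. The sketch below is a pure-Python model of the contract described in this message, not the Rust helper itself; `wrap_u32` and `checked_u32_ids` are hypothetical names. The bool rejection follows the later embedding-contract commit.

```python
U32_MAX = 2**32 - 1


def wrap_u32(value: int) -> int:
    """What an unchecked cast does: reduce the value modulo 2**32."""
    return value % 2**32


def checked_u32_ids(ids):
    """Return ids unchanged if every one is a non-bool int in [0, 2**32)."""
    out = []
    for i, v in enumerate(ids):
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(v, bool) or not isinstance(v, int):
            raise TypeError(f"ids[{i}]={v!r}: candidate ids must be integers")
        if v < 0:
            raise ValueError(f"ids[{i}]={v}: candidate ids must be non-negative")
        if v > U32_MAX:
            raise ValueError(f"ids[{i}]={v}: candidate id does not fit in u32")
        out.append(v)
    return out


# Silent wrapping would point at a different, valid-looking document.
assert wrap_u32(-1) == 4294967295
assert wrap_u32(2**32) == 0

# Checked conversion accepts the full u32 range by value.
assert checked_u32_ids([0, 7, U32_MAX]) == [0, 7, U32_MAX]
```

The separate `>= n` corpus bound mentioned above still applies after this step; the model covers only the dtype and range part of the contract.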

Flips the two red-team tests that asserted the old strict-uint32 rejection to
the new contract (integer dtypes converted by value, non-integer dtypes still
TypeError) and adds dtype-matrix + fail-loud regression tests. 390 pytest pass.

Reported via external review: candidate ids naturally arrive as int64.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…d contract

Every embedding parameter was declared PyReadonlyArray{1,2}<f32>, which rust-numpy
matches strictly — so float64 (NumPy's default for np.array([...]) and most API
embeddings) and float16 raised an opaque TypeError. The premise of ordvec is
'float vector in -> rank/sign transform', so float32 is the internal working dtype,
not a contract the caller must pre-satisfy.

Add two choke-point helpers, as_f32_1d / as_f32_2d, that every embedding entry point
(18 methods + free functions) routes through, so dtype/layout/finite policy is
defined once:
  - coerce float16/float32/float64 -> float32. The rank/sign transforms are
    order/sign-only and f64->f32 rounding is monotonic, so coercion is faithful; the
    asymmetric LUT scores against f32-quantised docs, so sub-f32 query precision is
    meaningless there too.
  - REJECT bool + all integer dtypes (TypeError): a {0,1}/narrow-int vector
    rank-transforms to a degenerate index-tie artefact (silent retrieval garbage) — a
    deliberate usage-error guard, not an ergonomic gap.
  - reject complex/object/string (complex would silently drop the imaginary part).
  - reject wrong ndim (TypeError).
  - reject non-C-contiguous input (ValueError) BEFORE coercion, so a transposed
    float64 is never silently laundered into a contiguous float32 (a hidden copy can
    dominate runtime / poison benchmarks — the copy decision stays with the caller).
  - all-finite check AFTER coercion (f64 > f32::MAX rounds to +inf — caught here).

Candidate IDs are a different boundary (labels, not measurements), so they keep the
permissive contract: rename coerce_candidate_ids -> as_u32_ids_1d, accept any
range-safe integer dtype (int64 included), reject bool/float/negative/overflow with
sharper messages. Coherent split: vectors = float-only/C-contiguous/finite;
candidate IDs = integer labels range-safe-coerced to u32.

Core, persistence (.tvr/.tvrq/.tvbm/.tvsb store no floats), and the integer
primitives are untouched. Inverts the 5 tests that asserted the old strict-f32
rejection (now coerced+faithful); adds test_input_dtype.py covering the full
accept/reject matrix across all four index types. 483 pytest pass; fmt + clippy
-D warnings clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
The choke-point helpers coerced non-float32 input to float32 (a full copy) before
the caller's width check ran, so a wrong-width float64 array was fully converted
only to be rejected — wasteful, and a potential OOM on a large misshapen input.

Move the width check into as_f32_1d / as_f32_2d via a cheap shape-metadata read
(axis_len), so dtype -> ndim -> width -> contiguity are all validated on the
ORIGINAL array before the ascontiguousarray copy. as_f32_2d takes the expected
dim; as_f32_1d takes Option<usize> (rank_transform has no width constraint). Adds
a wrong-width-float64 regression across all four index types. 491 pytest pass;
fmt + clippy -D warnings clean.
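
The ordering argument can be sketched as a metadata-only validator. This is a model, not the Rust code: `ArrayMeta` is a hypothetical stand-in for the dtype, shape and layout that can be read without touching array data. The `ValueError` for a width mismatch and the "borrow" outcome for float32 input are assumptions, since this message does not state them.

```python
from dataclasses import dataclass

FLOAT_DTYPES = {"float16", "float32", "float64"}


@dataclass(frozen=True)
class ArrayMeta:
    """Hypothetical stand-in for metadata readable without a copy."""
    dtype: str
    shape: tuple
    c_contiguous: bool


def validate_2d(meta: ArrayMeta, expected_dim: int) -> str:
    """Check dtype -> ndim -> width -> contiguity, then say what happens next."""
    if meta.dtype not in FLOAT_DTYPES:
        raise TypeError(f"expected a float dtype, got {meta.dtype}")
    if len(meta.shape) != 2:
        raise TypeError(f"expected a 2-D array, got {len(meta.shape)}-D")
    if meta.shape[1] != expected_dim:
        # ASSUMPTION: the exception type for a width mismatch is not given above.
        raise ValueError(f"expected width {expected_dim}, got {meta.shape[1]}")
    if not meta.c_contiguous:
        raise ValueError("expected a C-contiguous array")
    # Only now would any data be copied; the finite check runs after that.
    return "borrow" if meta.dtype == "float32" else "copy-to-float32"


# A large wrong-width float64 input is rejected before any conversion is planned.
try:
    validate_2d(ArrayMeta("float64", (10_000_000, 65), True), expected_dim=64)
except ValueError as err:
    print("rejected without copying:", err)
```

Because every check reads only metadata, the cost of rejecting a misshapen input is independent of its size, which is the OOM concern this commit addresses.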

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…e Release

Replaces changelog.yml + release-crate.yml + release-python.yml with one
tag-triggered release.yml. Cutting a stable vX.Y.Z tag now fully automates
build (crate + wheels + sdist) -> GitHub artifact attestation -> SLSA Build-L3
provenance -> attach ALL assets (incl. multiple.intoto.jsonl + a .sigstore.json
bundle) to the GitHub Release -> un-draft. Only the crates.io + PyPI publishes
are gated (Environments with Required Reviewers). Full OIDC, no stored tokens.

Fixes OpenSSF Scorecard Signed-Releases = 0: v0.2.0's assets were attached by
hand and the attestations dropped. Scorecard's probes read ONLY GitHub
release-asset filenames (.intoto.jsonl -> 10, .sigstore.json -> 8; registry/PEP740
ignored by design). The slsa-github-generator multiple.intoto.jsonl gives the
provenance probe its 10; the attest-build-provenance .sigstore.json is a backup
signing-probe asset + powers 'gh attestation verify'. A single release-asset
writer (release-assets) + explicit needs: edges remove the cross-workflow
coordination that forced the manual attach. Fail-closed: publishes need attest
+ provenance.
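
A rough model of the scoring described here may help explain why a single good release does not immediately yield a 10. It encodes only what this PR states (per-release score from asset filename suffixes, floor-averaged over the five most recent releases) and has not been checked against Scorecard's source. The function names and the example asset lists are hypothetical.

```python
def release_score(asset_names):
    """Per-release score from asset filename suffixes, as described in this PR."""
    if any(name.endswith(".intoto.jsonl") for name in asset_names):
        return 10  # provenance probe
    if any(name.endswith(".sigstore.json") for name in asset_names):
        return 8   # signing probe
    return 0


def signed_releases(releases, window=5):
    """Floor of the mean per-release score over the newest `window` releases."""
    recent = releases[:window]  # releases are listed newest first
    if not recent:
        return None
    return sum(release_score(assets) for assets in recent) // len(recent)


# Hypothetical asset lists, newest release first.
with_provenance = ["pkg.crate", "pkg.sigstore.json", "multiple.intoto.jsonl"]
hand_attached = ["pkg.crate", "pkg.whl"]

assert signed_releases([hand_attached]) == 0
assert signed_releases([with_provenance, hand_attached]) == 5
assert signed_releases([with_provenance]) == 10
```

Under this model a fork whose only Release carries provenance scores 10, while the main repository's score rises gradually as releases without provenance age out of the window.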

REQUIRES before the first real tag (both fail CLOSED at the gate until done):
(1) crates.io + PyPI Trusted-Publisher configs re-pointed to workflow=release.yml
(env stays crates-io / pypi); (2) a fork dry-run to validate end-to-end — a fork
cannot publish (Trusted Publishing OIDC is bound to this repo). actionlint clean;
NOT yet run against a real tag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@qodo-code-review


Review Summary by Qodo

Normalize vector/id input dtypes at FFI boundary; unified release pipeline with SLSA provenance

✨ Enhancement 🧪 Tests

Walkthroughs

Description
• Accept any integer dtype for candidate/doc-id arrays, converting to u32 with checked bounds
  - Eliminates opaque TypeError when passing NumPy's default int64 index arrays
  - Negative ids and values >= 2^32 raise clear ValueError; already-uint32 arrays borrowed zero-copy
• Normalize float vector input to f32 at FFI boundary, accepting float16/float32/float64
  - Coerces NumPy's default float64 to float32; f64→f32 rounding is monotonic and lossless for
  rank/sign transforms
  - Rejects bool/integer/complex/object/string arrays deliberately (degenerate index artifacts)
• Add comprehensive dtype/layout boundary tests covering acceptance, rejection, and coercion
• Unified tag-triggered release pipeline with SLSA Build-L3 provenance and GitHub attestations
  - Replaces three independent workflows (changelog/release-crate/release-python) with single
  release.yml
  - Coordinates build → attest → provenance → Release assets → un-draft → gated publishes
Diagram
flowchart LR
  A["NumPy arrays<br/>any float/int dtype"] -->|as_f32_1d/2d| B["float32<br/>coercion"]
  A -->|as_u32_ids_1d| C["u32 ids<br/>checked bounds"]
  B -->|finite check| D["Index methods<br/>Rank/RankQuant/Bitmap/SignBitmap"]
  C -->|range check| D
  E["Tag vX.Y.Z"] -->|guard| F["SemVer + CI green"]
  F -->|build| G["crate + wheels + sdist"]
  G -->|attest| H["GitHub attestations<br/>+ .sigstore.json"]
  G -->|provenance| I["SLSA .intoto.jsonl"]
  H -->|release-assets| J["GitHub Release<br/>un-draft"]
  I -->|release-assets| J
  J -->|gated| K["publish-crate<br/>publish-pypi"]

File Changes

1. ordvec-python/src/lib.rs ✨ Enhancement +315/-92
   Add dtype coercion helpers and refactor FFI boundaries

2. ordvec-python/tests/test_input_dtype.py 🧪 Tests +189/-0
   New comprehensive dtype/layout acceptance and rejection tests

3. ordvec-python/tests/test_input_guards.py 🧪 Tests +122/-0
   Add candidate/doc-id dtype conversion and edge-case tests

4. ordvec-python/tests/test_redteam_fuzz.py 🧪 Tests +41/-8
   Update dtype rejection lists for new coercion contract

5. ordvec-python/tests/test_rank_quant.py 🧪 Tests +15/-7
   Flip float64 rejection test to coercion acceptance test

6. ordvec-python/tests/test_bitmap.py 🧪 Tests +11/-5
   Flip float64 rejection test to coercion acceptance test

7. ordvec-python/tests/test_rank.py 🧪 Tests +13/-7
   Flip float64 rejection test to coercion acceptance test

8. ordvec-python/tests/test_sign_bitmap.py 🧪 Tests +13/-5
   Flip float64 rejection test to coercion acceptance test

9. .github/workflows/release.yml ⚙️ Configuration changes +491/-0
   New unified tag-triggered release pipeline with SLSA provenance

10. .github/workflows/changelog.yml Additional files +0/-90
    ...

11. .github/workflows/release-crate.yml Additional files +0/-158
    ...

12. .github/workflows/release-python.yml Additional files +0/-226
    ...



@qodo-code-review

qodo-code-review Bot commented May 28, 2026


Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)


Action required

1. Release guard CI breaks ✓ Resolved 🐞 Bug ≡ Correctness
Description
ci.yml runs tests/release_publish_invariants.sh, which hard-codes
.github/workflows/release-python.yml and looks for an explicit *.cdx.json delete command in the
PyPI publish job; this PR replaces that workflow with release.yml whose cleanup doesn’t match the
script’s pattern. As a result, the release-guard CI job will fail closed and block merges.
Code

.github/workflows/release.yml[R479-487]

Evidence
CI will always execute the invariant script, and the script fails if the target workflow file is
missing or if it cannot find a matching cleanup command. The new unified workflow’s publish step
uses a different cleanup command that the script does not recognize.

.github/workflows/ci.yml[165-189]
tests/release_publish_invariants.sh[26-83]
.github/workflows/release.yml[463-491]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The `release-guard` job in `.github/workflows/ci.yml` invokes `tests/release_publish_invariants.sh`, but that script is coupled to the deleted `release-python.yml` and to an older PyPI cleanup approach (explicit `*.cdx.json` deletion). The new unified `release.yml` uses a different publish job name and a broader cleanup command, so the invariant check will fail.

## Issue Context
- CI currently validates release-publish invariants on every push/PR.
- The invariant script must be updated to inspect `.github/workflows/release.yml` and to accept the new cleanup behavior (`find dist -type f ! -name '*.whl' ! -name '*.tar.gz' -delete`) as satisfying the “no SBOM in dist upload dir” constraint.

## Fix Focus Areas
- tests/release_publish_invariants.sh[5-83]
- .github/workflows/ci.yml[165-188]
- .github/workflows/release.yml[463-491]




Remediation recommended

2. Release concurrency not global ✓ Resolved 🐞 Bug ☼ Reliability
Description
The workflow claims it “serializes releases” but sets concurrency.group to `release-${{ github.ref
}}`, which is unique per tag and therefore allows multiple tag releases to run concurrently. This
defeats the stated “one release at a time” intent and can create overlapping environment gates and
parallel release assembly.
Code

.github/workflows/release.yml[R48-51]

Evidence
The concurrency comment states releases should be serialized, but using github.ref in the group
key creates different groups for different tags, allowing concurrent runs.

.github/workflows/release.yml[48-51]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`release.yml` intends to serialize releases, but its concurrency group is tag-scoped (`release-${{ github.ref }}`), so different tags can run in parallel.

## Issue Context
If the intent is to ensure only one release pipeline runs at a time (as the comment states), the concurrency group should be global (e.g., `${{ github.workflow }}` or a constant like `release`).

## Fix Focus Areas
- .github/workflows/release.yml[48-51]



3. Release docs reference deleted flows ✓ Resolved 🐞 Bug ⚙ Maintainability
Description
Documentation still describes changelog.yml / release-crate.yml / release-python.yml and
dispatch-only releases, but this PR replaces them with a tag-triggered release.yml. This will
mislead maintainers about the correct release trigger and setup steps (e.g., which workflow name
Trusted Publishing should reference).
Code

.github/workflows/release.yml[R9-14]

Evidence
These docs explicitly reference the old workflow filenames/trigger model, while the new workflow
itself states it replaces the old three, making the documentation inconsistent with the codebase
after this PR.

.github/workflows/release.yml[9-14]
RELEASING.md[8-29]
CONTRIBUTING.md[89-92]
THREAT_MODEL.md[242-250]
cliff.toml[7-11]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Multiple docs/config notes still reference the deleted `changelog.yml`, `release-crate.yml`, and `release-python.yml` workflows and describe dispatch-only release behavior. After this PR, the unified tag-triggered `release.yml` is the release mechanism.

## Issue Context
Leaving these references in place will cause incorrect operational behavior (following the wrong checklist / searching for non-existent workflow files).

## Fix Focus Areas
- RELEASING.md[8-29]
- CONTRIBUTING.md[83-102]
- THREAT_MODEL.md[242-250]
- cliff.toml[7-11]
- .github/workflows/python.yml[8-14]


@codecov

codecov Bot commented May 28, 2026


Codecov Report

❌ Patch coverage is 96.68508% with 6 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ordvec-python/src/lib.rs | 96.68% | 6 Missing ⚠️ |


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the Python FFI boundary for the ordvec library to automatically coerce and validate input NumPy arrays. Specifically, it introduces helper functions to coerce float16, float32, and float64 arrays to C-contiguous float32 arrays, and to safely convert any integer dtype candidate or document ID array to u32 with checked bounds. It also adds robust error handling to raise appropriate Python exceptions (such as TypeError, ValueError, and IndexError) for invalid inputs, and updates the index classes (Rank, RankQuant, Bitmap, and SignBitmap) to use these new helpers. Comprehensive unit and integration tests have been added to verify these coercion and validation behaviors. I have no feedback to provide as there are no review comments.

Comment thread .github/workflows/release.yml
Fieldnote-Echo and others added 2 commits May 28, 2026 09:03
…zizmor cache-poisoning

Addresses all three failing checks on PR #91 + all three qodo bugs:

* actionlint (SC2035, x3): sha256sum *.crate *.whl *.tar.gz could be tricked by a
  hostile filename starting with '-'. Use ./*.glob form.
* zizmor (HIGH, cache-poisoning): sccache: 'true' on maturin-action in a release
  workflow allows a poisoned cache to inject code into the shipped wheel. Disable
  sccache on the release path (python.yml keeps it for the PR/main cadence).
* tests/release_publish_invariants.sh: was coupled to the deleted release-python.yml
  and the explicit '*.cdx.json' delete pattern. Re-point at release.yml, the
  publish-pypi job, and accept the new keep-only-wheels/tar.gz cleanup form
  (qodo bug 1).
* release.yml concurrency group: 'release-${{ github.ref }}' was per-tag and so
  allowed multiple tag pipelines to run concurrently — contrary to the 'serialize
  releases' comment. Use a constant 'release' group for true global serialization
  (qodo bug 2).
* Docs / comments referring to deleted workflows: RELEASING.md (significantly
  rewritten for the tag-triggered + gated-publish model), CONTRIBUTING.md,
  THREAT_MODEL.md, cliff.toml, .github/workflows/python.yml. python.yml's stale
  'Intel wheel is still built + shipped' claim corrected (no Intel wheel ships —
  issue #29) (qodo bug 3).

Locally: actionlint clean, zizmor clean ('No findings to report'),
tests/release_publish_invariants.sh OK.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…codex stop-gate)

Codex flagged that the env-policy I'd documented ("Deployment branches and
tags = main-only") was inherited from the old workflow_dispatch model and
would now DEADLOCK the new tag-triggered publish: the workflow runs on
`refs/tags/vX.Y.Z`, not `refs/heads/main`, so a branch-only allowlist refuses
every tag-triggered run at the environment gate.

Update RELEASING.md, THREAT_MODEL.md, and the release.yml header comment to
specify the correct env policy: "Selected branches and tags" with a single
TAG pattern `v[0-9]*.[0-9]*.[0-9]*`. The "tag must come from main" guarantee
is preserved by `require-ci-green` (queries `?branch=main&status=success` for
the SHA, which only returns a hit if the SHA was pushed to main) plus branch
protection on `main`. An optional tag ruleset can be added as defence in
depth.

Doc-only change; actionlint still clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…codex stop-gate)

Two stale spots survived the env-policy fix:

* THREAT_MODEL.md THREAT-SUPPLY-001 still described the env policy as
  "restrict deployment to the main branch only" — the branch-only policy that
  would deadlock the new tag-triggered workflow. Rewritten to specify the tag
  pattern and the require-ci-green + branch-protection chain that preserves
  the "must come from main" guarantee. Also adjusted the social-engineering
  residual ("dispatcher and approver" -> "cuts the release tag and approves
  both publishes") to match the new model.

* .github/workflows/ci.yml release-guard preamble still called the release
  workflows "workflow_dispatch-only." Updated to describe release.yml's
  tag-triggered + Environment-gated model.

Doc-only / comments-only; no YAML semantic change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Copilot AI left a comment


Pull request overview

This PR replaces three independent release workflows (changelog.yml, release-crate.yml, release-python.yml) with a single tag-triggered release.yml pipeline that automates build → attest → SLSA provenance → Release-asset attach → un-draft, with only the two registry publishes (crates.io, PyPI) gated behind GitHub Environments with required reviewers. It also broadens the Python FFI binding to accept float16/float32/float64 embeddings (coerced to f32 at the boundary) and any integer dtype for candidate/doc IDs (checked-conversion to u32), with comprehensive new tests covering the updated dtype/layout contract.

Changes:

  • Add unified release.yml (tag-triggered, SemVer-guarded, with actions/attest-build-provenance + slsa-github-generator providing .intoto.jsonl and .sigstore.json Release assets) and delete the three old workflows.
  • Broaden Python FFI: new helpers as_f32_1d/as_f32_2d (coerce f16/f64→f32, enforce C-contiguity before coercion, finite-check after) and as_u32_ids_1d/check_ids_in_range (accept any integer dtype with checked conversion).
  • Update tests/release_publish_invariants.sh and supporting docs (RELEASING.md, THREAT_MODEL.md, CONTRIBUTING.md, cliff.toml) to match the new tag-triggered model, and replace *_rejected float64 tests with *_coerced equivalents.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| .github/workflows/release.yml | New unified tag-triggered release pipeline with SLSA L3 provenance and gated env publishes. |
| .github/workflows/release-crate.yml, release-python.yml, changelog.yml | Deleted; functionality folded into release.yml. |
| .github/workflows/ci.yml, python.yml | Comment updates pointing to the new workflow filename. |
| tests/release_publish_invariants.sh | Retargeted to release.yml/publish-pypi job and accepts a keep-only-wheels cleanup form. |
| ordvec-python/src/lib.rs | New as_f32_1d/as_f32_2d/as_u32_ids_1d boundary helpers; all embedding entry points and ID-array params switched to `&Bound<PyAny>` and route through them. |
| ordvec-python/tests/test_input_dtype.py | New module documenting the f16/f32/f64-accepted, non-float-rejected, contiguity-before-coercion, finite-after-coercion contract. |
| ordvec-python/tests/test_input_guards.py | New broad integer-dtype coverage for candidate/doc IDs, including overflow/negative/out-of-corpus rejection. |
| ordvec-python/tests/test_redteam_fuzz.py | Remove f16/f32/f64 from rejected-dtype list; add `_NON_INTEGER_ID_DTYPES` and int-dtype-by-value tests. |
| ordvec-python/tests/test_rank.py, test_rank_quant.py, test_bitmap.py, test_sign_bitmap.py | Replace `test_add_float64_is_rejected` with `test_add_float64_is_coerced`. |
| RELEASING.md, THREAT_MODEL.md, CONTRIBUTING.md, cliff.toml | Update release docs to describe the tag-triggered unified pipeline and tag-pattern environment policies. |


Fieldnote-Echo and others added 3 commits May 28, 2026 09:36
…ate publish (grumpy blockers)

Addresses both grumpy blockers on PR #91:

BLOCKER 1: the GitHub Release was un-drafted as part of the asset-attach step,
BEFORE the registry publishes. A failed crates.io / PyPI publish would leave
a public GitHub Release pointing at a version the registries refused —
exactly the half-published "coordinated release" mode RELEASING.md says to
avoid.

  Split `release-assets` into:
    * `release-assets-draft` — uploads to the DRAFT release; DOES NOT un-draft.
    * `publish-github-release` — `needs: [publish-crate, publish-pypi]`; the
      SOLE un-draft point. If either registry publish fails, the Release
      stays DRAFT until the failure is investigated / re-run.
  publish-crate and publish-pypi now `needs: release-assets-draft` (the draft-
  assets edge transitively carries the attest + provenance fail-closed gate).

BLOCKER 2: publish-crate ran `cargo publish` on a fresh checkout — the .crate
it uploaded was not proven to match the SLSA-attested artifact `build-crate`
produced (toolchain drift / non-determinism could quietly diverge).

  Add a byte-identity gate to publish-crate:
    1. Download the attested `dist-crate` artifact.
    2. Re-package with `cargo package -p ordvec --locked`.
    3. sha256-compare the repackaged .crate to the attested .crate.
    4. Only then mint the crates.io OIDC token and `cargo publish`.
  A mismatch fails closed BEFORE the token is minted — nothing reaches
  crates.io. publish-pypi already uploads the exact built wheels/sdist via
  pypa/gh-action-pypi-publish, so it has byte-identity by construction.
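
The comparison in step 3 can be sketched as follows. This is an illustrative Python equivalent, not the workflow step itself (which, per the invariants script, uses `sha256sum`); the function names and the choice of `RuntimeError` are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def require_byte_identity(attested_path, repackaged_path):
    """Fail closed unless both files hash to the same SHA-256 digest.

    The caller should mint the registry token only after this returns.
    """
    attested = sha256_file(attested_path)
    repackaged = sha256_file(repackaged_path)
    if attested != repackaged:
        raise RuntimeError(
            f"byte-identity check failed: attested={attested} repackaged={repackaged}"
        )
    return attested
```

Ordering is the point of the gate: because the check raises before any credential exists, a mismatch cannot result in an upload.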

NEW: tests/release_signed_release_invariants.sh — Grumpy's "anti-Claude
regression guard." Structural lint over release.yml asserting the signed-
release graph: release-assets-draft needs attest+provenance+require-ci-green,
uploads .crate/.whl/.tar.gz/.sigstore.json/.intoto.jsonl, does NOT un-draft;
SLSA generator tag-pinned with upload-assets:false and `*.intoto.jsonl`
provenance-name; attest grants id-token+attestations:write; publish-* grant
id-token:write and need the draft assets; publish-crate does the byte-identity
check (download-artifact dist-crate + cargo package + sha256sum); publish-
github-release needs BOTH publishes and is the sole un-draft point. Wired
into ci.yml's release-guard so a future commit can't silently dismantle the
chain.
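
As a rough illustration of what the dependency-graph part of such a lint asserts, the sketch below checks `needs:` edges on an already-parsed `jobs:` mapping. The real guard is a shell script over release.yml; the function name, the dict-based input, and the message format here are assumptions, and only the edges named in this commit message are encoded.

```python
# Edges taken from the commit message above; everything else is illustrative.
REQUIRED_NEEDS = {
    "release-assets-draft": {"attest", "provenance", "require-ci-green"},
    "publish-crate": {"release-assets-draft"},
    "publish-pypi": {"release-assets-draft"},
    "publish-github-release": {"publish-crate", "publish-pypi"},
}


def missing_edges(jobs):
    """Return one message per required `needs:` edge the job graph lacks.

    `jobs` is assumed to be the workflow's `jobs:` mapping as a dict, e.g.
    {"publish-crate": {"needs": "release-assets-draft"}, ...}.
    """
    problems = []
    for job, required in REQUIRED_NEEDS.items():
        needs = jobs.get(job, {}).get("needs", [])
        if isinstance(needs, str):  # YAML allows a bare string for one dependency
            needs = [needs]
        for dep in sorted(required - set(needs)):
            problems.append(f"{job} must need {dep}")
    return problems
```

An empty result means every required edge is present. A CI job would fail on any non-empty list, which is how a later commit that drops an edge gets caught.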

RELEASING.md flow description updated for the new sequence (stage on DRAFT,
gated publishes, un-draft only after both succeed) + the byte-identity check.

Locally: actionlint clean, zizmor clean ("No findings to report"), both
invariant scripts OK.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…ed (codex stop-gate)

Codex correctly flagged that the previous byte-identity check only proved
`cargo package`'s output matches the attested .crate — but `cargo publish`
runs its OWN internal packaging step before uploading, which the pre-publish
gate cannot inspect. Determinism makes those bytes equal in practice, but
"must be" is not "is."

Add the empirical post-publish proof to publish-crate: after `cargo publish`
succeeds, curl the just-published .crate from
`https://crates.io/api/v1/crates/ordvec/<v>/download` (with a 60s retry
window for CDN propagation) and sha256-compare to the attested artifact.

* If the bytes crates.io serves equal the SLSA-attested bytes -> the
  version on crates.io IS the artifact the provenance covers (the
  byte-identity claim is empirically verified, not just assumed).
* If they differ -> publish-crate fails closed. The version is on
  crates.io (yank-only) but publish-github-release will NEVER un-draft
  the Release, and the mismatch is loudly audit-logged.

The pre-publish gate stays as a fast-fail before the OIDC token is even
minted; the post-publish step is the actual proof. Together they cover
both sides of `cargo publish`'s internal packaging.
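
A minimal sketch of the post-publish step's control flow, assuming a caller-supplied `fetch` callable that returns the downloaded bytes, or `None` while the file is not yet available. The workflow itself uses curl and sha256sum; the function name, the 5-second interval, and the exception types are illustrative assumptions, and only the 60s window comes from the description above.

```python
import hashlib
import time


def verify_published_sketch(fetch, expected_sha256, window_s=60.0, interval_s=5.0,
                            sleep=time.sleep, clock=time.monotonic):
    """Poll `fetch` until it yields bytes, then compare their SHA-256.

    Raises TimeoutError if nothing is served within the window, and
    RuntimeError if the served bytes differ from the attested digest.
    """
    deadline = clock() + window_s
    while True:
        data = fetch()
        if data is not None:
            actual = hashlib.sha256(data).hexdigest()
            if actual != expected_sha256:
                raise RuntimeError(
                    f"published bytes differ: expected={expected_sha256} actual={actual}"
                )
            return actual
        if clock() >= deadline:
            raise TimeoutError("artifact not available within the retry window")
        sleep(interval_s)
```

Retrying applies only to availability. Once bytes are served, a digest mismatch fails immediately rather than being retried, since a wrong artifact does not become right with time.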

Invariants script (release_signed_release_invariants.sh) updated to
require the post-publish curl + sha256 step exists in publish-crate;
RELEASING.md describes both sides of the proof. release.yml header
expanded accordingly.

Locally: actionlint clean, zizmor clean, both invariant scripts OK.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…d-2)

Grumpy's round-2 nits flagged two workflow comments still referencing the
old (pre-split) `release-assets` job name + un-draft timing:

* `notes` job: "draft so artifacts get assembled before release-assets
  un-drafts" -> rewritten to describe the new sequence (release-assets-draft
  stages, publishes run gated, publish-github-release un-drafts only after
  both succeed).
* `provenance` job: "release-assets is the single owner of all Release
  uploads" -> renamed to release-assets-draft.

Pure comment cleanup; no YAML semantic change. actionlint clean, zizmor
clean, both invariant scripts OK.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@project-navi-bot project-navi-bot merged commit 1cb55b1 into main May 28, 2026
32 checks passed
@project-navi-bot project-navi-bot deleted the ci/unified-release-provenance branch May 28, 2026 14:54