audit: add --uri-prefix flag for fleet SARIF aggregation #56

@Fieldnote-Echo

Description

Add --uri-prefix flag for fleet SARIF aggregation

Problem

nboot audit --format sarif emits SARIF results whose
artifactLocation.uri is a path relative to --target:

"artifactLocation": { "uri": ".github/workflows/codeql.yml" }

This is correct for a single-repo audit, but breaks down when
auditing multiple repos and aggregating findings centrally — every
repo's README.md finding has the same URI, making it impossible
to tell which repo owns which finding from the SARIF alone.
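
To make the collision concrete, here is a minimal sketch using hand-written, trimmed-down SARIF results. The rule ID `readme-present` and the repo names are hypothetical; only the `artifactLocation.uri` shape comes from the output above.

```python
from collections import defaultdict

# Hypothetical, trimmed-down SARIF results as two single-repo audits might emit them.
repo_a_results = [{
    "ruleId": "readme-present",  # hypothetical rule ID
    "locations": [{"physicalLocation": {"artifactLocation": {"uri": "README.md"}}}],
}]
repo_b_results = [{
    "ruleId": "readme-present",
    "locations": [{"physicalLocation": {"artifactLocation": {"uri": "README.md"}}}],
}]


def uris(results):
    """Yield every artifactLocation.uri found in a list of SARIF results."""
    for result in results:
        for location in result["locations"]:
            yield location["physicalLocation"]["artifactLocation"]["uri"]


# Group by URI the way a central aggregator would see the findings.
by_uri = defaultdict(list)
for repo, results in (("repo-a", repo_a_results), ("repo-b", repo_b_results)):
    for uri in uris(results):
        by_uri[uri].append(repo)

print(dict(by_uri))  # {'README.md': ['repo-a', 'repo-b']}
```

Both findings land under the same key, so the owning repo is only recoverable from information kept outside the SARIF.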

This blocks the fleet conformance workflow that nboot audit
is positioned for (the headline use case in docs/reference/audit.md):

Fleet surveys — "which of our 100 repos still conform to the
security-scanning pack?"

Proposed fix

Add --uri-prefix STRING to audit_cmd. When set, the prefix is
prepended to every SARIF artifactLocation.uri, joined with a single
/ separator, so findings from different repos stay distinguishable:

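# In a fleet-runner script:
mkdir -p /tmp/aggregate.sarif.d   # make sure the output directory exists
# gh repo list returns only 30 repos by default; raise --limit for larger fleets.
for repo in $(gh repo list myorg --limit 200 --json name -q '.[].name'); do
  gh repo clone "myorg/$repo" "/tmp/$repo"
  nboot audit --spec spec.json --pack security \
    --target "/tmp/$repo" \
    --format sarif \
    --uri-prefix "$repo/" \
    --output "/tmp/aggregate.sarif.d/$repo.sarif"
done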

The aggregated dashboard now distinguishes
repo-a/.github/workflows/codeql.yml from
repo-b/.github/workflows/codeql.yml.
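
For the aggregation step itself, one possible approach is to concatenate the `runs` arrays of the per-repo files into a single SARIF log. The sketch below assumes each file is a SARIF 2.1.0 log with a top-level `runs` array; `merge_sarif` is a hypothetical helper, not part of nboot.

```python
import json
from pathlib import Path


def merge_sarif(paths):
    """Concatenate the `runs` arrays of several SARIF 2.1.0 logs into one log.

    Hypothetical helper: assumes every input file is a SARIF 2.1.0 log.
    """
    merged = {"version": "2.1.0", "runs": []}
    for path in sorted(paths):  # sorted for a stable, reproducible run order
        log = json.loads(Path(path).read_text(encoding="utf-8"))
        merged["runs"].extend(log.get("runs", []))
    return merged
```

Called as `merge_sarif(Path("/tmp/aggregate.sarif.d").glob("*.sarif"))`, this yields one log with one run per repo. The prefixed URIs are what keep the runs distinguishable once they sit side by side.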

Implementation sketch

  1. Add CLI flag in audit_cmd (default empty string → no prefix).
  2. Plumb through to findings_to_sarif() as a kwarg.
  3. Apply the prefix to artifact_uri BEFORE the SarifResult is
    constructed, so the fingerprint hash naturally distinguishes
    findings across repos. Don't apply it only in to_dict(), or
    partialFingerprints.primaryLocationLineHash will collide
    across repos and GitHub will dedupe legitimately different
    findings into one.
  4. Scope: SARIF only. No effect on --format text, no effect
    on nboot diff's unified-diff text output, no effect on
    AuditFinding.message (which is for humans, not URIs).
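
Steps 2 and 3 could look roughly like the sketch below. Everything here is a stand-in: the real `SarifResult` and `findings_to_sarif()` will differ, the `(rule_id, relative_path, message)` tuple shape is assumed, and hashing rule ID plus URI with SHA-256 is only a guess at how the fingerprint is derived. The point it illustrates is ordering: the prefix is applied before the result object exists, so it feeds the hash.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SarifResult:
    """Hypothetical stand-in for the project's SarifResult."""
    rule_id: str
    artifact_uri: str
    message: str
    fingerprint: str = field(init=False)

    def __post_init__(self):
        # Assumed scheme: fingerprint derived from rule ID and (already prefixed) URI.
        digest = hashlib.sha256(f"{self.rule_id}\0{self.artifact_uri}".encode()).hexdigest()
        object.__setattr__(self, "fingerprint", digest[:16])

    def to_dict(self):
        return {
            "ruleId": self.rule_id,
            "message": {"text": self.message},
            "locations": [
                {"physicalLocation": {"artifactLocation": {"uri": self.artifact_uri}}}
            ],
            "partialFingerprints": {"primaryLocationLineHash": self.fingerprint},
        }


def findings_to_sarif(findings, *, uri_prefix=""):
    """Convert (rule_id, relative_path, message) tuples to SARIF result dicts.

    Assumes uri_prefix is either empty or already normalized to end in "/".
    """
    results = []
    for rule_id, rel_path, message in findings:
        # Prefix BEFORE constructing the result so the fingerprint includes it.
        uri = f"{uri_prefix}{rel_path}" if uri_prefix else rel_path
        results.append(SarifResult(rule_id, uri, message).to_dict())
    return results
```

With an empty prefix the URI and fingerprint inputs are exactly the relative path, which is what keeps the unset case identical to today's output.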

Acceptance criteria

  • With --uri-prefix myorg/foo/, every SARIF result's
    artifactLocation.uri starts with myorg/foo/.
  • Trailing-slash handling: missing one is added; double slashes
    are collapsed.
  • Empty/unset --uri-prefix matches today's behaviour exactly
    (no prefix in artifact_uri, fingerprint identical to today).
  • partialFingerprints.primaryLocationLineHash differs for
    same-rule-and-relative-path findings under different prefixes —
    i.e., the prefix is incorporated into the fingerprint hash, not
    added only in the rendered output.
  • --format text output is unchanged regardless of --uri-prefix.
  • nboot diff is untouched (no new flag, no behaviour change).
  • Tests in tests/test_audit.py and tests/test_sarif.py,
    including a fingerprint-collision test (same rule + same
    relative path + different prefix → different fingerprint).
  • Example added to docs/reference/audit.md showing the fleet
    for repo in …; do …; done pattern.
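
The slash-handling criteria could be pinned down with a small helper and pytest-style tests along these lines. `join_uri` is a hypothetical name, and the sketch assumes "double slashes are collapsed" refers to the join between prefix and path, not to slashes inside the prefix.

```python
def join_uri(prefix: str, rel_path: str) -> str:
    """Join prefix and relative path with exactly one "/" at the seam.

    Hypothetical helper. An empty prefix returns rel_path unchanged.
    """
    if not prefix:
        return rel_path
    return prefix.rstrip("/") + "/" + rel_path.lstrip("/")


def test_missing_trailing_slash_is_added():
    assert join_uri("myorg/foo", "README.md") == "myorg/foo/README.md"


def test_double_slash_at_join_is_collapsed():
    assert join_uri("myorg/foo//", "/README.md") == "myorg/foo/README.md"


def test_empty_prefix_is_a_no_op():
    assert join_uri("", ".github/workflows/codeql.yml") == ".github/workflows/codeql.yml"
```

Limiting the collapse to the seam leaves the interior of the prefix untouched, which matters if a prefix ever carries something like a URL scheme.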

Effort

~1 hour including tests and doc.

Source

Bot review on PR #51 (Grippy audit.py:127 MEDIUM).

Metadata

Labels

enhancement
