Robust output-file write for nboot audit --output
Problem
audit_cmd writes the rendered report to --output via a single
output.write_text(rendered) call with no error handling and no
atomic-write semantics. Failure modes that surface as raw tracebacks
in CI:
- PermissionError: read-only mount, restricted directory
- OSError(ENOSPC): disk full
- OSError(EROFS): read-only filesystem (containers)
- Process killed mid-write: leaves a partial / truncated file that
  downstream consumers (upload-sarif action, log aggregators) try
  to parse and fail on
Combined with issue #1 (directory rejection), these are the most
common real-world --output failure modes.
Proposed fix
Wrap the write in:
- Atomic write: write to a sibling temp file in the same directory
  (so the rename is on the same filesystem), then os.replace.
- Try/except OSError: catch and re-raise as click.ClickException
  with a clean message naming the destination and the underlying
  errno reason.
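One way the "clean message" from the second bullet could be built is sketched below. The helper names describe_os_error and format_write_failure are hypothetical, and appending the symbolic errno name (e.g. ENOSPC) is an assumption that goes slightly beyond the plain strerror text; treat it as an illustration rather than the required format.

```python
import errno


def describe_os_error(e: OSError) -> str:
    """Return a short reason such as 'No space left on device (ENOSPC)'.

    Hypothetical helper: falls back to str(e) when strerror is unset.
    """
    reason = e.strerror or str(e)
    # errno.errorcode maps numeric codes to symbolic names; unknown or
    # missing codes simply yield None here.
    name = errno.errorcode.get(e.errno)
    return f"{reason} ({name})" if name else reason


def format_write_failure(output: str, e: OSError) -> str:
    """Build the one-line message naming the destination and the reason."""
    return f"Failed to write audit report to {output}: {describe_os_error(e)}"
```

For a disk-full error this would yield a single line naming both the path and the reason, which is the shape a ClickException message needs.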
Sketch (note: tmp_path is initialised to None before the
try so the cleanup path is unconditional and never tries to
reference an unbound name if tempfile.NamedTemporaryFile itself
raises):
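import os
import tempfile
from pathlib import Path

import click

# Initialised before the try so the finally block can always test it.
tmp_path: Path | None = None
try:
    out_dir = output.parent
    out_dir.mkdir(parents=True, exist_ok=True)
    # Sibling temp file: same directory, hence same filesystem as `output`.
    with tempfile.NamedTemporaryFile(
        mode="w", dir=out_dir, delete=False, prefix=output.name + ".",
        suffix=".tmp",
    ) as tmp:
        tmp_path = Path(tmp.name)
        tmp.write(rendered)
    # The temp file is closed (and flushed) before it is swapped into place.
    os.replace(tmp_path, output)
    tmp_path = None  # ownership transferred to `output`
except OSError as e:
    raise click.ClickException(
        f"Failed to write audit report to {output}: {e.strerror or e}"
    ) from e
finally:
    # Remove the temp file on any failure path; skipped after a successful replace.
    if tmp_path is not None:
        tmp_path.unlink(missing_ok=True)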
(os.fsync is intentionally NOT used — durability beyond rename
isn't part of the audit's contract; per-page fsync would only
matter if we promised "report survives a crash before next boot",
which we don't.)
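To check that the sketch leaves no temp file behind when the rename step fails, a self-contained variant can be exercised with an injected os.replace failure. This is only an illustration under stated assumptions: write_report_atomically and demo are hypothetical names, and ReportWriteError stands in for click.ClickException so the example runs without click installed.

```python
import errno
import os
import tempfile
from pathlib import Path
from unittest import mock


class ReportWriteError(Exception):
    """Hypothetical stand-in for click.ClickException."""


def write_report_atomically(output: Path, rendered: str) -> None:
    """Write `rendered` to `output` via a sibling temp file and os.replace."""
    tmp_path = None
    try:
        output.parent.mkdir(parents=True, exist_ok=True)
        with tempfile.NamedTemporaryFile(
            mode="w", dir=output.parent, delete=False,
            prefix=output.name + ".", suffix=".tmp",
        ) as tmp:
            tmp_path = Path(tmp.name)
            tmp.write(rendered)
        os.replace(tmp_path, output)
        tmp_path = None  # ownership transferred to `output`
    except OSError as e:
        raise ReportWriteError(
            f"Failed to write audit report to {output}: {e.strerror or e}"
        ) from e
    finally:
        if tmp_path is not None:
            tmp_path.unlink(missing_ok=True)


def demo() -> dict:
    """Run the success path, then a failed os.replace, and report what is on disk."""
    with tempfile.TemporaryDirectory() as d:
        out = Path(d) / "report.sarif"
        write_report_atomically(out, "first")
        # Inject a failure at the rename step only.
        failure = OSError(errno.EXDEV, "Invalid cross-device link")
        with mock.patch("os.replace", side_effect=failure):
            try:
                write_report_atomically(out, "second")
                raised = False
            except ReportWriteError:
                raised = True
        return {
            "raised": raised,
            "content": out.read_text(),
            "leftovers": sorted(p.name for p in Path(d).glob("*.tmp")),
        }
```

After the injected failure, the destination should still hold the earlier report and the directory should contain no *.tmp files. The same patching approach could target tempfile.NamedTemporaryFile or the write call to cover the other failure points.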
Acceptance criteria
- A write failure on --output produces a one-line ClickException
  (not a traceback).
- --output <missing-dir>/report.sarif either creates the
  parent directory (via mkdir(parents=True, exist_ok=True))
  or, if creation itself fails, emits a clean ClickException
  naming the missing parent.
- If os.replace() itself raises (rare; cross-device or
  destination-locked scenarios), the temp file is unlinked
  and a clean ClickException is raised: no .tmp debris
  and no partial overwrite.
- The destination file is either fully-written or untouched
  (atomic via os.replace).
- No *.tmp files remain in the destination directory on
  any path (success path or any OSError path).
- Tests in tests/test_audit.py cover: OSError on
  tempfile.NamedTemporaryFile, OSError on tmp.write,
  OSError on os.replace, and the success path; each asserts
  no *.tmp remains.
Effort
~30-45 minutes including tests.
Source
Bot review on PR #51 (Grippy cli.py:379 MEDIUM, cli.py:422 LOW).