Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions RELEASES.md
Original file line number Diff line number Diff line change
Expand Up @@ -128,6 +128,14 @@ Branching from `dev` and then `gio trash`-ing the guarded paths seems simpler bu
whenever `dev` and `main` have diverged. The file appears as "added" on both sides with different content. Always branch
from `origin/main` and cherry-pick onto it.

## Spec re-vendoring

The bundle vendors a snapshot of [`agentnative-spec`](https://github.com/brettdavies/agentnative) under `spec/`. When
the spec ships a new tag (e.g., `v0.3.0`), this skill re-vendors via `scripts/sync-spec.sh` on the `release/v<X.Y.Z>`
branch — same commit as the version bump, message `chore(spec): re-vendor spec to <version>`. The script auto-resolves
the latest upstream tag from the remote, so no manual version selection is needed. Without re-vendoring, the bundle
ships stale spec content while consumers see the new version on `anc.dev`.

## Version bump procedure

The version bump and CHANGELOG generation both happen on the `release/v<X.Y.Z>` branch (steps 5–6 of the cherry-pick
Expand Down
26 changes: 25 additions & 1 deletion spec/CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,30 @@
# Changelog

All notable changes to the agent-native CLI standard are documented here.
All notable changes to this repository are documented here — governance, validator, release infrastructure, README, decision records.

Changes to the standard itself — principle MUST/SHOULD/MAY tier moves, requirement IDs added/removed/renamed, applicability shifts — are tracked per-principle in `principles/p*-*.md` via the `last-revised:` calver frontmatter field and the `## Pressure test notes` section appended to each file.

## [0.3.0] - 2026-04-28

### Added

- `active` status value for principles, joining `draft | under-review | locked` as the default state for shipped principles. by @brettdavies in [#12](https://github.com/brettdavies/agentnative/pull/12)
- README `## Status` section and HN-scroller hook explaining the active-with-pressure-tests-welcome posture.

### Changed

- All 7 principles flipped from `status: draft` to `status: active` for the v0.3.0 release. by @brettdavies in [#12](https://github.com/brettdavies/agentnative/pull/12)
- `principles/AGENTS.md` pressure-test protocol updated for the new status lifecycle (`active` is the default; `under-review` is reserved for substantive critique cycles).
- README restructured for HN-visitor flow: new `## The trifecta` callout (spec + linter + leaderboard as equals); `## Quick start` lead with `brew install brettdavies/tap/agentnative`; `## Live leaderboard` preview table; admin sections (Versioning, Decision records, Related, Contributing, License) reordered below spec content. License section tightened from 4 paragraphs to 3 bullets; Contributing tightened from 4-bullet list to 1 paragraph. by @brettdavies in [#14](https://github.com/brettdavies/agentnative/pull/14)
- README leaderboard URLs corrected from bare `anc.dev` to `anc.dev/scorecards`.
- AGENTS.md adds `brettdavies/agentnative-skill` as a documented downstream consumer (introductory list + cross-repo context table); replaces the prior `~/.claude/skills/agent-native-cli/` row with the public `~/dev/agentnative-skill` row.

### Documentation

- G11 red-team pass on all 7 principle files via `compound-engineering:ce-adversarial-document-reviewer`. 25 findings: 11 prose edits applied (P1 TUI parenthetical, P2 sysexits acknowledgment, P4 dependency-gating cleanup, P5 `--dry-run` write-gate + retry hedge, P6 SIGPIPE language-neutral + global-flags behavioral lead, P7 LLM-vs-non-LLM cost generalization), 10 `[later]` notes appended for v0.4.0 follow-up, 2 `[wontfix]` notes, 2 skipped. No requirement IDs added/removed/renamed; no level/applicability changes; no `last-revised:` bumps; no VERSION bump triggered. by @brettdavies in [#13](https://github.com/brettdavies/agentnative/pull/13)
- Three summary-text tightenings (P4 `gating-before-network`, P6 `sigpipe`, P6 `global-flags`) introduce mild registry-readable drift documented in pressure-test notes for v0.4.0 follow-up.

**Full Changelog**: [v0.2.0...v0.3.0](https://github.com/brettdavies/agentnative/compare/v0.2.0...v0.3.0)

## [0.2.0] - 2026-04-23

Expand Down
2 changes: 1 addition & 1 deletion spec/VERSION
Original file line number Diff line number Diff line change
@@ -1 +1 @@
0.2.0
0.3.0
32 changes: 28 additions & 4 deletions spec/principles/p1-non-interactive-by-default.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
id: p1
title: Non-Interactive by Default
last-revised: 2026-04-22
status: draft
status: active
requirements:
- id: p1-must-env-var
level: must
Expand Down Expand Up @@ -64,8 +64,9 @@ agent-tool deadlock.
```

- A `--no-interactive` flag gating every prompt library call (`dialoguer`, `inquire`, `read_line`, `TTY::Prompt`,
`inquirer`, equivalents in other frameworks). When the flag is set, or when stdin is not a TTY, the tool uses
defaults, reads from stdin, or exits with an actionable error. It never blocks.
`inquirer`, equivalents in other frameworks, or any TUI event loop that takes over the terminal). When the flag is
set, or when stdin is not a TTY, the tool uses defaults, reads from stdin, or exits with an actionable error. It never
blocks.
- A headless authentication path if the CLI authenticates. The canonical flag is `--no-browser`, which triggers the
OAuth 2.0 Device Authorization Grant ([RFC 8628](https://www.rfc-editor.org/rfc/rfc8628)): the CLI prints a URL and a
code; the user authorizes on another device. Agents cannot open browsers. Non-canonical alternatives (`--device-code`,
Expand Down Expand Up @@ -103,4 +104,27 @@ agent-tool deadlock.
- OAuth flow that unconditionally opens a browser with no headless escape hatch.

Measured by check IDs `p1-non-interactive` (behavioral) and `p1-non-interactive-source` (source). Run `agentnative check
--principle 1 .` against your CLI to see both.
--principle 1 .` against your CLI to see both.

## Pressure test notes

### 2026-04-27 — Show HN launch red-team pass

Adversarial review via `compound-engineering:ce-adversarial-document-reviewer` ahead of the v0.3.0 launch. Findings
recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol".

- **[edit]** *Internal inconsistency.* "The `--no-interactive` MUST bullet says 'uses defaults, reads from stdin, or
exits with an actionable error' but the principle's behavioral framing (per decision record) covers TUI session init
too. The prose bullet only enumerates prompt libraries (`dialoguer`, `inquire`, `read_line`, `TTY::Prompt`,
`inquirer`), not TUI frameworks (`ratatui`, `bubbletea`) — readers will infer the MUST excludes TUIs, contradicting
the decision record's explicit 'blocking-interactive surface includes... TUI session initialization.'" Resolved: prose
bullet's parenthetical now includes "or any TUI event loop that takes over the terminal." Mirrors the behavioral
framing in [`docs/decisions/p1-behavioral-must.md`](../docs/decisions/p1-behavioral-must.md). No frontmatter change;
the summary already says "gates every prompt library call" and stays.
- **[wontfix]** *Prior art.* "RFC 8628 citation is correct in name but incomplete in framing. The prose says Device
Authorization Grant means 'the CLI prints a URL and a code; the user authorizes on another device' — this still
requires a human on another device, which an unattended agent does not have. An HN commenter will note this is a
*human-assisted* headless path, not an agent-headless path; true unattended agents need service-account / API-token
auth (which the principle doesn't mention)." Rationale: P1 scopes "headless" as "no local browser required," not "no
human anywhere." A human-in-the-loop is acceptable here. Service-account auth is orthogonal and would belong to a
separate principle if it were added.
42 changes: 32 additions & 10 deletions spec/principles/p2-structured-parseable-output.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
id: p2
title: Structured, Parseable Output
last-revised: 2026-04-22
status: draft
status: active
requirements:
- id: p2-must-output-flag
level: must
Expand Down Expand Up @@ -60,13 +60,16 @@ catastrophically later.
must never encounter an interleaved progress message.
- Exit codes are structured and documented:

| Code | Meaning |
| ---: | -------------------------------------------- |
| 0 | Success |
| 1 | General command error |
| 2 | Usage error (bad arguments) |
| 77 | Authentication / permission error |
| 78 | Configuration error |
| Code | Meaning |
| ---: | --------------------------------- |
| 0 | Success |
| 1 | General command error |
| 2 | Usage error (bad arguments) |
| 77 | Authentication / permission error |
| 78 | Configuration error |

These codes blend the bash 0/1/2 convention with BSD `sysexits.h` 77/78 (`EX_NOPERM`, `EX_CONFIG`); the result is the
de-facto agent-facing dialect, not strict `sysexits.h` compliance.

- When `--output json` is active, errors are emitted as JSON (to stderr) with at least `error`, `kind`, and `message`
fields. Plain-text errors in a JSON run break the agent's parser on the only output it was told to expect.
Expand Down Expand Up @@ -98,5 +101,24 @@ catastrophically later.
- `process::exit()` in library code, bypassing structured error propagation.
- Human-formatted tables as the only output mode with no JSON alternative.

Measured by check IDs `p2-output-json`, `p2-output-format`, `p2-stderr-diagnostics`. Run
`agentnative check --principle 2 .` against your CLI to see each.
Measured by check IDs `p2-output-json`, `p2-output-format`, `p2-stderr-diagnostics`. Run `agentnative check --principle
2 .` against your CLI to see each.

## Pressure test notes

### 2026-04-27 — Show HN launch red-team pass

Adversarial review via `compound-engineering:ce-adversarial-document-reviewer` ahead of the v0.3.0 launch. Findings
recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol".

- **[edit]** *Prior art.* "The exit-code table conflicts with `sysexits.h`. `EX_NOPERM=77` is 'permission denied'
(close), but `EX_CONFIG=78` is correct. However, `sysexits.h` reserves `EX_USAGE=64`, `EX_DATAERR=65`,
`EX_NOINPUT=66`, `EX_UNAVAILABLE=69`, `EX_SOFTWARE=70` — P2 puts 'usage error' at 2 (bash convention), not 64. HN will
note the principle straddles two conventions (bash 0/1/2 + sysexits 77/78) without naming the hybrid." Resolved: added
one sentence under the exit-code table acknowledging the bash + `sysexits.h` blend. The same citation now appears in
P4's exit-code table (per Row #13 of the same review pass) so both files agree.
- **[later]** *Must-vs-should.* "A single-number-emitting CLI (e.g., `epoch`, `uuidgen`) plausibly violates the
`--output text|json|jsonl` MUST for a defensible reason. Universal applicability is a strong claim." Deferred: revisit
whether `applicability` should soften when the launch landscape clarifies actual single-number agent-facing CLIs. The
applicability change would fire coupled-release (CLI registry impact), so it is held for a v0.4.0 cleanup PR rather
than churned during launch week.
31 changes: 28 additions & 3 deletions spec/principles/p3-progressive-help-discovery.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
id: p3
title: Progressive Help Discovery
last-revised: 2026-04-22
status: draft
status: active
requirements:
- id: p3-must-subcommand-examples
level: must
Expand Down Expand Up @@ -77,5 +77,30 @@ trial-and-errors its way into a working call, burning tokens and sometimes landi
- Examples buried in a README or man page but absent from `--help` output.
- `after_help` text that describes the flags in prose instead of demonstrating them in code.

Measured by check IDs `p3-help`, `p3-after-help`, `p3-version`. Run `agentnative check --principle 3 .` against
your CLI to see each.
Measured by check IDs `p3-help`, `p3-after-help`, `p3-version`. Run `agentnative check --principle 3 .` against your CLI
to see each.

## Pressure test notes

### 2026-04-27 — Show HN launch red-team pass

Adversarial review via `compound-engineering:ce-adversarial-document-reviewer` ahead of the v0.3.0 launch. Findings
recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol".

- **[later]** *Internal inconsistency.* "The SHOULD on paired examples ('text then `--output json` equivalent')
implicitly gates P3 conformance on a `--output json` mode, which is P2's territory. A CLI without structured output
cannot satisfy this SHOULD even in spirit, yet `applicability: universal`." Deferred: narrowing `applicability` from
`universal` to conditional (`if: CLI exposes a structured-output mode`) fires the coupled-release norm (CLI registry
parses `applicability`). Bundled with other applicability cleanups for a v0.4.0 PR with explicit registry
coordination.
- **[later]** *Must-vs-should.* "'Top-level command ships 2–3 examples' as a universal MUST is too strong for genuinely
single-purpose CLIs (e.g., `cat`, `true`, a one-shot wrapper) where one canonical invocation is the entire surface.
The '2–3' count baked into a MUST will draw HN fire as cargo-culted." Deferred: softening to "at least one example,
and 2–3 when the tool has multiple primary use cases" is a MUST-content change that drifts the frontmatter summary.
Bundled with other MUST-content softenings for a v0.4.0 PR.
- **[later]** *Prior art.* "Principle is clap-flavored throughout (`after_help`, `about`/`long_about`, `///` doc
comments in Anti-Patterns) without a single sentence acknowledging the non-Rust analog (docopt usage block, `argparse`
epilog, `cobra` Example field, `gh`/`kubectl` Examples convention). HN will call this 'a clap style guide, not a CLI
standard.'" Deferred: a cross-framework analog appendix is a meaningful addition. The Definition / Why-Agents-Need-It
sections are framework-agnostic; the Evidence section is intentionally clap-keyed. Worth revisiting in v0.4.0 once the
standard's multi-language reach is clearer; site copy may also be a better home than the principle file itself.
65 changes: 51 additions & 14 deletions spec/principles/p4-fail-fast-actionable-errors.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
id: p4
title: Fail Fast with Actionable Errors
last-revised: 2026-04-22
status: draft
status: active
requirements:
- id: p4-must-try-parse
level: must
Expand All @@ -24,7 +24,7 @@ requirements:
level: should
applicability:
if: CLI makes network calls
summary: Config and auth validation happen before any network call (three-tier dependency gating).
summary: Config and auth validation happen before any network call, failing at the earliest possible point.
- id: p4-should-json-error-output
level: should
applicability: universal
Expand Down Expand Up @@ -58,15 +58,19 @@ OAuth or asks the user to check their config file. Getting that wrong wastes ent
let cli = Cli::try_parse()?;
```

- Error types map to distinct exit codes. At minimum:
- Error types map to distinct exit codes. Use 77 when the CLI has an auth surface and 78 when it has a config surface;
0/1/2 are universal:

| Code | Meaning |
| ---: | ----------------------------- |
| 0 | Success |
| 1 | General command error |
| 2 | Usage / argument error |
| 77 | Auth / permission error |
| 78 | Configuration error |
| Code | Meaning |
| ---: | ----------------------- |
| 0 | Success |
| 1 | General command error |
| 2 | Usage / argument error |
| 77 | Auth / permission error |
| 78 | Configuration error |

These codes blend the bash 0/1/2 convention with BSD `sysexits.h` 77/78 (`EX_NOPERM`, `EX_CONFIG`); the result is the
de-facto agent-facing dialect, not strict `sysexits.h` compliance.

- Every error message contains **what failed**, **why**, and **what to do next**. Example:

Expand All @@ -79,8 +83,9 @@ OAuth or asks the user to check their config file. Getting that wrong wastes ent

- Error types use a structured enum (via `thiserror` in Rust) with variant-to-kind mapping for JSON serialization.
Agents match on error kinds programmatically rather than parsing message text.
- Config and auth validation happen before any network call. A three-tier dependency gating pattern (meta-commands,
local-only commands, network commands) fails at the earliest possible point.
- Config and auth validation happen before any network call, failing at the earliest possible point. The structural
three-tier definition (meta-commands, local-only commands, network commands) lives in P6 (`p6-should-tier-gating`);
this requirement specifies the network-call ordering consequence.
- Error output respects `--output json`: JSON-formatted errors go to stderr when JSON output is selected.

## Evidence
Expand All @@ -100,5 +105,37 @@ OAuth or asks the user to check their config file. Getting that wrong wastes ent
- Error messages that state the symptom without the cause or fix ("Error: request failed").
- Panics (`unwrap()`, `expect()`) on recoverable errors in production code paths.

Measured by check IDs `p4-bad-args`, `p4-process-exit`, `p4-unwrap`, `p4-exit-codes`. Run
`agentnative check --principle 4 .` against your CLI to see each.
Measured by check IDs `p4-bad-args`, `p4-process-exit`, `p4-unwrap`, `p4-exit-codes`. Run `agentnative check --principle
4 .` against your CLI to see each.

## Pressure test notes

### 2026-04-27 — Show HN launch red-team pass

Adversarial review via `compound-engineering:ce-adversarial-document-reviewer` ahead of the v0.3.0 launch. Findings
recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol".

- **[edit]** *Internal inconsistency.* "Three-tier gating is labeled identically as a SHOULD in both P4
(`p4-should-gating-before-network`) and P6 (`p6-should-tier-gating`) — same pattern, two homes, no cross-reference.
Readers can't tell which is canonical, and a CLI that satisfies one auto-satisfies the other." Resolved: P4's bullet
now focuses on the network-call ordering consequence and points to P6 as the canonical home of the structural
three-tier definition. Frontmatter summary tightened to match. Requirement ID is unchanged so CLI registry pinning is
unaffected.
- **[edit]** *Must-vs-should.* "`p4-must-exit-code-mapping` is `applicability: universal` and the prose says 'At
minimum' 0/1/2/77/78 — but a CLI with no auth surface and no config file legitimately has nothing to assign to either
77 or 78, and the MUST forces empty-by-construction error variants. Same shape as P6, which correctly gates
`p6-must-timeout-network` behind `if: CLI makes network calls`." Resolved: prose now reads "Use 77 when the CLI has an
auth surface and 78 when it has a config surface; 0/1/2 are universal." Frontmatter summary stays universal because
the *mapping discipline* is universal even if the specific 77/78 codes are conditional. The summary-prose drift is a
known launch-week tradeoff; full alignment of the summary text is on the v0.4.0 punch list.

- **[edit]** *Prior art.* "77/78 align with BSD `sysexits.h` (`EX_NOPERM`, `EX_CONFIG`) — the alignment is a strength
but neither P2 nor P4 cites BSD sysexits, leaving an HN commenter to 'discover' it as a gotcha." Resolved: added a
one-liner under the P4 exit-code table acknowledging the `sysexits.h` alignment. Same sentence added to P2's exit-code
table for consistency.
- **[later]** *Must-vs-should.* "`p4-must-try-parse` names a clap-specific Rust API in a `applicability: universal`
MUST. A Go/Python/Node CLI has no `try_parse()`. The underlying requirement — 'argument-parse failures route through
the same error/output formatter as runtime errors, not a library-internal `process::exit()`' — is universal; the API
name is not." Deferred: language-neutralizing the bullet ("Argument parsing returns a structured error rather than
calling `process::exit()` internally; in Rust+clap, this means `try_parse()` not `parse()`") drifts the frontmatter
summary. Bundled with P6's SIGPIPE and `global = true` rewrites for a coordinated v0.4.0 language-neutralization PR.
Loading