outofrange-consulting · Jun 25, 2026
diff --git a/.github/workflows/installers.yml b/.github/workflows/installers.yml
@@ -84,6 +84,17 @@ jobs:
           done
           exit $rc
 
+  # 1d) Framework-compliance checks for the dev-team plugin: skill:// anchor
+  #     resolution, index.json file integrity, review-agent output-discipline
+  #     wiring, and the test-after stance (no removed TDD identifiers).
+  compliance:
+    name: dev-team framework compliance
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Validate framework compliance
+        run: node scripts/ci-framework-compliance.mjs
+
   # 2) Real end-to-end install on every OS (bootstraps bun/omp/plugins, then
   #    verifies OMP launches and lists all plugins).
   e2e:

diff --git a/README.fr.md b/README.fr.md
@@ -8,7 +8,7 @@ met en place OMP et vous guide à travers chacun d'eux.
 
 | Plugin | Rôle |
 |---|---|
-| **[`dev-team`](plugins/dev-team/)** | **Équipe de dev agentique** — un orchestrateur + 32 agents spécialistes/critiques, le workflow `/specs` → `/plan` → `/build` → `/pr`, **TDD strict** et points de contrôle humains, ~78 skills, et des extensions « garde-fou » bloquantes. Portage de [bdfinst/agentic-dev-team](https://github.com/bdfinst/agentic-dev-team) (Bryan Finster). Tiers 100 % cloud ; gardez le tier « small » à haut volume bon marché. |
+| **[`dev-team`](plugins/dev-team/)** | **Équipe de dev agentique** — un orchestrateur + 32 agents spécialistes/critiques, le workflow `/specs` → `/plan` → `/build` → `/pr`, un **plan gate strict** (test-after, tests requis) et points de contrôle humains, ~78 skills, et des extensions « garde-fou » bloquantes. Portage de [bdfinst/agentic-dev-team](https://github.com/bdfinst/agentic-dev-team) (Bryan Finster). Tiers 100 % cloud ; gardez le tier « small » à haut volume bon marché. |
 | **[`copilot-preset`](plugins/copilot-preset/)** | **Préréglage modèles GitHub Copilot** — route OMP (et les tiers de dev-team) via `github-copilot` pour tourner sur une licence Copilot. Config seulement : mapping tier→modèle, comparatif tarifaire (crédits IA post-juin 2026), et MAI-Code-1-Flash câblé. |
 | **[`token-diet`](plugins/token-diet/)** | **Réduction agressive des tokens** — ctx-wire (compression transparente de la sortie des commandes + scrub des secrets), CodeGraph (requêtes de graphe de symboles via MCP au lieu de grep+read), un skill « caveman » de sortie laconique, et un skill « yagni » de code minimal — par-dessus la compaction/`astGrep` natives d'OMP. |
 | **[`azure-devops-fs`](plugins/azure-devops-fs/)** | **Azure DevOps comme un système de fichiers** — lecture repos/fichiers/PR/diffs via URIs `ado://` (paginé), **gates/policies** de PR + CI (builds/logs/run), création/checkout/push/complete de PR, commentaires/votes. Propulsé par l'**Azure CLI** (`az` + extension azure-devops), auth PAT, cache SQLite ; fonctionne derrière les proxys TLS d'entreprise. |

diff --git a/README.md b/README.md
@@ -8,7 +8,7 @@ you through them.
 
 | Plugin | What it does |
 |---|---|
-| **[`dev-team`](plugins/dev-team/)** | **Agentic dev team** — orchestrator + 32 specialist/critic agents, the `/specs` → `/plan` → `/build` → `/pr` workflow, **strict TDD** and human gates, ~78 skills, and blocking guard extensions. Port of [bdfinst/agentic-dev-team](https://github.com/bdfinst/agentic-dev-team) (Bryan Finster). All-cloud tiers; keep the high-volume small tier cheap. |
+| **[`dev-team`](plugins/dev-team/)** | **Agentic dev team** — orchestrator + 32 specialist/critic agents, the `/specs` → `/plan` → `/build` → `/pr` workflow, a **forced plan gate** (test-after, tests required) and human gates, ~78 skills, and blocking guard extensions. Port of [bdfinst/agentic-dev-team](https://github.com/bdfinst/agentic-dev-team) (Bryan Finster). All-cloud tiers; keep the high-volume small tier cheap. |
 | **[`copilot-preset`](plugins/copilot-preset/)** | **GitHub Copilot model preset** — route OMP (and the dev-team tiers) through `github-copilot` to run on a Copilot license. Config-only: tier→model mapping, post-June-2026 AI-credit pricing comparison, and MAI-Code-1-Flash wired in. |
 | **[`token-diet`](plugins/token-diet/)** | **Aggressive token reduction** — ctx-wire (transparent command-output compression + secret scrub), CodeGraph (MCP symbol/call-graph queries instead of grep+read), a caveman terse-output skill, and a yagni minimal-code skill — layered on OMP's native compaction/`astGrep`. |
 | **[`azure-devops-fs`](plugins/azure-devops-fs/)** | **Azure DevOps as a filesystem** — read repos/files/PRs/diffs via `ado://` URIs (paginated), PR **gates/policies** + CI (builds/logs/run), create/checkout/push/complete PRs, comment/vote. Backed by the **Azure CLI** (`az` + the azure-devops extension), PAT auth, SQLite read cache; works behind corporate TLS proxies. |

diff --git a/REVIEW.md b/REVIEW.md
@@ -24,7 +24,7 @@ cohérence docs/registres — pas des cassures de build.**
 
 ### 1. Les « guards » de sécurité sont du théâtre de sécurité (le plus important)
 
-Les 6 guards de `dev-team` (destructive, path, freeze, tdd, review-gate,
+Les 6 guards de `dev-team` (destructive, path, freeze, spec, review-gate,
 careful) sont des hooks *PreToolUse consultatifs* basés sur du matching de
 sous-chaîne / glob. Ils donnent une **fausse confiance** : ils sont
 contournables trivialement, par accident comme volontairement.
@@ -34,7 +34,7 @@ contournables trivialement, par accident comme volontairement.
 | destructive-guard | `rm -rf`, drop/truncate, force-push, kill… | `find -delete`, `git clean -fdx`, `> f`, `truncate -s0`, `bash -c …`, obfuscation par variable ; **warn-only hors `/careful on`** ; la SAFE-list court-circuite tout (`rm -rf node_modules/../../etc`) | Très faible |
 | path-guard | édition de `.env`/`*.pem`/`*.key`/`id_rsa`… | aucune couverture `bash` (`tee`/`>`/`sed -i`) ; regex **sensible à la casse** → `ID_RSA`/`.PEM` passent ; lectures non gardées ; pass silencieux si le shape d'edit ne matche pas | Faible |
 | freeze-guard | écriture sur globs gelés | **aucune branche `bash`** ; écraser `.omp/state/freeze.json` suffit | Faible |
-| tdd-guard (.feature) | modif de specs BDD | `BASH_WRITE_RE` étroit (rate `python -c`/`ed`/heredoc) ; opt-out `OMP_ALLOW_FEATURE_EDITS=1` | Faible–moyenne |
+| spec-guard (.feature) | modif de specs BDD | `BASH_WRITE_RE` étroit (rate `python -c`/`ed`/heredoc) ; opt-out `OMP_ALLOW_FEATURE_EDITS=1` | Faible–moyenne |
 | review-gate | `git commit` avant approve | `--no-verify` **explicitement autorisé** ; `bash -c 'git commit'` ; le `--no-verify` est détecté par `includes` donc un message de commit le déclenche | Faible |
 | careful-mode | active le blocage | fichier d'état modifiable par l'agent ; **OFF par défaut** | Faible |
 

diff --git a/docs/upstream-v7.7-7.9-extraction.md b/docs/upstream-v7.7-7.9-extraction.md
@@ -0,0 +1,105 @@
+# Extraction from upstream agentic-dev-team (v7.7–v7.9)
+
+Survey of `bdfinst/agentic-dev-team` releases since our v7.6 extraction, and what
+we pulled into our OMP port — filtered through omp-dev-team's choices:
+**test-after with refactoring (no TDD), quality first, cost efficiency.**
+
+## Upstream evolution since v7.6
+
+| Release | Highlight |
+|---|---|
+| v7.7.0 | harness fixes from session-review/audit; **closed learning loop**; **when-tdd-pays** experiment fixtures; ambiguity-resolution protocol for `/specs` + `/ship` gate |
+| v7.8.0 | **craftsmanship-axis review rules** — use-the-platform, comment hygiene; named shipped AC references |
+| v7.9.0 | **deterministic status + finding grouping** for doc/naming review agents |
+
+## Extracted (respecting omp choices)
+
+1. **Deterministic status + finding grouping (v7.9).** New shared knowledge file
+   `skills/dev-team-knowledge/review-output-discipline.md`:
+   - **Deterministic status** — an agent's `status` is a pure function of the
+     highest-severity finding, never of volume.
+   - **Finding grouping** — Enumerate → Classify → Group; consolidate same-kind
+     findings into ~3–5 concept-level findings per file; keep `error` findings
+     individual.
+
+   Wired as a one-line anchored reference into **all 17** finding-emitting review
+   agents (upstream changed only doc/naming — we factored it into one shared file
+   instead of copy-pasting; this is the DRY, cost-efficient win and propagates
+   determinism + token savings to every review). Added to `index.json`.
+
+2. **Comment hygiene (v7.8) → `doc-review`.** Tracker-ID references in shipped
+   comments (`JIRA-123`, `#456`), detached/orphaned doc comments; **capped at
+   `suggestion`** (never raise status above `warn`); durable external standards
+   (`RFC-2119`, `ISO-4217`, CVE) are explicitly not flagged.
+
+3. **Use-the-platform (v7.8) → `refactor-opportunity-review`.** Reinvented
+   built-ins (`min`/`max`/`sum`/`clamp`/`copy`), reinvented helpers, open-coded
+   idioms repeated 3+ times — mapped by concept, honoring language **and version**
+   (e.g. Go <1.21 has no builtin `min`/`max`). Framing de-TDD'd to test-after.
+
+4. **Closed learning loop (v7.7) → `feedback-learning` skill.** We already had
+   post-task reflection + recurring-correction detection (3+); the missing
+   "closed" half was a persistent queue + disposition. Added a
+   `metrics/pending-review.jsonl` queue (system proposals enqueued, never
+   dropped) and a **session-review** disposition flow (`review` keyword) that
+   previews, then approves (apply + log + stamp `approved`) or rejects (stamp
+   `rejected`). Project-local only — plugin-cache-safe. Asynchronous/batched by
+   design, which keeps it cheap.
+
+## Test-after reinforcement (omp north star)
+
+Beyond extraction, a pass to make **test-after with refactoring** explicit and to
+remove residual TDD framing the earlier plan-gate-over-tdd move had left behind:
+
+- **Refactor after green, every step** — promoted from "optional" to a deliberate
+  always-taken pass (the `refactor-opportunity-review` lens) in `skills/build`,
+  `prompts/implementer.md`, and the orchestrator's Phase 3. Changes are made only
+  when there's a real opportunity, but the pass is always taken — the *refactoring*
+  half of test-after-with-refactoring.
+- **Residual TDD traces reframed** to test-after: `triage` skill + command (RED/GREEN
+  fix plan → regression-test + fix + refactor), `qa-engineer` (ATDD → acceptance
+  scenarios; unit tests follow, test-after), `plan` (TDD step/traceability →
+  build step / step-to-scenario), `progress-guardian` (flagged "tests not written
+  first" → flags missing tests, order-agnostic), `mutation-testing` /
+  `quality-gate-pipeline` / `init-dev-team` (RED-GREEN labels dropped, semantics
+  kept), `test-design-reviewer` ("First/written-first" rubric → "Timely/ships with
+  the implementation"), plus the root `README`/`README.fr` ("strict TDD" → forced
+  plan gate, test-after) and `REVIEW.md` (stale `tdd-guard` → `spec-guard`).
+
+## Deliberately NOT extracted (with rationale)
+
+- **when-tdd-pays experiment fixtures (v7.7).** Upstream is re-litigating where
+  test-first pays. omp-dev-team has already made the call (test-after + plan gate);
+  importing TDD experiment fixtures would reintroduce exactly what we removed.
+- **`/ship` gate + ambiguity-resolution protocol (v7.7).** The ambiguity protocol
+  is reasonable, but it is wired to a `/ship` command we don't have; our `/specs`
+  already runs a consistency gate. Candidate for a focused follow-up if a gap shows.
+- **Bibliographic TDD citations kept as-is.** e.g. `testability-patterns.md` cites
+  *Growing Object-Oriented Software, Guided by Tests* ("outside-in TDD") — that is
+  the book's actual subject; rewriting a citation would misrepresent the source.
+
+## Verified
+
+`ci-validate-json` 23/23 · all 10 dev-team extensions compile · unit suite green ·
+both `review-output-discipline` anchors (`#deterministic-status`,
+`#finding-grouping`) resolve from all 17 wired agents · no prescriptive TDD /
+test-first / RED-GREEN traces remain outside historical `docs/` and the one book
+citation.
+
+## CI: framework-compliance checks
+
+What earlier extraction PRs verified by hand is now enforced by CI —
+`scripts/ci-framework-compliance.mjs` (pure Node, new `compliance` job):
+
+- **Anchor resolution** — every `skill://dev-team-knowledge/<file>.md#<anchor>`
+  reference resolves (file exists; anchor matches a heading slug or a registered
+  `index.json` anchor). Catches exactly the rename/typo failure mode this work
+  risked across 17 agents.
+- **index.json integrity** — every keyed file exists.
+- **Review-agent wiring** — every finding-emitting agent references
+  `review-output-discipline.md` (allowlist: `progress-guardian`).
+- **Test-after stance** — the deliberately-removed TDD identifiers (`tdd-first`,
+  `tdd-guard`, `test-driven-development`, `RED-GREEN`) can't creep back in,
+  outside a small rationale/historical allowlist.
+
+Currently: 26 anchor refs + 39 index files checked — 0 violations.
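
Reviewer note: the anchor-resolution check described in the new doc can be pictured roughly as follows. This is a minimal sketch, not the contents of `scripts/ci-framework-compliance.mjs`, which this diff does not show. The function names, the in-memory input shapes, the GitHub-style slug rule, and the shape of the registered-anchor map are all assumptions made for illustration.

```javascript
// Hypothetical sketch of an anchor-resolution check (not the real CI script).
// Matches references of the form skill://dev-team-knowledge/<file>.md#<anchor>.
const REF_RE = /skill:\/\/dev-team-knowledge\/([\w.-]+\.md)#([\w-]+)/g;

// Assumed slug rule: lowercase, drop punctuation, collapse whitespace to hyphens.
function slugify(heading) {
  return heading
    .trim()
    .toLowerCase()
    .replace(/[^\w\s-]/g, "")
    .replace(/\s+/g, "-");
}

// Collect the slug of every Markdown heading in one knowledge file.
function headingSlugs(markdown) {
  const slugs = new Set();
  for (const line of markdown.split("\n")) {
    const m = /^#{1,6}\s+(.+)$/.exec(line);
    if (m) slugs.add(slugify(m[1]));
  }
  return slugs;
}

// agents:     { [agentPath]: fileText }
// knowledge:  { [fileName]: markdownText }
// registered: { [fileName]: string[] } extra anchors, e.g. from index.json (shape assumed)
function findBrokenRefs(agents, knowledge, registered = {}) {
  const violations = [];
  for (const [agentPath, text] of Object.entries(agents)) {
    for (const [, file, anchor] of text.matchAll(REF_RE)) {
      if (typeof knowledge[file] !== "string") {
        violations.push(`${agentPath}: missing file ${file}`);
        continue;
      }
      const known = headingSlugs(knowledge[file]);
      const extra = registered[file] ?? [];
      if (!known.has(anchor) && !extra.includes(anchor)) {
        violations.push(`${agentPath}: unresolved anchor ${file}#${anchor}`);
      }
    }
  }
  return violations;
}
```

A real check would read the agent and knowledge files from disk and exit non-zero when the returned list is non-empty. Keeping the core as a pure function over in-memory text makes the rename/typo failure mode easy to unit test.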
diff --git a/plugins/dev-team/.claude-plugin/plugin.json b/plugins/dev-team/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "dev-team",
-  "version": "1.2.0",
+  "version": "1.3.0",
   "description": "Agentic dev team for Oh-My-Pi (ported from bdfinst/agentic-dev-team): orchestrator + 32 specialist/critic agents, the /specs -> /plan -> /build -> /pr workflow, a forced plan gate (scope -> plan -> build -> review) with tests required, human gates, and blocking guard extensions.",
   "author": { "name": "outofrange-consulting" },
   "license": "MIT",

diff --git a/plugins/dev-team/agents/a11y-review.md b/plugins/dev-team/agents/a11y-review.md
@@ -86,6 +86,10 @@ Focus management:
 
 Code style, naming, test coverage, performance (handled by other agents)
 
+## Output discipline
+
+Derive `status` from the highest-severity finding, never from volume (`skill://dev-team-knowledge/review-output-discipline.md#deterministic-status`), and group same-kind findings — enumerate → classify → group — into ~3–5 concept-level findings per file, keeping `error` findings individual (`skill://dev-team-knowledge/review-output-discipline.md#finding-grouping`).
+
 ## Self-Challenge
 
 After producing findings, run the adversarial challenge pass from `skill://dev-team-knowledge/adversarial-review-protocol.md#a11y-review` (the shared challenger loop + the a11y-review challenge questions; ≤3 rounds). Append a confidence level (High/Medium/Low) to the `summary` field.
diff --git a/plugins/dev-team/agents/arch-review.md b/plugins/dev-team/agents/arch-review.md
@@ -95,6 +95,10 @@ Grep for patterns that architecture documentation explicitly bans:
 - Direct `fetch`/`axios`/`HttpClient` calls outside designated HTTP adapter layer
 - Direct DB client calls outside designated repository layer
 
+## Output discipline
+
+Derive `status` from the highest-severity finding, never from volume (`skill://dev-team-knowledge/review-output-discipline.md#deterministic-status`), and group same-kind findings — enumerate → classify → group — into ~3–5 concept-level findings per file, keeping `error` findings individual (`skill://dev-team-knowledge/review-output-discipline.md#finding-grouping`).
+
 ## Self-Challenge
 
 After producing findings, run the adversarial challenge pass from `skill://dev-team-knowledge/adversarial-review-protocol.md#arch-review` (arch-review challenge questions). Append confidence level (High/Medium/Low) to the `summary` field.

diff --git a/plugins/dev-team/agents/complexity-review.md b/plugins/dev-team/agents/complexity-review.md
@@ -67,6 +67,10 @@ Cognitive load:
 - Too many concepts per function
 - Non-obvious control flow
 
+## Output discipline
+
+Derive `status` from the highest-severity finding, never from volume (`skill://dev-team-knowledge/review-output-discipline.md#deterministic-status`), and group same-kind findings — enumerate → classify → group — into ~3–5 concept-level findings per file, keeping `error` findings individual (`skill://dev-team-knowledge/review-output-discipline.md#finding-grouping`).
+
 ## Self-Challenge
 
 After producing findings, run the adversarial challenge pass from `skill://dev-team-knowledge/adversarial-review-protocol.md#structure-review`. Use the structure-review challenge questions (the nearest applicable section — no complexity-specific section exists). Append confidence level (High/Medium/Low) to the `summary` field.

diff --git a/plugins/dev-team/agents/concurrency-review.md b/plugins/dev-team/agents/concurrency-review.md
@@ -89,6 +89,10 @@ Resource ordering:
 
 Code style, naming, domain modeling, security, complexity (handled by other agents)
 
+## Output discipline
+
+Derive `status` from the highest-severity finding, never from volume (`skill://dev-team-knowledge/review-output-discipline.md#deterministic-status`), and group same-kind findings — enumerate → classify → group — into ~3–5 concept-level findings per file, keeping `error` findings individual (`skill://dev-team-knowledge/review-output-discipline.md#finding-grouping`).
+
 ## Self-Challenge
 
 After producing findings, run the adversarial challenge pass from `skill://dev-team-knowledge/adversarial-review-protocol.md#concurrency-review` (the shared challenger loop + the concurrency-review challenge questions; ≤3 rounds). Append a confidence level (High/Medium/Low) to the `summary` field.
diff --git a/plugins/dev-team/agents/doc-review.md b/plugins/dev-team/agents/doc-review.md
@@ -64,11 +64,23 @@ Return `{"status": "skip", "issues": [], "summary": "No documentation files foun
 - `docs/agent-architecture.md` references a configuration or governance detail that is no longer current
 - Agent or skill files changed without corresponding update to `CLAUDE.md` registry tables
 
+### Comment hygiene
+
+- **Tracker-ID references in shipped comments** — issue/epic/ticket IDs in code comments (`JIRA-123`, `PROJ-789`, `#456`, `closes GH-12`). The comment should explain *intent*; the tracker ID belongs in the commit message, not the source. Flag with a `suggestedFix` that rewrites the comment as a purpose statement.
+- **Detached / orphaned doc comments** — a JSDoc/docstring/XML-doc block separated from its symbol by blank lines or other statements, or attached to the wrong symbol (so tooling associates it incorrectly).
+- **Do NOT flag durable external standards** — `RFC-2119`, `ISO-4217`, `RFC 5322`, CVE IDs, and similar stable references are legitimate; they are not tracker IDs.
+
+Comment-hygiene and tracker-ID findings are **capped at `suggestion`**: on their own they never raise status above `warn`.
+
 ## Ignore
 
 Code correctness, naming conventions, test quality (handled by other agents)
 Doc style preferences (sentence case vs title case, oxford comma) — flag only when docs are wrong, not when they differ in style
 
+## Output discipline
+
+Derive `status` from the highest-severity finding, never from volume (`skill://dev-team-knowledge/review-output-discipline.md#deterministic-status`), and group same-kind findings — enumerate → classify → group — into ~3–5 concept-level findings per file, keeping `error` findings individual (`skill://dev-team-knowledge/review-output-discipline.md#finding-grouping`).
+
 ## Self-Challenge
 
 After producing findings, run the adversarial challenge pass from `skill://dev-team-knowledge/adversarial-review-protocol.md#doc-review` (the shared challenger loop + the doc-review challenge questions; ≤3 rounds). Append a confidence level (High/Medium/Low) to the `summary` field.
diff --git a/plugins/dev-team/agents/domain-review.md b/plugins/dev-team/agents/domain-review.md
@@ -86,6 +86,10 @@ Whole-file load: each linked skill is loaded in full when invoked.
 
 - [Ubiquitous Language](skill://ubiquitous-language) — invoke when the user asks to "build the glossary", "extract domain terms", or "document the ubiquitous language". Also invoke when domain-review findings show pervasive terminology inconsistency (3+ different names for the same concept across the codebase).
 
+## Output discipline
+
+Derive `status` from the highest-severity finding, never from volume (`skill://dev-team-knowledge/review-output-discipline.md#deterministic-status`), and group same-kind findings — enumerate → classify → group — into ~3–5 concept-level findings per file, keeping `error` findings individual (`skill://dev-team-knowledge/review-output-discipline.md#finding-grouping`).
+
 ## Self-Challenge
 
 After producing findings, run the adversarial challenge pass from `skill://dev-team-knowledge/adversarial-review-protocol.md#domain-review` (domain-review challenge questions). Append confidence level (High/Medium/Low) to the `summary` field.
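
Reviewer note: the "Output discipline" paragraph added to each agent above says `status` is derived from the highest-severity finding, never from volume. A minimal sketch of that rule follows. The severity names, their ranking, and the severity-to-status mapping (`error` gives `fail`, anything lower gives `warn`, no findings gives `pass`) are assumptions; the authoritative definitions live in `review-output-discipline.md`, which is not part of this excerpt. `deriveStatus` is a hypothetical name.

```javascript
// Assumed severity ranking; the real list is defined in review-output-discipline.md.
const SEVERITY_RANK = { suggestion: 1, warning: 2, error: 3 };

// Derive a review status purely from the highest-severity finding.
// The number of findings never influences the result.
function deriveStatus(issues) {
  let highest = 0;
  for (const issue of issues) {
    const rank = SEVERITY_RANK[issue.severity];
    if (rank === undefined) {
      throw new Error(`unknown severity: ${issue.severity}`);
    }
    highest = Math.max(highest, rank);
  }
  if (highest === 0) return "pass"; // no findings at all
  return highest === SEVERITY_RANK.error ? "fail" : "warn"; // assumed mapping
}
```

Under these assumptions, fifty `suggestion` findings and a single `suggestion` finding yield the same status, which is the determinism the shared knowledge file is meant to guarantee.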