Merged
11 changes: 11 additions & 0 deletions .github/workflows/installers.yml
@@ -84,6 +84,17 @@ jobs:
done
exit $rc

  # 1d) Framework-compliance checks for the dev-team plugin: skill:// anchor
  # resolution, index.json file integrity, review-agent output-discipline
  # wiring, and the test-after stance (no removed TDD identifiers).
  compliance:
    name: dev-team framework compliance
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate framework compliance
        run: node scripts/ci-framework-compliance.mjs

  # 2) Real end-to-end install on every OS (bootstraps bun/omp/plugins, then
  # verifies OMP launches and lists all plugins).
  e2e:
2 changes: 1 addition & 1 deletion README.fr.md
@@ -8,7 +8,7 @@ met en place OMP et vous guide à travers chacun d'eux.

| Plugin | Rôle |
|---|---|
| **[`dev-team`](plugins/dev-team/)** | **Équipe de dev agentique** — un orchestrateur + 32 agents spécialistes/critiques, le workflow `/specs` → `/plan` → `/build` → `/pr`, **TDD strict** et points de contrôle humains, ~78 skills, et des extensions « garde-fou » bloquantes. Portage de [bdfinst/agentic-dev-team](https://github.com/bdfinst/agentic-dev-team) (Bryan Finster). Tiers 100 % cloud ; gardez le tier « small » à haut volume bon marché. |
| **[`dev-team`](plugins/dev-team/)** | **Équipe de dev agentique** — un orchestrateur + 32 agents spécialistes/critiques, le workflow `/specs` → `/plan` → `/build` → `/pr`, un **plan gate strict** (test-after, tests requis) et points de contrôle humains, ~78 skills, et des extensions « garde-fou » bloquantes. Portage de [bdfinst/agentic-dev-team](https://github.com/bdfinst/agentic-dev-team) (Bryan Finster). Tiers 100 % cloud ; gardez le tier « small » à haut volume bon marché. |
| **[`copilot-preset`](plugins/copilot-preset/)** | **Préréglage modèles GitHub Copilot** — route OMP (et les tiers de dev-team) via `github-copilot` pour tourner sur une licence Copilot. Config seulement : mapping tier→modèle, comparatif tarifaire (crédits IA post-juin 2026), et MAI-Code-1-Flash câblé. |
| **[`token-diet`](plugins/token-diet/)** | **Réduction agressive des tokens** — ctx-wire (compression transparente de la sortie des commandes + scrub des secrets), CodeGraph (requêtes de graphe de symboles via MCP au lieu de grep+read), un skill « caveman » de sortie laconique, et un skill « yagni » de code minimal — par-dessus la compaction/`astGrep` natives d'OMP. |
| **[`azure-devops-fs`](plugins/azure-devops-fs/)** | **Azure DevOps comme un système de fichiers** — lecture repos/fichiers/PR/diffs via URIs `ado://` (paginé), **gates/policies** de PR + CI (builds/logs/run), création/checkout/push/complete de PR, commentaires/votes. Propulsé par l'**Azure CLI** (`az` + extension azure-devops), auth PAT, cache SQLite ; fonctionne derrière les proxys TLS d'entreprise. |
2 changes: 1 addition & 1 deletion README.md
@@ -8,7 +8,7 @@ you through them.

| Plugin | What it does |
|---|---|
| **[`dev-team`](plugins/dev-team/)** | **Agentic dev team** — orchestrator + 32 specialist/critic agents, the `/specs` → `/plan` → `/build` → `/pr` workflow, **strict TDD** and human gates, ~78 skills, and blocking guard extensions. Port of [bdfinst/agentic-dev-team](https://github.com/bdfinst/agentic-dev-team) (Bryan Finster). All-cloud tiers; keep the high-volume small tier cheap. |
| **[`dev-team`](plugins/dev-team/)** | **Agentic dev team** — orchestrator + 32 specialist/critic agents, the `/specs` → `/plan` → `/build` → `/pr` workflow, a **forced plan gate** (test-after, tests required) and human gates, ~78 skills, and blocking guard extensions. Port of [bdfinst/agentic-dev-team](https://github.com/bdfinst/agentic-dev-team) (Bryan Finster). All-cloud tiers; keep the high-volume small tier cheap. |
| **[`copilot-preset`](plugins/copilot-preset/)** | **GitHub Copilot model preset** — route OMP (and the dev-team tiers) through `github-copilot` to run on a Copilot license. Config-only: tier→model mapping, post-June-2026 AI-credit pricing comparison, and MAI-Code-1-Flash wired in. |
| **[`token-diet`](plugins/token-diet/)** | **Aggressive token reduction** — ctx-wire (transparent command-output compression + secret scrub), CodeGraph (MCP symbol/call-graph queries instead of grep+read), a caveman terse-output skill, and a yagni minimal-code skill — layered on OMP's native compaction/`astGrep`. |
| **[`azure-devops-fs`](plugins/azure-devops-fs/)** | **Azure DevOps as a filesystem** — read repos/files/PRs/diffs via `ado://` URIs (paginated), PR **gates/policies** + CI (builds/logs/run), create/checkout/push/complete PRs, comment/vote. Backed by the **Azure CLI** (`az` + the azure-devops extension), PAT auth, SQLite read cache; works behind corporate TLS proxies. |
4 changes: 2 additions & 2 deletions REVIEW.md
@@ -24,7 +24,7 @@ cohérence docs/registres — pas des cassures de build.**

### 1. Les « guards » de sécurité sont du théâtre de sécurité (le plus important)

Les 6 guards de `dev-team` (destructive, path, freeze, tdd, review-gate,
Les 6 guards de `dev-team` (destructive, path, freeze, spec, review-gate,
careful) sont des hooks *PreToolUse consultatifs* basés sur du matching de
sous-chaîne / glob. Ils donnent une **fausse confiance** : ils sont
contournables trivialement, par accident comme volontairement.
@@ -34,7 +34,7 @@ contournables trivialement, par accident comme volontairement.
| destructive-guard | `rm -rf`, drop/truncate, force-push, kill… | `find -delete`, `git clean -fdx`, `> f`, `truncate -s0`, `bash -c …`, obfuscation par variable ; **warn-only hors `/careful on`** ; la SAFE-list court-circuite tout (`rm -rf node_modules/../../etc`) | Très faible |
| path-guard | édition de `.env`/`*.pem`/`*.key`/`id_rsa`… | aucune couverture `bash` (`tee`/`>`/`sed -i`) ; regex **sensible à la casse** → `ID_RSA`/`.PEM` passent ; lectures non gardées ; pass silencieux si le shape d'edit ne matche pas | Faible |
| freeze-guard | écriture sur globs gelés | **aucune branche `bash`** ; écraser `.omp/state/freeze.json` suffit | Faible |
| tdd-guard (.feature) | modif de specs BDD | `BASH_WRITE_RE` étroit (rate `python -c`/`ed`/heredoc) ; opt-out `OMP_ALLOW_FEATURE_EDITS=1` | Faible–moyenne |
| spec-guard (.feature) | modif de specs BDD | `BASH_WRITE_RE` étroit (rate `python -c`/`ed`/heredoc) ; opt-out `OMP_ALLOW_FEATURE_EDITS=1` | Faible–moyenne |
| review-gate | `git commit` avant approve | `--no-verify` **explicitement autorisé** ; `bash -c 'git commit'` ; le `--no-verify` est détecté par `includes` donc un message de commit le déclenche | Faible |
| careful-mode | active le blocage | fichier d'état modifiable par l'agent ; **OFF par défaut** | Faible |

105 changes: 105 additions & 0 deletions docs/upstream-v7.7-7.9-extraction.md
@@ -0,0 +1,105 @@
# Extraction from upstream agentic-dev-team (v7.7–v7.9)

Survey of `bdfinst/agentic-dev-team` releases since our v7.6 extraction, and what
we pulled into our OMP port — filtered through omp-dev-team's choices:
**test-after with refactoring (no TDD), quality first, cost efficiency.**

## Upstream evolution since v7.6

| Release | Highlight |
|---|---|
| v7.7.0 | harness fixes from session-review/audit; **closed learning loop**; **when-tdd-pays** experiment fixtures; ambiguity-resolution protocol for `/specs` + `/ship` gate |
| v7.8.0 | **craftsmanship-axis review rules** — use-the-platform, comment hygiene; named shipped AC references |
| v7.9.0 | **deterministic status + finding grouping** for doc/naming review agents |

## Extracted (respecting omp choices)

1. **Deterministic status + finding grouping (v7.9).** New shared knowledge file
`skills/dev-team-knowledge/review-output-discipline.md`:
- **Deterministic status** — an agent's `status` is a pure function of the
highest-severity finding, never of volume.
- **Finding grouping** — Enumerate → Classify → Group; consolidate same-kind
findings into ~3–5 concept-level findings per file; keep `error` findings
individual.

Wired as a one-line anchored reference into **all 17** finding-emitting review
agents. Upstream changed only the doc and naming agents; we factored the rule
into one shared file instead of copy-pasting it, which is the DRY,
cost-efficient win and propagates determinism and token savings to every
review. Added to `index.json`.

2. **Comment hygiene (v7.8) → `doc-review`.** Tracker-ID references in shipped
comments (`JIRA-123`, `#456`), detached/orphaned doc comments; **capped at
`suggestion`** (never raise status above `warn`); durable external standards
(`RFC-2119`, `ISO-4217`, CVE) are explicitly not flagged.

3. **Use-the-platform (v7.8) → `refactor-opportunity-review`.** Reinvented
built-ins (`min`/`max`/`sum`/`clamp`/`copy`), reinvented helpers, open-coded
idioms repeated 3+ times — mapped by concept, honoring language **and version**
(e.g. Go <1.21 has no builtin `min`/`max`). Framing de-TDD'd to test-after.

4. **Closed learning loop (v7.7) → `feedback-learning` skill.** We already had
post-task reflection + recurring-correction detection (3+); the missing
"closed" half was a persistent queue + disposition. Added a
`metrics/pending-review.jsonl` queue (system proposals enqueued, never
dropped) and a **session-review** disposition flow (`review` keyword) that
previews, then approves (apply + log + stamp `approved`) or rejects (stamp
`rejected`). Project-local only — plugin-cache-safe. Asynchronous/batched by
design, which keeps it cheap.
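
As a rough illustration of item 1, the two output-discipline rules can be sketched in a few lines of JavaScript. This is a minimal sketch, not the contents of `review-output-discipline.md`: the `pass`/`fail` status names, the severity ranking, the finding shape (`file`, `kind`, `severity`, `line`) and the function names `deriveStatus` and `groupFindings` are all assumptions made for the example.

```javascript
// Assumed ranking; the source only names `suggestion`, `warn` and `error`.
const SEVERITY_RANK = { suggestion: 1, warn: 2, error: 3 };

// Deterministic status: depends only on the highest severity present,
// never on how many findings there are.
// The "pass" / "warn" / "fail" mapping is hypothetical.
function deriveStatus(findings) {
  let top = 0;
  for (const f of findings) {
    top = Math.max(top, SEVERITY_RANK[f.severity] ?? 0);
  }
  if (top >= SEVERITY_RANK.error) return "fail";
  if (top >= SEVERITY_RANK.suggestion) return "warn";
  return "pass";
}

// Finding grouping: consolidate same-kind findings per file into one
// concept-level finding, but keep every `error` finding individual.
function groupFindings(findings) {
  const grouped = [];
  const byKind = new Map();
  for (const f of findings) {
    if (f.severity === "error") {
      grouped.push({ ...f, count: 1 });
      continue;
    }
    const key = `${f.file}::${f.kind}`;
    const existing = byKind.get(key);
    if (existing) {
      existing.count += 1;
      existing.lines.push(f.line);
      // A group carries the highest severity of its members.
      if (SEVERITY_RANK[f.severity] > SEVERITY_RANK[existing.severity]) {
        existing.severity = f.severity;
      }
    } else {
      const group = {
        file: f.file,
        kind: f.kind,
        severity: f.severity,
        count: 1,
        lines: [f.line],
      };
      byKind.set(key, group);
      grouped.push(group);
    }
  }
  return grouped;
}
```

Under this scheme one `suggestion` and fifty `suggestion` findings produce the same status, and the fifty collapse into a single grouped entry, which is where the token savings would come from.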

## Test-after reinforcement (omp north star)

Beyond extraction, a pass to make **test-after with refactoring** explicit and to
remove residual TDD framing the earlier plan-gate-over-tdd move had left behind:

- **Refactor after green, every step** — promoted from "optional" to a deliberate
always-taken pass (the `refactor-opportunity-review` lens) in `skills/build`,
`prompts/implementer.md`, and the orchestrator's Phase 3. Changes are made only
when there's a real opportunity, but the pass is always taken — the *refactoring*
half of test-after-with-refactoring.
- **Residual TDD traces reframed** to test-after: `triage` skill + command (RED/GREEN
fix plan → regression-test + fix + refactor), `qa-engineer` (ATDD → acceptance
scenarios; unit tests follow, test-after), `plan` (TDD step/traceability →
build step / step-to-scenario), `progress-guardian` (flagged "tests not written
first" → flags missing tests, order-agnostic), `mutation-testing` /
`quality-gate-pipeline` / `init-dev-team` (RED-GREEN labels dropped, semantics
kept), `test-design-reviewer` ("First/written-first" rubric → "Timely/ships with
the implementation"), plus the root `README`/`README.fr` ("strict TDD" → forced
plan gate, test-after) and `REVIEW.md` (stale `tdd-guard` → `spec-guard`).

## Deliberately NOT extracted (with rationale)

- **when-tdd-pays experiment fixtures (v7.7).** Upstream is re-litigating where
test-first pays. omp-dev-team has already made the call (test-after + plan gate);
importing TDD experiment fixtures would reintroduce exactly what we removed.
- **`/ship` gate + ambiguity-resolution protocol (v7.7).** The ambiguity protocol
is reasonable, but it is wired to a `/ship` command we don't have; our `/specs`
already runs a consistency gate. Candidate for a focused follow-up if a gap shows.
- **Bibliographic TDD citations kept as-is.** e.g. `testability-patterns.md` cites
*Growing Object-Oriented Software, Guided by Tests* ("outside-in TDD") — that is
the book's actual subject; rewriting a citation would misrepresent the source.

## Verified

`ci-validate-json` 23/23 · all 10 dev-team extensions compile · unit suite green ·
both `review-output-discipline` anchors (`#deterministic-status`,
`#finding-grouping`) resolve from all 17 wired agents · no prescriptive TDD /
test-first / RED-GREEN traces remain outside historical `docs/` and the one book
citation.

## CI: framework-compliance checks

What earlier extraction PRs verified by hand is now enforced by CI —
`scripts/ci-framework-compliance.mjs` (pure Node, new `compliance` job):

- **Anchor resolution** — every `skill://dev-team-knowledge/<file>.md#<anchor>`
reference resolves (file exists; anchor matches a heading slug or a registered
`index.json` anchor). Catches exactly the rename/typo failure mode this work
risked across 17 agents.
- **index.json integrity** — every keyed file exists.
- **Review-agent wiring** — every finding-emitting agent references
`review-output-discipline.md` (allowlist: `progress-guardian`).
- **Test-after stance** — the deliberately-removed TDD identifiers (`tdd-first`,
`tdd-guard`, `test-driven-development`, `RED-GREEN`) can't creep back in,
outside a small rationale/historical allowlist.

Currently: 26 anchor refs + 39 index files checked — 0 violations.
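
The anchor-resolution check described above can be sketched as follows. This is an illustration only, not the code of `scripts/ci-framework-compliance.mjs`: it works on in-memory strings rather than the filesystem so that it stays self-contained, `slugify` is an approximation of GitHub-style heading slugs, and the `registeredAnchors` shape and the function names are assumptions, since the structure of `index.json` is not shown here.

```javascript
// Approximate GitHub-style heading slug: lowercase, drop punctuation,
// turn each whitespace character into a hyphen.
function slugify(heading) {
  return heading
    .trim()
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s_-]/gu, "")
    .replace(/\s/g, "-");
}

// Matches skill://dev-team-knowledge/<file>.md#<anchor>
const REF_RE = /skill:\/\/dev-team-knowledge\/([\w.-]+\.md)#([\w-]+)/g;

// knowledgeFiles: { "<file>.md": "<markdown body>" }   (hypothetical shape)
// registeredAnchors: { "<file>.md": ["anchor", ...] }  (hypothetical shape)
function findUnresolvedRefs(agentText, knowledgeFiles, registeredAnchors = {}) {
  const violations = [];
  for (const [, file, anchor] of agentText.matchAll(REF_RE)) {
    const body = knowledgeFiles[file];
    if (body === undefined) {
      violations.push({ file, anchor, reason: "missing file" });
      continue;
    }
    // Collect the slug of every markdown heading in the target file.
    const slugs = new Set(
      body
        .split("\n")
        .filter((line) => /^#{1,6}\s/.test(line))
        .map((line) => slugify(line.replace(/^#{1,6}\s+/, "")))
    );
    const registered = registeredAnchors[file] ?? [];
    if (!slugs.has(anchor) && !registered.includes(anchor)) {
      violations.push({ file, anchor, reason: "missing anchor" });
    }
  }
  return violations;
}
```

A renamed heading or a typo in an agent file shows up as a `missing anchor` violation, and a moved or deleted knowledge file as `missing file`.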
2 changes: 1 addition & 1 deletion plugins/dev-team/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
{
"name": "dev-team",
"version": "1.2.0",
"version": "1.3.0",
"description": "Agentic dev team for Oh-My-Pi (ported from bdfinst/agentic-dev-team): orchestrator + 32 specialist/critic agents, the /specs -> /plan -> /build -> /pr workflow, a forced plan gate (scope -> plan -> build -> review) with tests required, human gates, and blocking guard extensions.",
"author": { "name": "outofrange-consulting" },
"license": "MIT",
4 changes: 4 additions & 0 deletions plugins/dev-team/agents/a11y-review.md
@@ -86,6 +86,10 @@ Focus management:

Code style, naming, test coverage, performance (handled by other agents)

## Output discipline

Derive `status` from the highest-severity finding, never from volume (`skill://dev-team-knowledge/review-output-discipline.md#deterministic-status`), and group same-kind findings — enumerate → classify → group — into ~3–5 concept-level findings per file, keeping `error` findings individual (`skill://dev-team-knowledge/review-output-discipline.md#finding-grouping`).

## Self-Challenge

After producing findings, run the adversarial challenge pass from `skill://dev-team-knowledge/adversarial-review-protocol.md#a11y-review` (the shared challenger loop + the a11y-review challenge questions; ≤3 rounds). Append a confidence level (High/Medium/Low) to the `summary` field.
4 changes: 4 additions & 0 deletions plugins/dev-team/agents/arch-review.md
@@ -95,6 +95,10 @@ Grep for patterns that architecture documentation explicitly bans:
- Direct `fetch`/`axios`/`HttpClient` calls outside designated HTTP adapter layer
- Direct DB client calls outside designated repository layer

## Output discipline

Derive `status` from the highest-severity finding, never from volume (`skill://dev-team-knowledge/review-output-discipline.md#deterministic-status`), and group same-kind findings — enumerate → classify → group — into ~3–5 concept-level findings per file, keeping `error` findings individual (`skill://dev-team-knowledge/review-output-discipline.md#finding-grouping`).

## Self-Challenge

After producing findings, run the adversarial challenge pass from `skill://dev-team-knowledge/adversarial-review-protocol.md#arch-review` (arch-review challenge questions). Append confidence level (High/Medium/Low) to the `summary` field.
4 changes: 4 additions & 0 deletions plugins/dev-team/agents/complexity-review.md
@@ -67,6 +67,10 @@ Cognitive load:
- Too many concepts per function
- Non-obvious control flow

## Output discipline

Derive `status` from the highest-severity finding, never from volume (`skill://dev-team-knowledge/review-output-discipline.md#deterministic-status`), and group same-kind findings — enumerate → classify → group — into ~3–5 concept-level findings per file, keeping `error` findings individual (`skill://dev-team-knowledge/review-output-discipline.md#finding-grouping`).

## Self-Challenge

After producing findings, run the adversarial challenge pass from `skill://dev-team-knowledge/adversarial-review-protocol.md#structure-review`. Use the structure-review challenge questions (the nearest applicable section — no complexity-specific section exists). Append confidence level (High/Medium/Low) to the `summary` field.
4 changes: 4 additions & 0 deletions plugins/dev-team/agents/concurrency-review.md
@@ -89,6 +89,10 @@ Resource ordering:

Code style, naming, domain modeling, security, complexity (handled by other agents)

## Output discipline

Derive `status` from the highest-severity finding, never from volume (`skill://dev-team-knowledge/review-output-discipline.md#deterministic-status`), and group same-kind findings — enumerate → classify → group — into ~3–5 concept-level findings per file, keeping `error` findings individual (`skill://dev-team-knowledge/review-output-discipline.md#finding-grouping`).

## Self-Challenge

After producing findings, run the adversarial challenge pass from `skill://dev-team-knowledge/adversarial-review-protocol.md#concurrency-review` (the shared challenger loop + the concurrency-review challenge questions; ≤3 rounds). Append a confidence level (High/Medium/Low) to the `summary` field.
12 changes: 12 additions & 0 deletions plugins/dev-team/agents/doc-review.md
@@ -64,11 +64,23 @@ Return `{"status": "skip", "issues": [], "summary": "No documentation files foun
- `docs/agent-architecture.md` references a configuration or governance detail that is no longer current
- Agent or skill files changed without corresponding update to `CLAUDE.md` registry tables

### Comment hygiene

- **Tracker-ID references in shipped comments** — issue/epic/ticket IDs in code comments (`JIRA-123`, `PROJ-789`, `#456`, `closes GH-12`). The comment should explain *intent*; the tracker ID belongs in the commit message, not the source. Flag with a `suggestedFix` that rewrites the comment as a purpose statement.
- **Detached / orphaned doc comments** — a JSDoc/docstring/XML-doc block separated from its symbol by blank lines or other statements, or attached to the wrong symbol (so tooling associates it incorrectly).
- **Do NOT flag durable external standards** — `RFC-2119`, `ISO-4217`, `RFC 5322`, CVE IDs, and similar stable references are legitimate; they are not tracker IDs.

Comment-hygiene and tracker-ID findings are **capped at `suggestion`**: on their own they never raise status above `warn`.
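
To make the tracker-ID versus durable-standard distinction concrete, here is a minimal sketch in JavaScript. The two regular expressions and the name `findTrackerIds` are assumptions for illustration; they cover only the example shapes listed above and are not the agent's actual matching logic.

```javascript
// Durable external standards that must NOT be flagged (illustrative subset).
const DURABLE_STANDARD_RE = /\b(?:RFC[- ]?\d+|ISO[- ]?\d+|CVE-\d{4}-\d+)\b/g;

// Tracker-style IDs: KEY-123 (e.g. JIRA-123, GH-12) or a bare #456.
const TRACKER_ID_RE = /\b[A-Z][A-Z0-9]+-\d+\b|(?<![\w&])#\d+\b/g;

// Returns the tracker IDs found in a comment, ignoring durable standards.
function findTrackerIds(comment) {
  // Blank out standards first so `CVE-2021-44228` is not misread as KEY-123.
  const scrubbed = comment.replace(DURABLE_STANDARD_RE, (m) => " ".repeat(m.length));
  return [...scrubbed.matchAll(TRACKER_ID_RE)].map((m) => m[0]);
}
```

Each returned ID would become one `suggestion`-level finding at most, in line with the cap stated above.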

## Ignore

Code correctness, naming conventions, test quality (handled by other agents)
Doc style preferences (sentence case vs title case, oxford comma) — flag only when docs are wrong, not when they differ in style

## Output discipline

Derive `status` from the highest-severity finding, never from volume (`skill://dev-team-knowledge/review-output-discipline.md#deterministic-status`), and group same-kind findings — enumerate → classify → group — into ~3–5 concept-level findings per file, keeping `error` findings individual (`skill://dev-team-knowledge/review-output-discipline.md#finding-grouping`).

## Self-Challenge

After producing findings, run the adversarial challenge pass from `skill://dev-team-knowledge/adversarial-review-protocol.md#doc-review` (the shared challenger loop + the doc-review challenge questions; ≤3 rounds). Append a confidence level (High/Medium/Low) to the `summary` field.
4 changes: 4 additions & 0 deletions plugins/dev-team/agents/domain-review.md
@@ -86,6 +86,10 @@ Whole-file load: each linked skill is loaded in full when invoked.

- [Ubiquitous Language](skill://ubiquitous-language) — invoke when the user asks to "build the glossary", "extract domain terms", or "document the ubiquitous language". Also invoke when domain-review findings show pervasive terminology inconsistency (3+ different names for the same concept across the codebase).

## Output discipline

Derive `status` from the highest-severity finding, never from volume (`skill://dev-team-knowledge/review-output-discipline.md#deterministic-status`), and group same-kind findings — enumerate → classify → group — into ~3–5 concept-level findings per file, keeping `error` findings individual (`skill://dev-team-knowledge/review-output-discipline.md#finding-grouping`).

## Self-Challenge

After producing findings, run the adversarial challenge pass from `skill://dev-team-knowledge/adversarial-review-protocol.md#domain-review` (domain-review challenge questions). Append confidence level (High/Medium/Low) to the `summary` field.