[Pelis Agent Factory Advisor] Agentic Workflow Maturity Assessment — April 2026 #1789

2026-04-08T10:58:11Z

github-actions[bot]
bot Apr 8, 2026

📊 Executive Summary

gh-aw-firewall is a maturity-level 5 agentic repository — one of the most automated projects in the Pelis Agent Factory ecosystem, with 27 agentic workflows and 18 standard CI/CD workflows. The core patterns (triple-engine red team, token analyzer→optimizer chaining, cross-repo dispatch, smoke suites) are all well-implemented. The top opportunities are closing the Codex token-visibility gap, adding a container image security scanner, and keeping ci-doctor's monitored-workflow list in sync as the workflow fleet grows.

🎓 Patterns Learned vs. Current Repo

Note: Pelis Agent Factory documentation was unavailable in cache this run; analysis draws from prior run cache (2026-04-07) and live repository inspection.

| Pattern | Status in this repo |
| --- | --- |
| Triple-engine smoke tests | ✅ Claude / Copilot / Codex / Chroot / Services |
| Token analyzer → optimizer chaining | ✅ Claude + Copilot; ⚠️ Codex missing |
| `workflow_run` event chaining | ✅ Used by token-optimizers |
| `skip-if-match` deduplication guard | ✅ doc-maintainer, test-coverage-improver, issue-monster |
| Shared imports (`mcp-pagination`, `reporting`, `secret-audit`) | ✅ Well adopted |
| `cache-memory` for persistent state | ✅ issue-duplication-detector, smoke-services |
| Cross-repo dispatch with PAT | ✅ firewall-issue-dispatcher → github/gh-aw |
| Slash-command workflows | ✅ `/plan` command |
| CI Doctor for automated failure investigation | ✅ Present and monitoring 26 workflows |
| Issue Monster (Copilot SWE auto-assignment) | ✅ Active |
| Container image security scanning | ❌ Not yet present |
| PR general quality review agent | ❌ Only security coverage |
| Stale issue / housekeeping agent | ❌ Not present |
| Performance regression → agentic alert | ❌ performance-monitor.yml is non-agentic |

📋 Workflow Inventory

Agentic Workflows (27)

| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| `smoke-claude.md` | Validate Claude engine on PRs + schedule | PR, every 12h, reaction:heart | ✅ Solid |
| `smoke-codex.md` | Validate Codex engine | PR, every 12h, reaction:hooray | ✅ Solid |
| `smoke-copilot.md` | Validate Copilot engine | PR, every 12h, reaction:eyes | ✅ Solid |
| `smoke-chroot.md` | Validate chroot feature | PR (path-filtered), reaction:rocket | ✅ Well-scoped paths |
| `smoke-services.md` | Validate `--allow-host-service-ports` | PR, every 12h, reaction:rocket | ✅ Good service coverage |
| `secret-digger-claude.md` | Red team (Claude engine) | Hourly cron (:05) | ✅ Runs every hour |
| `secret-digger-codex.md` | Red team (Codex engine) | Hourly cron (:10) | ✅ Runs every hour |
| `secret-digger-copilot.md` | Red team (Copilot engine) | Hourly cron (:00) | ✅ Runs every hour |
| `security-guard.md` | PR security review | PR open/sync | ✅ Strong |
| `security-review.md` | Daily threat modeling | Daily schedule | ✅ Ingests red team results |
| `dependency-security-monitor.md` | CVE triage + dep update PRs | Daily | ✅ Creates issues + PRs |
| `build-test.md` | Polyglot build validation | PR, workflow_dispatch | ⚠️ Copilot-only engine |
| `claude-token-usage-analyzer.md` | Claude cost visibility | Daily | ✅ |
| `copilot-token-usage-analyzer.md` | Copilot cost visibility | Daily | ✅ |
| `claude-token-optimizer.md` | Claude cost recommendations | On analyzer completion | ✅ |
| `copilot-token-optimizer.md` | Copilot cost recommendations | On analyzer completion | ✅ |
| `doc-maintainer.md` | Doc sync with code changes | Daily, skip-if-match | ✅ Good |
| `test-coverage-improver.md` | Security-critical test coverage PRs | Weekly, skip-if-match | ✅ |
| `ci-doctor.md` | CI failure investigation | workflow_run (failures) | ⚠️ Missing some new workflows |
| `ci-cd-gaps-assessment.md` | CI/CD gap analysis | Daily | ✅ |
| `firewall-issue-dispatcher.md` | Cross-repo issue triage (gh-aw→gh-aw-firewall) | Every 6h | ✅ |
| `issue-duplication-detector.md` | Duplicate issue detection | issues:opened | ✅ Uses cache-memory |
| `issue-monster.md` | Auto-assign issues to Copilot SWE | issues:opened, every 1h | ✅ |
| `plan.md` | `/plan` slash command | slash_command | ✅ |
| `cli-flag-consistency-checker.md` | CLI docs/code sync check | Weekly | ✅ |
| `update-release-notes.md` | Enhance release notes | release:published | ✅ |
| `pelis-agent-factory-advisor.md` | This workflow | Schedule | ✅ |

Standard CI/CD Workflows (18)

build.yml, lint.yml, test-coverage.yml, test-integration.yml, test-chroot.yml, test-examples.yml, test-action.yml, codeql.yml, dependency-audit.yml, performance-monitor.yml, release.yml, deploy-docs.yml, docs-preview.yml, link-check.yml, pr-title.yml, copilot-setup-steps.yml, test-integration-suite.yml, agentics-maintenance.yml

🚀 Recommendations

P0 — High Impact, Low Effort (Quick Wins)

1. Codex Token Usage Analyzer + Optimizer

What: Add codex-token-usage-analyzer.md and codex-token-optimizer.md mirroring the existing Claude/Copilot pair.

Why: The Codex secret-digger runs hourly, the same cadence as the Claude and Copilot secret-diggers, yet Codex token costs are entirely invisible: no reporting, no optimization loop. The Claude and Copilot patterns are proven, so this is a direct copy-and-adapt.

How:

# codex-token-usage-analyzer.md
description: Daily Codex token usage analysis across agentic workflow runs
on:
  schedule: daily
  workflow_dispatch:
imports:
  - shared/mcp-pagination.md
  - shared/reporting.md
network:
  allowed:
    - github
    - "*.blob.core.windows.net"
tools:
  github:
    toolsets: [default, actions]
  bash: true
safe-outputs:
  create-issue:
    title-prefix: "📊 Codex Token Usage Report"
    labels: [codex-token-usage-report]
    close-older-issues: true

Then add codex-token-optimizer.md, triggered on workflow_run: ["Daily Codex Token Usage Analyzer"].

Effort: Low. Copy the Claude pattern, then change the engine filter and labels.

2. Update `ci-doctor.md` Monitored Workflow List

What: Add missing workflows to the workflow_run.workflows list in ci-doctor.md:

"Smoke Services", "CLI Flag Consistency Checker", "Firewall Issue Dispatcher", "CI/CD Pipelines and Integration Tests Gap Assessment", "Update Release Notes", "Secret Digger (Claude)", "Secret Digger (Codex)", "Secret Digger (Copilot)"

Why: When these workflows fail silently, no investigation issue is created. The secret-diggers are especially important — a silent failure means a gap in hourly security coverage.

Effort: Low — edit one YAML list.
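
The check behind this recommendation can be sketched in a few lines. This is a minimal illustration, not the repository's tooling: it assumes the workflow display names have already been collected (for example from each workflow's name field), and the function name audit_monitored_list is hypothetical.

```python
from collections import Counter


def audit_monitored_list(all_workflows, monitored):
    """Compare the fleet's workflow names with ci-doctor's monitored list.

    Returns (missing, duplicates, unknown):
      missing    - workflows that exist but are not monitored
      duplicates - names listed more than once in the monitored list
      unknown    - monitored names that match no existing workflow
    """
    fleet = set(all_workflows)
    counts = Counter(monitored)
    missing = sorted(fleet - set(counts))
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    unknown = sorted(set(counts) - fleet)
    return missing, duplicates, unknown


if __name__ == "__main__":
    # Illustrative data only; "Old Workflow" is a made-up name.
    fleet = ["Smoke Services", "Update Release Notes", "Secret Digger (Codex)"]
    monitored = ["Smoke Services", "Smoke Services", "Old Workflow"]
    missing, duplicates, unknown = audit_monitored_list(fleet, monitored)
    print("missing:", missing)
    print("duplicates:", duplicates)
    print("unknown:", unknown)
```

Reporting duplicates and unknown names alongside missing ones keeps a hand-edited list honest in both directions.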

P1 — High Impact, Medium Effort

3. Container Image Security Scanner Agent

What: Add container-image-scanner.md — an agentic workflow that runs Trivy or Grype against the Docker images in containers/ (squid, agent, api-proxy) and creates issues for HIGH/CRITICAL findings.

Why: The repo ships security-critical Docker containers. codeql.yml scans source code and dependency-audit.yml scans npm dependencies, but nothing scans the container images themselves. A vulnerability in the ubuntu/squid or ubuntu:22.04 base images would sit directly on the security boundary.

How:

on:
  schedule: daily
  push:
    paths: ["containers/**"]
  workflow_dispatch:
engine:
  id: claude
safe-outputs:
  create-issue:
    title-prefix: "[Container CVE] "
    labels: [security, container]
    max: 5
    expires: 30d

The agent builds the images with --build-local, runs trivy image awf-agent, parses the JSON output, and creates issues for HIGH and CRITICAL findings.

Effort: Medium — requires --build-local in CI runner, Trivy install step.
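
The parsing step could look roughly like the sketch below. It assumes the report was produced with Trivy's JSON output (for example trivy image --format json awf-agent) and relies only on the Results, Target, Vulnerabilities, VulnerabilityID, PkgName, and Severity fields. The function name and the sample identifiers are hypothetical.

```python
import json

ALERT_SEVERITIES = {"HIGH", "CRITICAL"}


def high_severity_findings(report_json):
    """Return HIGH/CRITICAL vulnerabilities from a Trivy JSON report string."""
    report = json.loads(report_json)
    findings = []
    for result in report.get("Results") or []:
        # "Vulnerabilities" is absent (or null) when a target has no findings.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in ALERT_SEVERITIES:
                findings.append({
                    "target": result.get("Target", ""),
                    "id": vuln.get("VulnerabilityID", ""),
                    "package": vuln.get("PkgName", ""),
                    "severity": vuln["Severity"],
                })
    # CRITICAL first, then by ID so the output order is stable.
    findings.sort(key=lambda f: (f["severity"] != "CRITICAL", f["id"]))
    return findings


if __name__ == "__main__":
    # Made-up report; the IDs and package names are placeholders.
    sample = json.dumps({
        "Results": [{
            "Target": "awf-agent",
            "Vulnerabilities": [
                {"VulnerabilityID": "EXAMPLE-0002", "PkgName": "libfoo", "Severity": "HIGH"},
                {"VulnerabilityID": "EXAMPLE-0001", "PkgName": "libbar", "Severity": "LOW"},
            ],
        }]
    })
    for finding in high_severity_findings(sample):
        print(finding)
```

Each returned entry carries enough detail to fill one issue, and the create-issue max: 5 cap in the front matter above would bound how many are filed per run.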

4. PR Code Quality Review Agent

What: Add code-review.md — a general-purpose PR reviewer complementing security-guard.md with quality/correctness focus (logic bugs, TypeScript type safety, test coverage gaps in changed files, performance anti-patterns).

Why: security-guard.md explicitly focuses on security posture. There is no agent reviewing PRs for general code quality, correctness, or maintainability. Given the complexity of src/docker-manager.ts (1000+ lines) and src/cli.ts, automated quality review would catch regressions.

How:

on:
  pull_request:
    types: [opened, synchronize]
    paths: ["src/**", "containers/**", "tests/**"]
engine:
  id: claude
  max-turns: 10
safe-outputs:
  add-comment:
    max: 1
    hide-older-comments: true

Effort: Medium — prompt engineering needed to avoid noise.

5. Performance Regression Detector Agent

What: Add an agentic layer on top of performance-monitor.yml that reads benchmark results and creates GitHub issues when regressions exceed a threshold.

Why: performance-monitor.yml runs weekly benchmarks and posts results but is a pure shell script — no intelligence, no issue creation, no trend analysis. An agent reading the JSON output could detect regressions, compare to baselines stored in cache-memory, and create actionable issues.

How: Trigger on workflow_run: ["Performance Monitor"]. The agent reads the artifact JSON, compares it to the baseline stored in cache-memory, and creates an issue if startup time or memory regresses by more than 10%.

Effort: Medium — requires understanding benchmark output format.
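
The comparison itself is small. The sketch below assumes the benchmark artifact and the cached baseline can each be reduced to a mapping of metric name to number, where lower is better. The metric names (startup_ms, memory_mb) and the function name are hypothetical, since the actual benchmark output format is not shown here.

```python
def find_regressions(baseline, current, threshold=0.10):
    """Return metrics whose value grew by more than `threshold` vs. baseline.

    Both arguments map metric name -> numeric value where lower is better
    (e.g. startup time, memory). Metrics missing from `current`, or with a
    non-positive baseline, are skipped.
    """
    regressions = {}
    for name, base in baseline.items():
        if name not in current or base <= 0:
            continue
        change = (current[name] - base) / base
        if change > threshold:
            regressions[name] = round(change, 4)
    return regressions


if __name__ == "__main__":
    # Illustrative numbers only.
    baseline = {"startup_ms": 1000, "memory_mb": 200}
    current = {"startup_ms": 1150, "memory_mb": 210}
    print(find_regressions(baseline, current))
```

A relative threshold avoids per-metric tuning, though noisy benchmarks may need a baseline averaged over several runs before comparing.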

P2 — Medium Impact

6. Stale Issue Housekeeping Agent

What: Add stale-issue-manager.md that pings stale issues/PRs, applies labels, and closes truly abandoned items after extended periods.

Why: As issue-monster assigns Copilot SWE agents to issues, completed or abandoned issues may accumulate. Automated housekeeping keeps the issue tracker actionable.

Effort: Medium — standard pattern, requires issues: write permission via safe-outputs.
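
One way to express the ping-then-close policy is a small classifier over each item's last-updated time. The 60-day and 90-day thresholds below are assumptions for illustration, because the recommendation only says "extended periods". The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def classify_issue(updated_at, now, stale_after_days=60, close_after_days=90):
    """Classify an issue by inactivity: 'active', 'stale', or 'close'."""
    idle = now - updated_at
    if idle >= timedelta(days=close_after_days):
        return "close"   # abandoned: close with an explanatory comment
    if idle >= timedelta(days=stale_after_days):
        return "stale"   # ping participants and apply a stale label
    return "active"


if __name__ == "__main__":
    now = datetime(2026, 4, 8, tzinfo=timezone.utc)
    for days in (10, 60, 90):
        print(days, classify_issue(now - timedelta(days=days), now))
```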

7. Migration Guide Generator on Release

What: Extend update-release-notes.md or add release-migration-guide.md that detects breaking CLI flag changes and auto-generates a migration guide as a discussion or release attachment.

Why: AWF is a security tool with a CLI API. Breaking changes to flags (e.g., --allow-host-service-ports behavior changes) need clear migration docs. The cli-flag-consistency-checker.md already tracks flag changes weekly.

Effort: Medium — chain with cli-flag-consistency-checker output or diff src/cli.ts between release tags.
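
The "diff src/cli.ts between release tags" option can be approximated by extracting long flags from both versions of the file and comparing the sets. This is a rough sketch: the regular expression and function name are assumptions, and it only detects added or removed flags, not changes in a flag's behavior.

```python
import re

# Matches long options such as --allow-host-service-ports
FLAG_PATTERN = re.compile(r"--[a-z][a-z0-9]*(?:-[a-z0-9]+)*")


def diff_cli_flags(old_source, new_source):
    """Return (removed, added) long flags between two versions of CLI source."""
    old_flags = set(FLAG_PATTERN.findall(old_source))
    new_flags = set(FLAG_PATTERN.findall(new_source))
    return sorted(old_flags - new_flags), sorted(new_flags - old_flags)


if __name__ == "__main__":
    # Made-up source snippets; --example-old-flag is a placeholder name.
    old = ".option('--build-local') .option('--example-old-flag <value>')"
    new = ".option('--build-local') .option('--allow-host-service-ports <ports>')"
    removed, added = diff_cli_flags(old, new)
    print("removed (possible breaking change):", removed)
    print("added:", added)
```

Removed flags are the candidates for a migration note; the agent would still need to read the surrounding diff to describe replacements.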

8. SBOM / Supply Chain Attestation Agent

What: Add supply-chain-attestation.md that generates and publishes SBOMs for each release using syft or npm sbom, verifies SLSA provenance, and monitors for new transitive dependency additions.

Why: As a security tool published to GHCR, gh-aw-firewall should practice what it preaches. An SBOM provides transparency for downstream users and enables supply chain auditing.

Effort: Medium. SBOM generation is straightforward; verification and publishing require release permissions.

P3 — Nice to Have

9. Integration Test Gap Agent (extend `ci-cd-gaps-assessment.md`)

What: Enhance the existing ci-cd-gaps-assessment.md to not just report gaps but also create issues with proposed test cases and assign them via issue-monster.

Why: Currently it reports gaps as a discussion. Closing the loop to issue creation + Copilot assignment would make it self-healing.

Effort: Low (extend existing workflow).

10. Domain Whitelist Audit Agent

What: Add domain-whitelist-auditor.md that scans all workflow .md files for network.allowed domains, checks them against a known-safe list, and flags suspicious or overly broad allowances.

Why: As the fleet of agentic workflows grows, network permissions can drift. An agent auditing domain allowances weekly would catch cases where workflows silently gain access to unexpected endpoints.

Effort: Low — grep-based analysis, no external calls needed.
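
A minimal sketch of the audit step follows. It assumes the network.allowed entries have already been extracted from each workflow's front matter into a mapping. The known-safe list and the function name are hypothetical.

```python
def audit_allowed_domains(workflows, known_safe):
    """Flag network.allowed entries that are wildcards or not known-safe.

    `workflows` maps a workflow file name to its list of allowed entries.
    Returns a sorted list of (workflow, entry, reason) tuples.
    """
    findings = []
    for workflow, entries in workflows.items():
        for entry in entries:
            if entry in known_safe:
                continue
            reason = "wildcard" if "*" in entry else "not in known-safe list"
            findings.append((workflow, entry, reason))
    return sorted(findings)


if __name__ == "__main__":
    # Entries mirror the analyzer front matter shown earlier in this report.
    workflows = {
        "codex-token-usage-analyzer.md": ["github", "*.blob.core.windows.net"],
    }
    for finding in audit_allowed_domains(workflows, known_safe={"github"}):
        print(finding)
```

An entry that appears in the known-safe list is accepted even if it contains a wildcard, so intentional broad allowances can be approved once and stop being reported.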

📈 Maturity Assessment

| Dimension | Current | Target | Gap |
| --- | --- | --- | --- |
| Security automation | 5/5 | 5/5 | ✅ Red team + security review + CVE monitoring |
| Token cost visibility | 4/5 | 5/5 | ⚠️ Codex chain missing |
| Code quality automation | 3/5 | 5/5 | ❌ No quality review agent |
| Release automation | 4/5 | 5/5 | ⚠️ No migration guide / SBOM |
| Container security | 3/5 | 5/5 | ❌ No image scanner |
| CI observability | 4/5 | 5/5 | ⚠️ ci-doctor list incomplete |
| Issue lifecycle | 4/5 | 5/5 | ⚠️ No stale management |
| Cross-repo coordination | 5/5 | 5/5 | ✅ firewall-issue-dispatcher |
| Multi-engine coverage | 5/5 | 5/5 | ✅ All three engines covered |

Overall maturity: 4.3/5 — Elite tier. Two targeted gaps (Codex chain, container scanning) prevent a perfect score.

🔄 Best Practice Comparison

What This Repo Does Exceptionally Well

Hourly red team across all three engines: most repos run security tests weekly. Running Claude/Copilot/Codex every hour with offset crons is best-in-class.
Token analyzer→optimizer chaining — the workflow_run trigger pattern creating a cost-reduction feedback loop is a sophisticated and reusable pattern.
Shared import library — mcp-pagination.md, secret-audit.md, reporting.md, version-reporting.md demonstrate disciplined modularity.
skip-if-match guards — preventing duplicate open PRs/issues on recurring workflows is a sign of operational maturity.
Cache-memory for stateful agents — issue-duplication-detector.md using persistent cache for issue signatures is exactly the right use of that capability.

What To Improve

Codex token visibility is the only major asymmetry across engines.
ci-doctor coverage drifts as new workflows are added — consider automating this list from the filesystem rather than a hardcoded YAML list.
Container images are a blind spot — source code and npm deps are scanned; the Docker images themselves are not.

📝 Notes

Cache updated: repo-analysis-2026-04-08.json written to /tmp/gh-aw/cache-memory/. Next run should check for date change to detect repo drift. pelis-agent-factory-docs.txt was not available in the workflow environment this run (hash: no-pelis-docs-available-2026-04-08).

Generated by Pelis Agent Factory Advisor · Run ID: 24131466171

expires on Apr 15, 2026, 10:58 AM UTC

2026-04-15T12:56:52Z

github-actions[bot]
bot Apr 15, 2026
Author

This discussion was automatically closed because it expired on 2026-04-15T10:58:11.148Z.

Closed by Workflow
