[Pelis Agent Factory Advisor] Agentic Workflow Maturity Analysis & Recommendations (April 2026) #1743
This discussion was automatically closed because it expired on 2026-04-14T10:57:45.865Z.
📊 Executive Summary
This repository is an exceptionally mature agentic workflow operator: with 27 agentic .md workflow definitions and 18 traditional GitHub Actions workflows, it is the dogfood platform for AWF itself. Automation coverage is comprehensive across security, testing, documentation, cost management, and issue lifecycle. The primary opportunities lie in filling a few specific gaps (Codex cost visibility, container image scanning) and adding intelligence layers on top of existing automation (performance regression detection, PR code quality review).
🎓 Patterns Learned & Applied
The following patterns from the Pelis Agent Factory were observed and applied to this analysis:
workflow_run trigger links analyzer to optimizer
imports: for reusable fragments (mcp-pagination.md, etc.)
issue-duplication-detector, security-review
firewall-issue-dispatcher
issue-monster
ci-doctor
/plan command on issues/discussions
plan.md
agentics-maintenance.yml
📋 Workflow Inventory
Agentic Workflows (27 total)
build-test
ci-cd-gaps-assessment
ci-doctor: workflow_run, failure
claude-token-usage-analyzer: shared/reporting.md
claude-token-optimizer: workflow_run
cli-flag-consistency-checker
copilot-token-usage-analyzer
copilot-token-optimizer
dependency-security-monitor: expires
doc-maintainer
firewall-issue-dispatcher: awf issue tracking
issue-duplication-detector: issues.opened
issue-monster: issues.opened
pelis-agent-factory-advisor
plan: /plan
secret-digger-claude
secret-digger-codex
secret-digger-copilot
security-guard
security-review
smoke-chroot
smoke-claude
smoke-codex
smoke-copilot
smoke-services: --allow-host-service-ports
test-coverage-improver
update-release-notes: release.published
Standard (Non-Agentic) Workflows (18 total)
build.yml, codeql.yml, dependency-audit.yml, deploy-docs.yml, docs-preview.yml, link-check.yml, lint.yml, performance-monitor.yml, pr-title.yml, release.yml, test-action.yml, test-chroot.yml, test-coverage.yml, test-examples.yml, test-integration-suite.yml, test-integration.yml, agentics-maintenance.yml, copilot-setup-steps.yml
🚀 Recommendations
P0 — High Impact, Low Effort (Quick Wins)
1. Codex Token Usage Analyzer + Optimizer
What: Add codex-token-usage-analyzer.md and codex-token-optimizer.md, mirroring the Claude/Copilot pair.
Why: Secret diggers run 3 Codex agents hourly, which is ~72 Codex runs/day (3 × 24). Without cost visibility, Codex spend is a blind spot while Claude and Copilot are fully instrumented.
How: Copy copilot-token-usage-analyzer.md, change the engine filter to codex, and adjust labels. Chain with an optimizer via workflow_run. The shared/reporting.md import is already reusable.
Effort: Low (~30 min), straightforward template adaptation.
Example frontmatter:
2. Performance Regression Detector (Agentic Layer)
What: Add performance-regression-detector.md that triggers after performance-monitor.yml completes, reads benchmark results, and creates issues when regressions exceed a threshold.
Why: performance-monitor.yml runs benchmarks weekly but produces raw JSON; no intelligence layer detects regressions or alerts maintainers. The CI Doctor pattern (trigger on workflow_run) is already proven here.
How: Trigger on workflow_run: [Performance Monitor], download the benchmark-results artifact, compare to the cached baseline, and file issues on ≥10% regression.
Effort: Low-Medium. The workflow_run + artifact read pattern is already used in the token analyzers.
P1 — High Impact, Medium Effort
3. Container Image Security Scanner
What: A workflow using Trivy or Grype to scan the three AWF Docker images (squid, agent, api-proxy) published to GHCR for OS-level CVEs in container layers.
Why: CodeQL covers TypeScript/JS source, and dependency-security-monitor covers npm packages, but container image layers (Ubuntu 22.04 base, Squid packages, Node runtime) are not scanned. For a security product that ships container images, this is a meaningful gap: CVEs in base images won't appear in npm audit or CodeQL.
How: Use workflow_run: [Release] or a weekly schedule. Pull images from GHCR, run Trivy in SARIF mode, and upload the results to the GitHub Security tab. Alternatively, create issues for CRITICAL/HIGH findings.
Effort: Medium. Requires authenticated GHCR pull, Trivy setup, and SARIF upload or issue creation.
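For the issue-creation alternative, the filtering step might look like the sketch below. It assumes the scanner is run with JSON output and that the report follows the Results / Vulnerabilities layout of Trivy's JSON format; the function name, the returned dictionary keys, and the sample data (including the CVE IDs) are hypothetical placeholders, not part of AWF.

```python
from typing import Any


def blocking_findings(
    report: dict[str, Any],
    severities: tuple[str, ...] = ("CRITICAL", "HIGH"),
) -> list[dict[str, str]]:
    """Collect vulnerabilities at the given severities from a Trivy-style JSON report."""
    findings = []
    for result in report.get("Results") or []:
        # "Vulnerabilities" may be missing or null when a target is clean.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in severities:
                findings.append({
                    "target": result.get("Target", ""),
                    "id": vuln.get("VulnerabilityID", ""),
                    "package": vuln.get("PkgName", ""),
                    "severity": vuln["Severity"],
                })
    return findings


# Placeholder report; the IDs and package names are made up for illustration.
sample_report = {
    "Results": [
        {"Target": "squid", "Vulnerabilities": [
            {"VulnerabilityID": "CVE-0000-0001", "PkgName": "openssl", "Severity": "HIGH"},
            {"VulnerabilityID": "CVE-0000-0002", "PkgName": "zlib", "Severity": "LOW"},
        ]},
        {"Target": "agent", "Vulnerabilities": None},
    ]
}

for finding in blocking_findings(sample_report):
    print(f"{finding['severity']}: {finding['id']} in {finding['package']} ({finding['target']})")
```

Each returned entry could back one issue, or the whole list could be folded into a single summary issue per image to limit noise.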
4. PR Code Quality Review Agent
What: A general-purpose code quality review agent that runs on PRs alongside security-guard, focusing on correctness, maintainability, and TypeScript patterns, not just security.
Why: security-guard (Claude) reviews security boundaries exclusively. build-test (Copilot) validates that tests pass. Neither reviews code quality: complexity, test coverage of new paths, TypeScript antipatterns, or architectural consistency. The reviewer gap is especially notable given this repo's critical security posture.
How: PR trigger with pull_request: [opened, synchronize], Claude engine for nuanced reasoning, limited to add-comment with a maximum of 1 to avoid noise. Use skip-if-match to avoid running on trivial/docs-only PRs.
Effort: Medium. Needs prompt engineering to scope the review to non-security quality concerns without overlapping security-guard.
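What counts as a docs-only PR needs a concrete definition before it can be skipped. The sketch below is one possible definition and does not show how skip-if-match itself is configured. DOC_DIRS, DOC_SUFFIXES, and the function names are hypothetical, and paths under .github are deliberately never treated as documentation because the agentic workflow definitions in this repository are themselves .md files.

```python
from pathlib import PurePosixPath

# Hypothetical definition of "documentation"; adjust to the repository layout.
DOC_DIRS = ("docs",)
DOC_SUFFIXES = (".md", ".mdx", ".txt")


def _is_doc(path: PurePosixPath) -> bool:
    top = path.parts[0] if path.parts else ""
    if top == ".github":
        # Workflow definitions are .md files but change behavior, so review them.
        return False
    return top in DOC_DIRS or path.suffix.lower() in DOC_SUFFIXES


def is_docs_only(changed_files: list[str]) -> bool:
    """Return True when every changed path counts as documentation."""
    if not changed_files:
        return False  # an empty change list is not treated as docs-only
    return all(_is_doc(PurePosixPath(name)) for name in changed_files)


print(is_docs_only(["README.md", "docs/usage.md"]))  # True
print(is_docs_only(["README.md", "src/cli.ts"]))     # False
```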
5. Stale Issue Manager
What: A weekly agentic workflow that identifies issues with no activity for 30+ days, posts contextual follow-up questions (not just "is this still relevant?"), and applies stale labels.
Why: agentics-maintenance.yml handles expires-tagged agent-created entities, but human-filed issues with no expires field can accumulate indefinitely. The issue-monster assigns issues, but if agents can't reproduce or clarify, issues stall silently.
How: Weekly schedule, github.toolsets: [issues]. Fetch issues with no activity for more than 30 days, generate context-aware follow-up questions based on the issue body, post a comment, and apply the stale label. Use skip-if-match to avoid running when too many stale issues already have pending comments.
Effort: Medium. Requires a careful prompt to generate useful (not generic) follow-up questions.
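The selection step could be sketched as follows. The issue dictionaries are a simplified, hypothetical shape (number, updated_at as an ISO 8601 string ending in Z, labels as a list of names) rather than the exact payload the GitHub toolset returns, and the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value: str) -> datetime:
    # Replace the trailing "Z" so older Python versions can parse the offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_stale_issues(issues, now, max_idle_days=30, stale_label="stale"):
    """Return numbers of issues idle for at least max_idle_days and not yet labeled stale."""
    cutoff = now - timedelta(days=max_idle_days)
    stale = []
    for issue in issues:
        if stale_label in issue.get("labels", []):
            continue  # already flagged on an earlier run
        if parse_timestamp(issue["updated_at"]) <= cutoff:
            stale.append(issue["number"])
    return stale


now = datetime(2026, 4, 7, tzinfo=timezone.utc)
issues = [
    {"number": 1, "updated_at": "2026-02-01T12:00:00Z", "labels": []},
    {"number": 2, "updated_at": "2026-04-01T09:30:00Z", "labels": ["bug"]},
    {"number": 3, "updated_at": "2026-01-01T00:00:00Z", "labels": ["stale"]},
]
print(find_stale_issues(issues, now))  # [1]
```

Skipping issues that already carry the stale label keeps the agent from posting a second follow-up on the same issue each week.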
P2 — Medium Impact
6. Firewall Domain Whitelist Auditor
What: Monthly agent that audits domain whitelists in smoke test configurations and the --allow-domains examples in docs/README, verifying domains are reachable, still needed, and not overly permissive.
Why: As this codebase evolves, domain allowlists in smoke test .md files may include domains that are no longer needed, have moved, or have become overly broad (e.g., wildcard domains). A security-focused repo should continuously validate its own examples.
Effort: Low-Medium. Bash DNS checks plus GitHub content reads.
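The reachability and breadth checks could be sketched as below. The wildcard notation (a leading "*." or ".") is an assumption about how allowlist entries are written, the function names are hypothetical, and whether a domain is still needed cannot be decided mechanically, so that judgment stays with the agent. The resolver is injectable so the logic can be exercised without network access.

```python
import socket


def resolves(domain: str) -> bool:
    """Default probe: does the name resolve in DNS?"""
    try:
        socket.getaddrinfo(domain, 443)
        return True
    except socket.gaierror:
        return False


def audit_allowlist(domains, resolve=resolves):
    """Return (entry, reason) pairs for allowlist entries that deserve a second look."""
    findings = []
    for domain in domains:
        name = domain.strip().lower()
        if name.startswith("*.") or name.startswith("."):
            # Assumed wildcard notation; such patterns cannot be resolved directly.
            findings.append((domain, "wildcard entry"))
            continue
        if "." not in name:
            findings.append((domain, "bare TLD or single-label name"))
            continue
        if not resolve(name):
            findings.append((domain, "does not resolve"))
    return findings


# Offline usage with a stub resolver; the domains are placeholders.
stub = lambda name: name != "gone.example.com"
for entry, reason in audit_allowlist(
    ["api.example.com", "*.example.com", "gone.example.com"], resolve=stub
):
    print(f"{entry}: {reason}")
```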
7. Breaking Change Detector
What: A PR-triggered agent that detects potentially breaking changes to the public CLI interface (src/cli.ts flag additions/removals/renames) and the Docker Compose API generated by src/docker-manager.ts, and adds a warning comment.
Why: AWF is consumed by other tools (gh-aw extension, CI pipelines). Unintentional breaking changes to CLI flags or Docker Compose structure could silently break consumers. security-guard doesn't cover this angle.
Effort: Medium. Requires understanding the semver impact from diff analysis.
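A first-pass signal for the CLI half of this check can come from comparing the set of long flags in the base and head versions of src/cli.ts. The sketch below assumes flags appear as literal --name strings in that file. It will also pick up flag-like strings in comments or in arguments passed to other programs, so the agent should treat its output as candidates rather than verdicts. The function names and the example flags --old-flag and --new-flag are hypothetical.

```python
import re

# A long flag: "--" followed by a lowercase name, not preceded by a word character or "-".
FLAG_PATTERN = re.compile(r"(?<![\w-])--[a-z][a-z0-9-]*")


def extract_flags(source: str) -> set[str]:
    return set(FLAG_PATTERN.findall(source))


def diff_cli_flags(base_source: str, head_source: str) -> dict[str, list[str]]:
    """Compare flag sets; a rename shows up as one removal plus one addition."""
    base, head = extract_flags(base_source), extract_flags(head_source)
    return {"removed": sorted(base - head), "added": sorted(head - base)}


def is_potentially_breaking(diff: dict[str, list[str]]) -> bool:
    # Additions are backward compatible; removals (and renames) are not.
    return bool(diff["removed"])


base = ".option('--allow-domains <domains>')\n.option('--old-flag')"
head = ".option('--allow-domains <domains>')\n.option('--new-flag')"
diff = diff_cli_flags(base, head)
print(diff)                           # {'removed': ['--old-flag'], 'added': ['--new-flag']}
print(is_potentially_breaking(diff))  # True
```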
8. Issue Triage Enhancer
What: Complement issue-monster with a pre-assignment triage step that labels issues by category (bug/feature/docs/security), estimates complexity, and asks clarifying questions before Copilot picks them up.
Why: issue-monster assigns issues directly. Better triage before assignment means Copilot agents get better-scoped work items, reducing wasted agent turns.
Effort: Medium. Two-phase pipeline that needs coordination with issue-monster via labels.
P3 — Nice to Have
9. AWF API Contract Drift Detector
What: Weekly check that src/types.ts interfaces haven't changed in ways that break the published API contract documented in docs, creating issues when drift is detected.
10. Contributor Onboarding Assistant
What: Triggered by pull_request from first-time contributors, explains relevant code patterns and points to CONTRIBUTING.md sections most relevant to their changes.
📈 Maturity Assessment
Overall maturity: 4.5/5 — One of the most comprehensively automated repositories in the AWF ecosystem. The gap is narrow but targeted at a security product's most critical blind spots.
🔄 Best Practice Comparison
What This Repo Does Exceptionally Well
:00, :05, :10) is a sophisticated comparative testing pattern
shared/ directory with mcp-pagination.md, reporting.md, secret-audit.md and version-reporting.md enables DRY workflow authoring
firewall-issue-dispatcher integrating gh-aw ↔ gh-aw-firewall is an advanced pattern
What to Improve
📝 Notes & Tracking
Cache updated: /tmp/gh-aw/cache-memory/repo-analysis-2026-04-07.json with workflow inventory and gap analysis.
Items to track on next run:
Has codex-token-usage-analyzer been added? (P0)
Has performance-monitor.yml been upgraded with a regression detector? (P0)
Has the ci-doctor monitored workflow list been updated to include any new workflows added since the last analysis?
Run ID: 24077494180 | Date: 2026-04-07