[copilot-cli-research] Copilot CLI Deep Research - 2026-03-22 #22322
This discussion was automatically closed because it expired on 2026-03-23T21:23:28.507Z.
Analysis Date: 2026-03-22
Repository: github/gh-aw
Scope: 177 total workflows, ~84+ using Copilot engine (default engine when not specified)
Run: §23412651657
📊 Executive Summary
This is the first comprehensive audit of Copilot CLI feature adoption across all 177 agentic workflows in this repository. The findings reveal strong adoption of core features (safe-outputs at 96%, cache-memory at 32%) but significant under-utilization of advanced Copilot capabilities.
The most striking finding: 7 of 9 custom agent files (.github/agents/) are deployed but never used in any production workflow, including purpose-built agents like grumpy-reviewer, contribution-checker, and w3c-specification-writer. This represents immediate, zero-cost quality improvements waiting to be unlocked.
The second major finding: max-continuations (autopilot mode) is used in only 1 of 177 workflows, despite being the key mechanism for Copilot CLI to complete multi-step tasks reliably. Many complex code-editing and analysis workflows likely hit the single-turn limit unnecessarily.
🔴 High Priority Issues
1. max-continuations nearly unused (1/177 workflows)
The --autopilot / max-continuations feature allows Copilot to continue working autonomously across multiple turns. Only smoke-copilot.md uses it. Workflows like code-scanning-fixer, hourly-ci-cleaner, and any code-editing workflow likely fail or produce incomplete results due to hitting the single-turn limit.
2. 7 custom agent files deployed but unused in workflows
The .github/agents/ directory contains specialized agents (grumpy-reviewer, contribution-checker, w3c-specification-writer, create-safe-output-type, custom-engine-implementation, interactive-agent-designer) that are never referenced via engine.agent: in any workflow. Only technical-doc-writer (×2) and ci-cleaner (×1) are actively used.
🟡 Medium Priority Opportunities
1. No model optimization (only 7/177 workflows specify a model)
Most workflows default to the organization variable GH_AW_MODEL_AGENT_COPILOT. Simple/lightweight tasks (categorization, daily facts, trivial checks) could explicitly use model: gpt-5.1-codex-mini to reduce costs significantly.
2. mcp-scripts used in only 1 workflow
The mcp-scripts feature allows injecting custom tool scripts into the MCP server. Only security-review.md uses it. This powerful capability for extending agent tooling is essentially untapped.
3. Serena tool (code analysis) in only 6 workflows
Serena provides language-aware code analysis via MCP. Only 6 workflows use it, although it is well suited to any Go code quality, refactoring, or analysis workflow.
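A minimal sketch of enabling Serena for Go in a workflow's frontmatter. The serena value mirrors the inline form used in this report (serena: { languages: { go: {} } }); nesting it under a top-level tools: key is an assumption, and all other frontmatter keys are omitted.

```yaml
# Hypothetical frontmatter excerpt; only the serena mapping is taken from this report.
tools:            # assumed parent key for tool configuration
  serena:
    languages:
      go: {}      # empty mapping, assumed to mean "enable Go with default settings"
```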
1️⃣ Current State Analysis
View Copilot CLI Capabilities Inventory
Core Configuration Fields (engine block)
- engine: copilot
- engine.id: copilot
- engine.version (latest)
- engine.model (COPILOT_MODEL env var)
- engine.agent (.agent.md file, --agent flag)
- engine.args (--prompt)
- engine.env
- engine.command
- engine.api-target
- engine.max-continuations (--autopilot --max-autopilot-continues)
Available CLI Flags (auto-generated)
- --add-dir (/tmp/gh-aw/, $GITHUB_WORKSPACE, cache dirs)
- --log-level all
- --log-dir (/tmp/gh-aw/sandbox/agent/logs/)
- --disable-builtin-mcps
- --allow-tool
- --allow-all-tools
- --allow-all-paths
- --agent
- --autopilot
- --max-autopilot-continues
- --share
- --prompt
Available Tools & Integrations
- bash (* wildcard)
- edit (--allow-all-paths)
- github
- web-fetch (web_fetch)
- playwright
- serena
- cache-memory
- mcp-scripts
- agentic-workflows
Feature Flags
- copilot-requests: true (github.token instead of COPILOT_GITHUB_TOKEN)
- mcp-gateway: true
View Usage Statistics
Overall Feature Adoption
- safe-outputs
- timeout-minutes
- github tool
- edit tool
- network config
- cache-memory
- bash: ["*"] wildcard
- version pinning
- web-fetch
- sandbox
- copilot-requests feature
- serena
- playwright
- model selection
- engine.agent (custom)
- max-continuations
- mcp-scripts
- engine.api-target
Engine Distribution
Custom Agent Files Available vs. Used
- technical-doc-writer (used, ×2)
- ci-cleaner (used, ×1)
- agentic-workflows (unused)
- contribution-checker (unused)
- create-safe-output-type (unused)
- custom-engine-implementation (unused)
- grumpy-reviewer (unused)
- interactive-agent-designer (unused)
- w3c-specification-writer (unused)
2️⃣ Feature Usage Matrix
- --add-dir, --log-level, --disable-builtin-mcps, --allow-all-tools, --allow-all-paths, --share
- * wildcard
3️⃣ Missed Opportunities
🔴 High Priority Opportunities
Opportunity 1: Deploy Unused Custom Agent Files
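As a rough sketch of what this opportunity amounts to, a workflow would reference an agent by name in its engine block. This assumes agent nests under engine: (this report refers to the field as engine.agent) and that the value is the agent file name without the .agent.md suffix. The choice of grumpy-reviewer is illustrative, and other frontmatter keys are omitted.

```yaml
# Hypothetical excerpt for a code review workflow such as pr-nitpick-reviewer.md.
engine:
  id: copilot
  agent: grumpy-reviewer   # assumed to resolve to .github/agents/grumpy-reviewer.agent.md
```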
- Custom agent files are deployed in .github/agents/ but are never referenced in any workflow via engine.agent:
- Unused agents include grumpy-reviewer, contribution-checker, w3c-specification-writer, create-safe-output-type, custom-engine-implementation, interactive-agent-designer
- grumpy-reviewer.agent.md → code review workflows (pr-nitpick-reviewer.md, code-simplifier.md, semantic-function-refactor.md)
- contribution-checker.agent.md → PR triage/review workflows (contribution-check.md)
- w3c-specification-writer.agent.md → documentation workflows
Opportunity 2: Enable max-continuations for Complex Workflows
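A minimal sketch of enabling continuations, assuming max-continuations nests under the engine: block (this report names the field engine.max-continuations). The value 3 is illustrative, and other frontmatter keys are omitted.

```yaml
# Hypothetical excerpt for a multi-step workflow such as code-scanning-fixer.md.
engine:
  id: copilot
  max-continuations: 3   # per this report: up to 4 total turns
```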
- Only smoke-copilot.md uses max-continuations (set to 2). Multi-step code editing and analysis tasks benefit greatly from autonomous continuation.
- max-continuations: 3 means up to 4 total turns.
- code-scanning-fixer.md: fixing security issues requires multiple read/edit cycles
- hourly-ci-cleaner.md: already notes "15-20 turns for typical CI fixes" but can't use max-turns (Claude only)
- semantic-function-refactor.md: multi-file refactoring
- repository-quality-improver.md: batch improvements
- code-simplifier.md: large codebase simplification
🟡 Medium Priority Opportunities
Opportunity 3: Model Selection for Cost Optimization
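A sketch of pinning a lighter model for a simple task, assuming model nests under engine: (this report lists the field as engine.model). Whether gpt-5.1-codex-mini is adequate would need to be checked per workflow.

```yaml
# Hypothetical excerpt for a lightweight workflow such as daily-issues-report.md.
engine:
  id: copilot
  model: gpt-5.1-codex-mini   # set explicitly instead of relying on the organization default
```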
- Only 7 workflows specify model:. The remaining 170 use the organization default.
- Lighter models (such as gpt-5.1-codex-mini) are significantly cheaper for simple classification, tagging, and analysis tasks.
- changeset.md, ci-doctor.md, daily-fact.md, and smoke-call-workflow.md all use gpt-5.1-codex-mini
- Candidates: auto-triage-issues.md, pr-triage-agent.md, ai-moderator.md
- daily-issues-report.md (following the daily-fact.md pattern)
Opportunity 4: Expand mcp-scripts Adoption
- mcp-scripts allows injecting custom Node.js scripts as MCP server tools. Only security-review.md uses it.
Opportunity 5: Add Serena to Go Code Analysis Workflows
- semantic-function-refactor.md: semantic refactoring needs type info
- daily-function-namer.md: naming requires understanding type signatures
- code-simplifier.md: simplification requires understanding dependencies
- cli-consistency-checker.md: needs to understand Go interfaces
Opportunity 6: Increase copilot-requests Feature Adoption
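A sketch of the feature flag, taken from the dotted form features.copilot-requests: true used in this report. Its placement as a top-level features: mapping is inferred from that dotted path.

```yaml
# Hypothetical frontmatter excerpt; other keys omitted.
features:
  copilot-requests: true   # per this report: use github.token rather than the COPILOT_GITHUB_TOKEN secret
```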
- features.copilot-requests: true, which simplifies authentication by using github.token instead of requiring the COPILOT_GITHUB_TOKEN secret
- github.token scope
🟢 Low Priority Opportunities
Opportunity 7: Specific GitHub MCP Toolsets for Minimal Permission Principle
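A sketch of a narrowly scoped configuration for a workflow that only touches issues. The toolsets list form is the one quoted in this report; nesting github under a top-level tools: key is an assumption.

```yaml
# Hypothetical frontmatter excerpt for an issue-only workflow.
tools:                   # assumed parent key
  github:
    toolsets: [issues]   # instead of [default] or [all]
```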
- Some workflows use [default] or [all] when they only need specific tools
- code-scanning-fixer.md uses toolsets: [context, repos, code_security, pull_requests]; this is the ideal pattern
- toolsets: [issues], not [default]
- write_file or create_issue permissions
Opportunity 8: Custom env Variables for Workflow Tuning
- engine.env allows passing custom environment variables that can tune behavior without modifying the workflow prompt
- engine.env for Copilot (the field exists but is rarely used)
4️⃣ Specific Workflow Recommendations
View Workflow-Specific Recommendations
code-scanning-fixer.md (230 lines)
- Current: does not set max-continuations
- Recommended: max-continuations: 3, since code scanning fixes are multi-file and multi-step
hourly-ci-cleaner.md (complex)
- Current: uses the ci-cleaner agent (✅ good!), targets 15-20 turns but lacks autopilot
- Recommended: max-continuations: 4; the workflow explicitly comments "15-20 turns for typical CI fixes"
- max-continuations is the Copilot equivalent of max-turns (Claude only)
contribution-check.md (188 lines)
- Current: github: { toolsets: [default] }, no custom agent
- Recommended: engine.agent: contribution-checker; a purpose-built agent exists
pr-nitpick-reviewer.md
- Recommended: engine.agent: grumpy-reviewer; a purpose-built agent exists with a "40+ years seniority" persona
semantic-function-refactor.md (407 lines as code-simplifier)
- Recommended: serena: { languages: { go: {} } } + max-continuations: 3
daily-issues-report.md
- Recommended: model: gpt-5.1-codex-mini; daily summaries don't need a premium model
auto-triage-issues.md (267 lines)
- Recommended: model: gpt-5.1-codex-mini + max-continuations: 2 for batch issue processing
5️⃣ Trends & Insights
View Historical Context
This is the first comprehensive analysis of Copilot CLI usage in this repository. Future analyses (stored in the memory/copilot-cli-research branch) will track trends.
Baseline metrics established (2026-03-22):
Feature maturity assessment:
6️⃣ Best Practice Guidelines
Based on this research:
1. Match model to task complexity: Use model: gpt-5.1-codex-mini for lightweight classification, tagging, or simple analysis. Reserve default (powerful) models for complex reasoning, code generation, and multi-file refactoring.
2. Enable autopilot for multi-step tasks: Any workflow that involves reading multiple files, making sequential edits, or running iterative analysis should use max-continuations: 2 or higher. Single-turn is appropriate only for truly atomic tasks.
3. Leverage custom agent files: The .github/agents/ directory is a productivity multiplier. Matching workflows to purpose-built agents (grumpy-reviewer for code review, contribution-checker for PRs) can improve consistency and quality.
4. Scope GitHub toolsets narrowly: Follow the example of code-scanning-fixer.md and specify only the toolsets your workflow actually needs. This limits attack surface and reduces unnecessary API calls.
5. Add serena to Go workflows: Any workflow doing Go code analysis, refactoring, or quality checks should include serena: { languages: { go: {} } } for language-aware understanding.
7️⃣ Action Items
Immediate Actions (this week):
- Add contribution-checker.agent.md to the contribution-check.md workflow
- Add grumpy-reviewer.agent.md to pr-nitpick-reviewer.md
- Add max-continuations: 3 to code-scanning-fixer.md
- Add max-continuations: 4 to hourly-ci-cleaner.md
Short-term (this month):
- Review model: gpt-5.1-codex-mini opportunities
- Add serena to Go analysis workflows (semantic-function-refactor, daily-function-namer, go-logger)
- (pr-triage.agent.md)
Long-term (this quarter):
View Research Methodology & Evidence
Research Methodology
Phase 1 - Capabilities Inventory: Reviewed pkg/workflow/copilot_engine.go, copilot_engine_execution.go, copilot_engine_tools.go, copilot_mcp.go, and docs/src/content/docs/reference/engines.md to build a complete feature inventory.
Phase 2 - Usage Analysis: Used grep across all 177 .github/workflows/*.md files to count feature adoption. Manually inspected sample workflows to understand patterns.
Phase 3 - Gap Analysis: Cross-referenced available features against usage counts; examined the .github/agents/ directory for deployment status.
Phase 4 - Recommendations: Prioritized opportunities by impact × adoption gap.
Data Sources:
- pkg/workflow/copilot_engine_execution.go: CLI flags and engine behavior
- pkg/workflow/copilot_engine_tools.go: tool permission system
- .github/agents/*.agent.md: available custom agents (9 files)
- .github/workflows/*.md: all 177 workflow configurations
- CHANGELOG.md: feature history
- docs/src/content/docs/reference/engines.md: official documentation
Persistence: Analysis saved to the memory/copilot-cli-research branch for trend tracking in future runs.
References: