[copilot-cli-research] Copilot CLI Deep Research - 2026-03-22 #22322

2026-03-22T21:23:28Z

github-actions[bot]
bot Mar 22, 2026

Analysis Date: 2026-03-22
Repository: github/gh-aw
Scope: 177 total workflows, ~84+ using Copilot engine (default engine when not specified)
Run: §23412651657

📊 Executive Summary

This is the first comprehensive audit of Copilot CLI feature adoption across all 177 agentic workflows in this repository. The findings reveal strong adoption of core features (safe-outputs at 96%, cache-memory at 32%) but significant under-utilization of advanced Copilot capabilities.

The most striking finding: 7 of 9 custom agent files (.github/agents/) are deployed but never used in any production workflow — including purpose-built agents like grumpy-reviewer, contribution-checker, and w3c-specification-writer. This represents immediate, zero-cost quality improvements waiting to be unlocked.

The second major finding: max-continuations (autopilot mode) is used in only 1 of 177 workflows, despite being the key mechanism for Copilot CLI to complete multi-step tasks reliably. Many complex code-editing and analysis workflows likely hit the single-turn limit unnecessarily.

🔴 High Priority Issues

1. max-continuations nearly unused (1/177 workflows)
The --autopilot / max-continuations feature allows Copilot to continue working autonomously across multiple turns. Only smoke-copilot.md uses it. Workflows like code-scanning-fixer, hourly-ci-cleaner, and any code-editing workflow likely fail or produce incomplete results due to hitting the single-turn limit.

2. 7 custom agent files deployed but unused in workflows
The .github/agents/ directory contains specialized agents — grumpy-reviewer, contribution-checker, w3c-specification-writer, create-safe-output-type, custom-engine-implementation, interactive-agent-designer — that are never referenced via engine.agent: in any workflow. Only technical-doc-writer (×2) and ci-cleaner (×1) are actively used.

🟡 Medium Priority Opportunities

1. No model optimization (only 7/177 workflows specify a model)
Most workflows default to the organization variable GH_AW_MODEL_AGENT_COPILOT. Simple/lightweight tasks (categorization, daily facts, trivial checks) could explicitly use model: gpt-5.1-codex-mini to reduce costs significantly.

2. mcp-scripts used in only 1 workflow
The mcp-scripts feature allows injecting custom tool scripts into the MCP server. Only security-review.md uses it. This powerful capability for extending agent tooling is essentially untapped.

3. Serena tool (code analysis) in only 6 workflows
Serena provides language-aware code analysis via MCP. Only 6 workflows use it despite being ideal for any Go code quality, refactoring, or analysis workflow.

1️⃣ Current State Analysis

View Copilot CLI Capabilities Inventory

Core Configuration Fields (engine block)

Field	Description	Implementation Status
`engine: copilot`	Select Copilot as engine	✅ Default engine
`engine.id: copilot`	Extended config format	✅ Supported
`engine.version`	Pin Copilot CLI version	✅ Supported (default: `latest`)
`engine.model`	Select AI model	✅ Supported, maps to `COPILOT_MODEL` env var
`engine.agent`	Use custom `.agent.md` file	✅ Supported via `--agent` flag
`engine.args`	Custom CLI arguments	✅ Injected before `--prompt`
`engine.env`	Custom environment variables	✅ Supported
`engine.command`	Custom binary path	✅ Supported
`engine.api-target`	Custom API endpoint	✅ Supported (GHE/GHES)
`engine.max-continuations`	Autopilot mode turns	✅ Maps to `--autopilot --max-autopilot-continues`

Available CLI Flags (auto-generated)

Flag	When Added	Notes
`--add-dir`	Always	`/tmp/gh-aw/`, `$GITHUB_WORKSPACE`, cache dirs
`--log-level all`	Always	Full diagnostic logging
`--log-dir`	Always	`/tmp/gh-aw/sandbox/agent/logs/`
`--disable-builtin-mcps`	Always	Disable default MCP servers
`--allow-tool`	Per tool config	Granular tool permissions
`--allow-all-tools`	bash: ["*"]	Wildcard bash permission
`--allow-all-paths`	edit tool	Write permission for all paths
`--agent`	engine.agent set	Custom agent file
`--autopilot`	max-continuations > 1	Multi-turn autonomous mode
`--max-autopilot-continues`	max-continuations > 1	Turn limit
`--share`	Sandbox mode	Saves conversation to markdown
`--prompt`	Always	The workflow prompt

Available Tools & Integrations

Tool	Purpose
`bash`	Shell commands (specific list or `*` wildcard)
`edit`	File write access (`--allow-all-paths`)
`github`	GitHub MCP server (local or remote mode)
`web-fetch`	Built-in web fetching (`web_fetch`)
`playwright`	Browser automation
`serena`	Language-aware code analysis
`cache-memory`	Persistent cross-run storage
`mcp-scripts`	Custom MCP server scripts
`agentic-workflows`	AWF workflow tools

Feature Flags

Feature Flag	Effect
`copilot-requests: true`	Uses `github.token` instead of `COPILOT_GITHUB_TOKEN`
`mcp-gateway: true`	Enables MCP gateway

View Usage Statistics

Overall Feature Adoption

Feature	Used in	Total Workflows	Adoption Rate
`safe-outputs`	170	177	96%
`timeout-minutes`	169	177	95%
`github` tool	127	177	72%
`edit` tool	79	177	45%
`network` config	86	177	49%
`cache-memory`	57	177	32%
`bash: ["*"]` wildcard	57	177	32%
`version` pinning	19	177	11%
`web-fetch`	17	177	10%
`sandbox`	18	177	10%
`copilot-requests` feature	41	177	23%
`serena`	6	177	3%
`playwright`	12	177	7%
`model` selection	7	177	4%
Extended engine config	9	177	5%
`engine.agent` (custom)	3	177	1.7%
`max-continuations`	1	177	0.6%
`mcp-scripts`	1	177	0.6%
`engine.api-target`	0	177	0%

Engine Distribution

Engine	Count	%
Copilot (explicit + default)	~84+	~47%+
Claude	34	19%
Codex	17	10%
Gemini	0	0%

Custom Agent Files Available vs. Used

Agent File	Used In Workflows
`technical-doc-writer`	2 ✅
`ci-cleaner`	1 ✅
`agentic-workflows`	1 ✅
`contribution-checker`	0 ❌
`create-safe-output-type`	0 ❌
`custom-engine-implementation`	0 ❌
`grumpy-reviewer`	0 ❌
`interactive-agent-designer`	0 ❌
`w3c-specification-writer`	0 ❌

2️⃣ Feature Usage Matrix

Feature Category	Available	Used	Not Used	Adoption
CLI Core	`--add-dir`, `--log-level`, `--disable-builtin-mcps`, `--allow-all-tools`, `--allow-all-paths`, `--share`	All (auto)	-	100% (auto)
Engine Config	version, model, args, env, command, agent, api-target, max-continuations	model(4%), agent(2%), max-cont(0.6%)	api-target, command	~5%
Tool: bash	Specific commands or `*` wildcard	57 wildcard + specific	-	High
Tool: github	local/remote, toolsets, allowed list	127 workflows	-	72%
Tool: serena	language-aware analysis	6 workflows	171	3%
Tool: mcp-scripts	custom MCP scripts	1 workflow	176	0.6%
Feature Flags	copilot-requests, mcp-gateway	copilot-requests: 41	mcp-gateway: rare	23%
Custom Agents	9 agent files	3 workflows use	7 unused	33% of files

3️⃣ Missed Opportunities

🔴 High Priority Opportunities

Opportunity 1: Deploy Unused Custom Agent Files

What: 7 specialized agent files exist in .github/agents/ but are never referenced in any workflow via engine.agent:
Why It Matters: Custom agents carry domain expertise as system prompts, improving output quality without any prompt engineering in the workflow itself
Unused agents: grumpy-reviewer, contribution-checker, w3c-specification-writer, create-safe-output-type, custom-engine-implementation, interactive-agent-designer
Where to Apply:
- grumpy-reviewer.agent.md → code review workflows (pr-nitpick-reviewer.md, code-simplifier.md, semantic-function-refactor.md)
- contribution-checker.agent.md → PR triage/review workflows (contribution-check.md)
- w3c-specification-writer.agent.md → documentation workflows

How to Implement:

engine:
  id: copilot
  agent: grumpy-reviewer   # references .github/agents/grumpy-reviewer.agent.md

Expected Benefit: Higher quality outputs aligned with specific personas/expertise

Opportunity 2: Enable max-continuations for Complex Workflows

What: Only smoke-copilot.md uses max-continuations: 2. Multi-step code editing and analysis tasks benefit greatly from autonomous continuation.
Why It Matters: Without autopilot, Copilot stops after one turn, leaving large tasks incomplete. max-continuations: 3 means up to 4 total turns.
Where to Apply:
- code-scanning-fixer.md — fixing security issues requires multiple read/edit cycles
- hourly-ci-cleaner.md — already notes "15-20 turns for typical CI fixes" but can't use max-turns (Claude only)
- semantic-function-refactor.md — multi-file refactoring
- repository-quality-improver.md — batch improvements
- code-simplifier.md — large codebase simplification

How to Implement:

engine:
  id: copilot
  max-continuations: 3   # allows up to 4 total turns

Expected Benefit: Task completion rates improve dramatically for multi-step workflows

🟡 Medium Priority Opportunities

Opportunity 3: Model Selection for Cost Optimization

What: Only 7 workflows explicitly set model:. The remaining 170 use the organization default.
Why It Matters: Lightweight models (gpt-5.1-codex-mini) are significantly cheaper for simple classification, tagging, and analysis tasks.
Pattern Already Established: changeset.md, ci-doctor.md, daily-fact.md, smoke-call-workflow.md all use gpt-5.1-codex-mini
Where to Apply:
- Simple classification/tagging: auto-triage-issues.md, pr-triage-agent.md, ai-moderator.md
- Daily low-complexity reports: daily-issues-report.md, daily-fact.md pattern
- Short analysis tasks with limited scope

How to Implement:

# For lightweight tasks
engine:
  id: copilot
  model: gpt-5.1-codex-mini

# For complex reasoning tasks (keep default or use premium)
# engine: copilot  (no model = org default)

Expected Benefit: Cost reduction of 60-80% for lightweight task categories

Opportunity 4: Expand mcp-scripts Adoption

What: mcp-scripts allows injecting custom Node.js scripts as MCP server tools. Only security-review.md uses it.
Why It Matters: Enables stateful custom tools without external server infrastructure — ideal for domain-specific data processing
Where to Apply:
- Workflows that currently use complex bash scripts for data processing
- Workflows needing structured GitHub API queries
- Workflows with repeated tool patterns that could be encapsulated

How to Implement:

mcp-scripts:
  scripts:
    - name: analyze-pr-diff
      path: .github/scripts/analyze-pr.js

Opportunity 5: Add Serena to Go Code Analysis Workflows

What: Only 6 workflows use Serena despite this being a Go repository with extensive Go code
Why It Matters: Serena provides language-server-protocol-based code understanding (type info, references, call graphs) that dramatically improves Go analysis quality
Where to Apply:
- semantic-function-refactor.md — semantic refactoring needs type info
- daily-function-namer.md — naming requires understanding type signatures
- code-simplifier.md — simplification requires understanding dependencies
- cli-consistency-checker.md — needs to understand Go interfaces

How to Implement:

tools:
  serena:
    languages:
      go: {}

Opportunity 6: Increase copilot-requests Feature Adoption

What: Only 41/177 workflows (23%) use features.copilot-requests: true, which simplifies authentication by using github.token instead of requiring COPILOT_GITHUB_TOKEN secret
Why It Matters: Simpler auth setup, no separate secret management, works with standard GitHub Actions token
Where to Apply: Any workflow that doesn't need special Copilot permissions beyond standard github.token scope
How to Implement:
```
features:
  copilot-requests: true
```

🟢 Low Priority Opportunities

Opportunity 7: Specific GitHub MCP Toolsets for Minimal Permission Principle

What: Many workflows use broad toolsets like [default] or [all] when they only need specific tools
Why It Matters: Principle of least privilege — limit what tools the agent can call to its actual task needs
Pattern Already Good: code-scanning-fixer.md uses toolsets: [context, repos, code_security, pull_requests] — this is the ideal pattern
Where to Tighten:
- Workflows only reading issues should use toolsets: [issues] not [default]
- Read-only analysis workflows should avoid write_file or create_issue permissions

Opportunity 8: Custom env Variables for Workflow Tuning

What: engine.env allows passing custom environment variables that can tune behavior without modifying the workflow prompt
Why It Matters: Enables runtime configuration without recompilation (e.g., DEBUG flags, feature gates)
Current Usage: ~0 workflows use engine.env for Copilot (the field exists but is rarely used)

How to Implement:

engine:
  id: copilot
  env:
    WORKFLOW_DEBUG: "true"
    ANALYSIS_DEPTH: "deep"

4️⃣ Specific Workflow Recommendations

View Workflow-Specific Recommendations

`code-scanning-fixer.md` (230 lines)

Current: Uses GitHub toolsets well, but no max-continuations
Recommended: Add max-continuations: 3 — code scanning fixes are multi-file, multi-step
Expected: Fewer incomplete fixes

`hourly-ci-cleaner.md` (complex)

Current: Uses ci-cleaner agent (✅ good!), targets 15-20 turns but lacks autopilot
Recommended: Add max-continuations: 4 — the workflow explicitly comments "15-20 turns for typical CI fixes"
Caution: Note says "max-turns not available for Copilot engine (Claude only)" — max-continuations is the Copilot equivalent

`contribution-check.md` (188 lines)

Current: Uses github: { toolsets: [default] }, no custom agent
Recommended: Add engine.agent: contribution-checker — purpose-built agent exists!
Expected: More thorough, consistent contribution assessments

`pr-nitpick-reviewer.md`

Current: Standard config
Recommended: Add engine.agent: grumpy-reviewer — purpose-built agent exists with 40+ years seniority persona
Expected: Higher quality, more consistent review tone

`semantic-function-refactor.md` (407 lines as code-simplifier)

Current: No serena, no max-continuations
Recommended: Add serena: { languages: { go: {} } } + max-continuations: 3
Expected: Semantically-aware refactoring using type information

`daily-issues-report.md`

Current: No model specified (uses org default)
Recommended: Add model: gpt-5.1-codex-mini — daily summaries don't need premium model
Expected: ~60-80% cost reduction for this daily workflow

`auto-triage-issues.md` (267 lines)

Current: No model specified
Recommended: model: gpt-5.1-codex-mini + max-continuations: 2 for batch issue processing
Expected: Cost savings + more complete batch processing

5️⃣ Trends & Insights

View Historical Context

This is the first comprehensive analysis of Copilot CLI usage in this repository. Future analyses (stored in the memory/copilot-cli-research branch) will track trends.

Baseline metrics established (2026-03-22):

Agent file utilization: 2/9 files deployed (22%)
max-continuations adoption: 1/177 (0.6%)
Model optimization adoption: 7/177 (4%)
copilot-requests adoption: 41/177 (23%)

Feature maturity assessment:

Mature & well-adopted: safe-outputs (96%), timeout-minutes (95%), github tool (72%)
Growing: cache-memory (32%), copilot-requests (23%)
Under-adopted given value: max-continuations (0.6%), custom agents (2%)
Niche/specialized: mcp-scripts (0.6%), serena (3%), playwright (7%)

6️⃣ Best Practice Guidelines

Based on this research:

Match model to task complexity: Use model: gpt-5.1-codex-mini for lightweight classification, tagging, or simple analysis. Reserve default (powerful) models for complex reasoning, code generation, and multi-file refactoring.
Enable autopilot for multi-step tasks: Any workflow that involves reading multiple files, making sequential edits, or running iterative analysis should use max-continuations: 2 or higher. Single-turn is appropriate only for truly atomic tasks.
Leverage custom agent files: The .github/agents/ directory is a productivity multiplier. Matching workflows to purpose-built agents (grumpy-reviewer for code review, contribution-checker for PRs) dramatically improves consistency and quality.
Scope GitHub toolsets narrowly: Follow code-scanning-fixer.md's example — specify only the toolsets your workflow actually needs. This limits attack surface and reduces unnecessary API calls.
Add serena to Go workflows: Any workflow doing Go code analysis, refactoring, or quality checks should include serena: { languages: { go: {} } } for language-aware understanding.

7️⃣ Action Items

Immediate Actions (this week):

Wire contribution-checker.agent.md to contribution-check.md workflow
Wire grumpy-reviewer.agent.md to pr-nitpick-reviewer.md
Add max-continuations: 3 to code-scanning-fixer.md
Add max-continuations: 4 to hourly-ci-cleaner.md

Short-term (this month):

Audit 10+ lightweight daily workflows for model: gpt-5.1-codex-mini opportunities
Add serena to Go analysis workflows (semantic-function-refactor, daily-function-namer, go-logger)
Wire remaining purpose-built agents to matching workflows
Consider creating agent files for remaining workflow categories (e.g., a pr-triage.agent.md)

Long-term (this quarter):

Establish model selection guidelines (which tasks → which model tier)
Explore mcp-scripts for reusable tool patterns across workflow families
Consider agent files as the standard for complex recurring workflows
Develop a "workflow health score" that tracks all these metrics automatically

View Research Methodology & Evidence

Research Methodology

Phase 1 — Capabilities Inventory: Reviewed pkg/workflow/copilot_engine.go, copilot_engine_execution.go, copilot_engine_tools.go, copilot_mcp.go, and docs/src/content/docs/reference/engines.md to build a complete feature inventory.

Phase 2 — Usage Analysis: Used grep across all 177 .github/workflows/*.md files to count feature adoption. Manually inspected sample workflows to understand patterns.

Phase 3 — Gap Analysis: Cross-referenced available features against usage counts; examined .github/agents/ directory for deployment status.

Phase 4 — Recommendations: Prioritized opportunities by impact × adoption gap.

Data Sources:

pkg/workflow/copilot_engine_execution.go — CLI flags and engine behavior
pkg/workflow/copilot_engine_tools.go — tool permission system
.github/agents/*.agent.md — available custom agents (9 files)
.github/workflows/*.md — all 177 workflow configurations
CHANGELOG.md — feature history
docs/src/content/docs/reference/engines.md — official documentation

Persistence: Analysis saved to memory/copilot-cli-research branch for trend tracking in future runs.

References:

§23412651657

AI generated by Copilot CLI Deep Research Agent · history

expires on Mar 23, 2026, 9:23 PM UTC

2026-03-23T22:51:41Z

github-actions[bot]
bot Mar 23, 2026
Author

This discussion was automatically closed because it expired on 2026-03-23T21:23:28.507Z.

Closed by Workflow

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[copilot-cli-research] Copilot CLI Deep Research - 2026-03-22 #22322

Uh oh!

{{title}}

Uh oh!

Core Configuration Fields (engine block)

Available CLI Flags (auto-generated)

Available Tools & Integrations

Feature Flags

Overall Feature Adoption

Engine Distribution

Custom Agent Files Available vs. Used

Opportunity 1: Deploy Unused Custom Agent Files

Opportunity 2: Enable max-continuations for Complex Workflows

Opportunity 3: Model Selection for Cost Optimization

Opportunity 4: Expand mcp-scripts Adoption

Opportunity 5: Add Serena to Go Code Analysis Workflows

Opportunity 6: Increase copilot-requests Feature Adoption

Opportunity 7: Specific GitHub MCP Toolsets for Minimal Permission Principle

Opportunity 8: Custom env Variables for Workflow Tuning

`code-scanning-fixer.md` (230 lines)

`hourly-ci-cleaner.md` (complex)

`contribution-check.md` (188 lines)

`pr-nitpick-reviewer.md`

`semantic-function-refactor.md` (407 lines as code-simplifier)

`daily-issues-report.md`

`auto-triage-issues.md` (267 lines)

Research Methodology

Replies: 1 comment

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

[copilot-cli-research] Copilot CLI Deep Research - 2026-03-22 #22322

Uh oh!

github-actions[bot] bot Mar 22, 2026

📊 Executive Summary

🔴 High Priority Issues

🟡 Medium Priority Opportunities

1️⃣ Current State Analysis

Core Configuration Fields (engine block)

Available CLI Flags (auto-generated)

Available Tools & Integrations

Feature Flags

Overall Feature Adoption

Engine Distribution

Custom Agent Files Available vs. Used

2️⃣ Feature Usage Matrix

3️⃣ Missed Opportunities

Opportunity 1: Deploy Unused Custom Agent Files

Opportunity 2: Enable max-continuations for Complex Workflows

Opportunity 3: Model Selection for Cost Optimization

Opportunity 4: Expand mcp-scripts Adoption

Opportunity 5: Add Serena to Go Code Analysis Workflows

Opportunity 6: Increase copilot-requests Feature Adoption

Opportunity 7: Specific GitHub MCP Toolsets for Minimal Permission Principle

Opportunity 8: Custom env Variables for Workflow Tuning

4️⃣ Specific Workflow Recommendations

code-scanning-fixer.md (230 lines)

hourly-ci-cleaner.md (complex)

contribution-check.md (188 lines)

pr-nitpick-reviewer.md

semantic-function-refactor.md (407 lines as code-simplifier)

daily-issues-report.md

auto-triage-issues.md (267 lines)

5️⃣ Trends & Insights

6️⃣ Best Practice Guidelines

7️⃣ Action Items

Research Methodology

Replies: 1 comment

Uh oh!

github-actions[bot] bot Mar 23, 2026 Author

github-actions[bot]
bot Mar 22, 2026

`code-scanning-fixer.md` (230 lines)

`hourly-ci-cleaner.md` (complex)

`contribution-check.md` (188 lines)

`pr-nitpick-reviewer.md`

`semantic-function-refactor.md` (407 lines as code-simplifier)

`daily-issues-report.md`

`auto-triage-issues.md` (267 lines)

github-actions[bot]
bot Mar 23, 2026
Author