Investigate commands on PATH before answering / autofixing (#287) by vanzue · Pull Request #306 · microsoft/intelligent-terminal

vanzue · 2026-06-16T23:56:20Z

Fixes #287 — asking the agent "How do I use X" gave the same generic "use help / Get-Command" advice whether or not X exists, ignoring local PowerShell scripts and programs on PATH and never flagging a non-existent command.

The reporter's scenario is a question to the agent ("How do I use X"), which routes to the agent-pane planner — so the fix is primarily there (Part 1). A related autofix improvement rides along (Part 2).

Part 1 — Agent pane: investigate before answering "how do I use X"

terminal-agent.md routed these questions to Chat mode ("No tool calls"), so the agent answered from general knowledge without checking the live environment. Now, a question about a specific local command requires a quick read-only investigation first, taught via one worked sample:

Check existence + what it is: Get-Command X (meets the issue's minimum — "say there is no such command" — in any locale).
If it exists, learn real usage from the source of truth (Get-Help / -? / --help, or read the script).
Answer from evidence.

For "did you mean", the agent searches the real command list with a wildcard stem (Get-Command -Name "*depl*" / compgen -c | grep) and picks the closest itself — the LLM is a better judge of a likely typo than Get-Command -UseFuzzyMatching, which (measured on PS 7.6.2) buries PATH applications/scripts (git ranked ~300th for gti; deploy-it not returned for deploit). No fixed matcher, no new tooling — works for every agent and shell.
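
To make the shortlist step concrete, here is a minimal sketch of what a wildcard stem query narrows the command list down to before the agent picks the closest name. It is an illustration only: `stem_matches` is a hypothetical helper, not code from this PR, and in the agent pane the filtering is done by the shell itself (`Get-Command -Name "*depl*"` / `compgen -c | grep`).

```rust
/// Hypothetical helper: case-insensitive "contains" filter, a rough stand-in
/// for the result of a wildcard query such as `Get-Command -Name "*depl*"`.
fn stem_matches<'a>(commands: &[&'a str], stem: &str) -> Vec<&'a str> {
    let stem = stem.to_lowercase();
    commands
        .iter()
        .copied()
        // Keep every command whose lowercased name contains the stem.
        .filter(|name| name.to_lowercase().contains(&stem))
        .collect()
}
```

Given a command list containing `deploy-it` and `Deploy-Staging`, the stem `depl` would keep both, and the agent then judges which one the user most likely meant.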

Part 2 — Autofix: local PATH near-matches for not-found commands

When a run command fails as not-found, command_recall injects a ### Near Matches section so the agent can suggest the local script/program the user mistyped. This runs in-process inside wta (which assembles the autofix prompt), so it can use a real ranking engine:

Locale-independent existence gate (ask the shell, never match the localized error text); a cheap in-process which pre-gate skips the common case with no subprocess.
Rank the real command list by Damerau-Levenshtein (strsim), with an anagram tie-break so a transposition (gti→git) outranks an equidistant substitution (gti→gci).
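
The ranking rule above can be sketched as follows. This is a standard-library-only approximation, not the PR's code: the real engine uses strsim, whereas this sketch hand-rolls the optimal-string-alignment variant of Damerau-Levenshtein, and `rank_candidates`, `max_distance`, and `cap` are hypothetical names and parameters chosen for illustration.

```rust
/// Optimal-string-alignment distance (restricted Damerau-Levenshtein):
/// insert, delete, substitute, and adjacent transposition each cost 1.
fn osa_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let (n, m) = (a.len(), b.len());
    let mut d = vec![vec![0usize; m + 1]; n + 1];
    for i in 0..=n {
        d[i][0] = i;
    }
    for j in 0..=m {
        d[0][j] = j;
    }
    for i in 1..=n {
        for j in 1..=m {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            let mut best = (d[i - 1][j] + 1)
                .min(d[i][j - 1] + 1)
                .min(d[i - 1][j - 1] + cost);
            // Adjacent transposition, e.g. "ti" <-> "it".
            if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1] {
                best = best.min(d[i - 2][j - 2] + 1);
            }
            d[i][j] = best;
        }
    }
    d[n][m]
}

/// True when both strings contain exactly the same characters.
fn is_anagram(a: &str, b: &str) -> bool {
    let mut x: Vec<char> = a.chars().collect();
    let mut y: Vec<char> = b.chars().collect();
    x.sort_unstable();
    y.sort_unstable();
    x == y
}

/// Hypothetical ranking: sort by distance, then anagrams first, then name.
fn rank_candidates<'a>(
    token: &str,
    commands: &[&'a str],
    max_distance: usize, // assumed threshold; the PR's actual value is not stated
    cap: usize,          // assumed cap on the number of suggestions
) -> Vec<&'a str> {
    let token = token.to_lowercase();
    let mut scored: Vec<(usize, bool, &'a str)> = commands
        .iter()
        .map(|&name| {
            let lower = name.to_lowercase();
            // `false` sorts before `true`, so anagrams win ties.
            (osa_distance(&token, &lower), !is_anagram(&token, &lower), name)
        })
        // Drop exact matches (distance 0) and anything past the threshold.
        .filter(|&(dist, _, _)| dist > 0 && dist <= max_distance)
        .collect();
    scored.sort();
    scored.dedup();
    scored.into_iter().take(cap).map(|(_, _, name)| name).collect()
}
```

With this sketch, `gti` is at distance 1 from both `git` and `gci`, and the anagram flag is what places `git` first.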

auto-fix.md consumes ### Near Matches and states plainly when a command isn't on PATH.

Why the split (delivery reality)

The agent pane can't call a wta "tool": there is no MCP server in the codebase (the wta mcp mode in AGENTS.md is stale — integration is ACP-only), and wta.exe is not on PATH (only wtcli and wtai=Windows Terminal have AppExecutionAliases). Exposing a CLI/MCP resolver to the agent would need disproportionate C++/packaging work, so Part 1 stays prompt-only (LLM over real shell output) and the strsim engine lives in Part 2, where wta runs it itself.

strsim is promoted to a direct dep; it was already in the lockfile via clap, so the resolved graph (and cgmanifest.json/NOTICE.md) is unchanged.

Known blind spot (v1): the autofix enumerate runs -NoProfile, so it sees PATH executables/scripts (the issue's concern) but not profile-only interactive functions/aliases.

Test

cargo build ✅ (only pre-existing warnings)
cargo test ✅ 914 passed / 0 failed, including 11 command_recall unit tests (token extraction, extension stripping, existence gate, transposition/anagram ranking, threshold rejection, dedup/cap).

The terminal agent routed "how do I use X" / "what is X" questions to Chat mode, which answers from general knowledge with "No tool calls", so it gave the same "use help / Get-Command" boilerplate whether or not X actually exists on the user's machine, ignoring local PowerShell scripts and programs on PATH (issue #287).

Reframe these as a chat answer that requires read-only investigation first, taught via a single worked sample rather than a rigid per-shell rule table: look at the PATH (Get-Command), try the command's own help to learn its usage, then answer from evidence. When nothing is found, say so plainly and offer real near-matches from the PATH via Get-Command -UseFuzzyMatching (reusing PowerShell's own resolver/fuzzy engine instead of inventing spelling correction).

Prompt-only change to the embedded terminal-agent.md; seed_prompt_files migrates it to users on next start.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

This PR updates the embedded Terminal Agent system prompt to ensure that “how do I use X” / “what is X” questions trigger a quick read-only investigation of whether X exists on the user’s machine, rather than defaulting to generic guidance.

Changes:

Refines Chat mode guidance to allow an exception where command-specific questions require investigation before answering.
Adds a new “Chat answers that need investigation” section with a worked example showing how to resolve a command, query its help/usage, and answer from evidence.
Updates the response-format section to reiterate that local-command chat questions should be investigated first.

autom8edIT · 2026-06-17T00:29:43Z

Problem

Fixes #287. Asking the terminal agent "How do I use X" gave the same generic "use help / Get-Command" advice regardless of whether X actually exists on the machine — local PowerShell scripts and programs on PATH were ignored, and a non-existent command was never flagged.

Root cause: the system prompt routes "how do I use X" / "what is X" into Chat mode, which is documented as "No tool calls" → the agent answers from general knowledge and never checks the live environment.

Change

Prompt-only edit to the embedded tools/wta/prompts/terminal-agent.md (no code). These questions are reframed as a chat answer that requires a quick read-only investigation first, taught via one worked sample rather than a rigid per-shell rule table:

Look at the PATH — Get-Command X — to see whether it exists and what it is.

Try its own help to learn real usage (Get-Help / -? / --help), don't guess.

Answer from evidence (real kind + path).

When nothing is found: say so plainly and offer real near-matches from the PATH via Get-Command -UseFuzzyMatching — reusing PowerShell's own resolver and fuzzy engine instead of inventing spelling correction.

This stays a chat answer (no JSON card) and adapts to the pane's shell. It's the embedded default, so seed_prompt_files migrates it to users' on-disk default (and unedited user prompt) on next start.

Why a sample, not a rule table

An earlier draft prescribed exact commands and strict found/not-found branches per shell — too rigid. A single canonical example teaches the shape (recognize you don't know → investigate → try usage → answer from evidence) and lets the agent generalize.

Test

cargo build ✅ (only pre-existing warnings)

cargo test ✅ 903 passed / 0 failed (incl. prompt.rs merge_runtime_sections tests; the embedded prompt is include_str!'d so it must still compile)

Just want to say I really appreciate the work you put in! Since I need Windows, Linux and macOS, I need a nice terminal, and conhost, even if it's a legend, lacks an agent feature!

God Bless! 👌👑👑👑

The autofix prompt could correct typos it knew from training data (`dotent`->`dotnet`) but never suggested the *local* PowerShell scripts and programs on the user's PATH that they most likely mistyped (`deploit`->`deploy-it`), because the LLM has no knowledge of the user's environment.

Add `command_recall`, which grounds "did you mean" suggestions in the user's real PATH. In the autofix prompt-assembly path (PowerShell only in v1):

- Extract the failing command token from the captured [command + output] buffer (ControlCore::ReadLastPrompt starts at the FTCS command mark, so there is no prompt prefix to strip).
- Gate locale-independently on existence, not on the (localized) error text: a cheap in-process `which` pre-gate short-circuits the common case (failed build/test/git where the program exists) with no subprocess; only a genuine not-found pays one `Get-Command` enumerate.
- Rank the real command list by Damerau-Levenshtein (strsim), with an anagram tie-break so a transposition (`gti`->`git`) outranks an equidistant substitution (`gti`->`gci`). Inject the top matches as a `### Near Matches` section.

`auto-fix.md` is updated to consume `### Near Matches` and to state plainly when a command isn't on PATH instead of giving generic "check spelling / use help" advice.

strsim is promoted to a direct dependency; it was already in the lockfile via clap, so the resolved graph (and cgmanifest/NOTICE) is unchanged.

Known blind spot (accepted for v1): the enumerate runs `-NoProfile`, so it sees PATH executables/scripts (the issue's concern) but not profile-only interactive functions/aliases.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
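
The cheap pre-gate described in this commit can be pictured roughly as below. It is a standard-library-only sketch under stated assumptions: the PATH value and extension list are passed in explicitly, `find_on_path` and `needs_shell_lookup` are hypothetical names, and the actual in-process `which` check in wta may resolve commands differently.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Hypothetical lookup: returns the first PATH entry that contains `name`,
/// trying the bare name and then each extension in `exts`
/// (a stand-in for PATHEXT-style resolution).
fn find_on_path(name: &str, path_var: &OsStr, exts: &[&str]) -> Option<PathBuf> {
    for dir in env::split_paths(path_var) {
        let bare = dir.join(name);
        if bare.is_file() {
            return Some(bare);
        }
        for ext in exts {
            let candidate = dir.join(format!("{}{}", name, ext));
            if candidate.is_file() {
                return Some(candidate);
            }
        }
    }
    None
}

/// Only a miss here would go on to the slower shell enumeration;
/// a hit means the failure was not a "command not found" case.
fn needs_shell_lookup(name: &str, path_var: &OsStr, exts: &[&str]) -> bool {
    find_on_path(name, path_var, exts).is_none()
}
```

The point of the design is cost: a failed build or test run where the program exists is answered by a few file-system checks, and only a genuine miss pays for a subprocess.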

…hing

The investigation sample told the agent to spelling-correct an unknown command with `Get-Command -UseFuzzyMatching`, but on PS 7.6.2 that ranking buries PATH applications and external scripts (git ranked ~300th for `gti`; `deploy-it` not returned for `deploit`), which are exactly the local scripts/programs issue #287 is about.

Replace it with a PATH-grounded approach that needs no fixed matcher: the agent searches the real command list with a wildcard stem (`Get-Command -Name "*depl*"` / `compgen -c | grep`) and picks the closest itself, since the LLM is a better judge of a likely typo than the built-in fuzzy ranker. Existence is still confirmed with `Get-Command`, so the issue's minimum ("say there is no such command") holds in any locale.

(The strsim-backed near-match engine stays in the autofix path, where wta assembles the prompt in-process and doesn't depend on any agent-callable tool. There is no MCP server and `wta` isn't on PATH, so exposing a CLI/MCP tool to the agent pane isn't viable without disproportionate C++/packaging work.)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 5 comments.

…, wording

Address Copilot review feedback on PR #306:

- strip_exe_ext: guard the slice with str::get so a non-ASCII command name (Unicode function/alias) whose byte boundary lands mid-char no longer panics and crashes autofix prompt assembly.
- command_exists / rank_near_matches: strip the user-typed token's own executable extension too, so an explicit 'deploy-it.ps1' matches the stripped candidate instead of being misreported as not-found.
- enumerate_powershell_commands: include Cmdlet in the Get-Command existence gate so a failing cmdlet invocation (e.g. Get-Item with a bad path) isn't misclassified as a not-found command.
- Reword the injected '### Near Matches' section and auto-fix.md to say 'not found as a command in this shell' instead of 'on PATH', since the list comes from Get-Command (aliases/functions/cmdlets), not just PATH.

Adds 3 regression unit tests (non-ASCII no-panic, token-with-extension existence, token-extension ranking). cargo test: 917 passed / 0 failed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
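
The `str::get` guard mentioned in the first item can be illustrated with a small sketch. The function name `strip_ext_safely` and the extension list are assumptions made for this example; the PR's `strip_exe_ext` may handle a different set of extensions.

```rust
/// Hypothetical sketch: strip a known executable extension without ever
/// slicing in the middle of a multi-byte character.
fn strip_ext_safely(name: &str) -> &str {
    // Assumed extension list, for illustration only.
    const EXTS: [&str; 4] = [".exe", ".ps1", ".cmd", ".bat"];
    for ext in EXTS {
        if name.len() > ext.len() {
            let split = name.len() - ext.len();
            // `str::get` returns None when `split` is not a char boundary,
            // where plain `&name[split..]` would panic.
            if let Some(tail) = name.get(split..) {
                if tail.eq_ignore_ascii_case(ext) {
                    // `split` is a verified boundary here, so this slice is safe.
                    return &name[..split];
                }
            }
        }
    }
    name
}
```

A name such as `ab\u{e9}123` is the interesting case: the byte offset four from the end falls inside the two-byte `é`, so the guarded lookup yields `None` and the name is returned unchanged instead of panicking.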

… help

Address remaining Copilot review feedback on PR #306:

- Chat-mode investigation: make explicit that the agent runs the read-only probe via its OWN tools (execute_command / view / read_text_file) and must NOT emit a recommendation card asking the user to run it. Resolves the concern that Chat mode (No JSON) had no stated path to actually execute the investigation.
- Usage discovery: prefer Get-Help and reading the script's param block (non-executing) as the source of truth; demote the command's own help flag (-? / --help) to a side-effect-aware last resort, since a plain script may run its body before any help check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.

Two unresolved Copilot review threads:

- client.rs: wrap each `### Near Matches` candidate in inline-code so command names containing Markdown-significant chars (_, *) are read as literals, not formatting.
- terminal-agent.md: Get-Command resolves more than PATH (cmdlets/functions/aliases), so reword the sample to describe the command by its actual type and avoid claiming it isn't 'on PATH' when it may be a function/alias.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
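
The inline-code wrapping from the client.rs item might look roughly like this. Only the `### Near Matches` heading comes from the PR; the bullet layout and the `render_near_matches` name are assumptions, and the sketch assumes candidate names contain no backticks.

```rust
/// Hypothetical renderer for the injected section: each candidate is wrapped
/// in backticks so `_` and `*` are read as literal characters.
/// Assumes names contain no backticks themselves.
fn render_near_matches(candidates: &[&str]) -> String {
    let mut out = String::from("### Near Matches\n");
    for name in candidates {
        out.push_str(&format!("- `{}`\n", name));
    }
    out
}
```

Without the backticks, a name like `my_script_v2` could be read as emphasis markup around `script`, which is the misreading the review thread flags.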

Copilot AI review requested due to automatic review settings June 16, 2026 23:56

Copilot started reviewing on behalf of vanzue June 16, 2026 23:57

Copilot AI reviewed Jun 16, 2026

Comment thread tools/wta/prompts/terminal-agent.md Outdated

Comment thread tools/wta/prompts/terminal-agent.md Outdated

Comment thread tools/wta/prompts/terminal-agent.md Outdated

vanzue changed the title ~~Investigate a named command before answering "how do I use X" (#287)~~ Investigate commands on PATH before answering / autofixing (#287) Jun 17, 2026

Copilot AI review requested due to automatic review settings June 17, 2026 03:31

Copilot started reviewing on behalf of vanzue June 17, 2026 03:31

Copilot AI reviewed Jun 17, 2026

Comment thread tools/wta/src/command_recall.rs

Comment thread tools/wta/src/command_recall.rs

Comment thread tools/wta/src/command_recall.rs Outdated

Comment thread tools/wta/src/protocol/acp/client.rs

Comment thread tools/wta/prompts/auto-fix.md Outdated

vanzue and others added 2 commits June 18, 2026 10:41

Copilot AI review requested due to automatic review settings June 18, 2026 08:34

Copilot started reviewing on behalf of vanzue June 18, 2026 08:36

Copilot AI reviewed Jun 18, 2026

Comment thread tools/wta/prompts/terminal-agent.md Outdated

Comment thread tools/wta/src/protocol/acp/client.rs

vanzue wants to merge 6 commits into main from dev/vanzue/agent-investigate-unknown-command

3 participants
