
Investigate commands on PATH before answering / autofixing (#287) #306

Open
vanzue wants to merge 6 commits into main from dev/vanzue/agent-investigate-unknown-command


@vanzue vanzue commented Jun 16, 2026


Fixes #287 — asking the agent "How do I use X" gave the same generic "use help / Get-Command" advice whether or not X exists, ignoring local PowerShell scripts and programs on PATH and never flagging a non-existent command.

The reporter's scenario is a question to the agent ("How do I use X"), which routes to the agent-pane planner — so the fix is primarily there (Part 1). A related autofix improvement rides along (Part 2).

Part 1 — Agent pane: investigate before answering "how do I use X"

terminal-agent.md routed these questions to Chat mode ("No tool calls"), so the agent answered from general knowledge without checking the live environment. Now, a question about a specific local command requires a quick read-only investigation first, taught via one worked sample:

  1. Check existence + what it is: Get-Command X (meets the issue's minimum — "say there is no such command" — in any locale).
  2. If it exists, learn real usage from the source of truth (Get-Help / -? / --help, or read the script).
  3. Answer from evidence.

For "did you mean", the agent searches the real command list with a wildcard stem (Get-Command -Name "*depl*" / compgen -c | grep) and picks the closest itself — the LLM is a better judge of a likely typo than Get-Command -UseFuzzyMatching, which (measured on PS 7.6.2) buries PATH applications/scripts (git ranked ~300th for gti; deploy-it not returned for deploit). No fixed matcher, no new tooling — works for every agent and shell.

Part 2 — Autofix: local PATH near-matches for not-found commands

When a run command fails as not-found, command_recall injects a ### Near Matches section so the agent can suggest the local script/program the user mistyped. This runs in-process inside wta (which assembles the autofix prompt), so it can use a real ranking engine:

  • Locale-independent existence gate (ask the shell, never match the localized error text); a cheap in-process which pre-gate skips the common case with no subprocess.
  • Rank the real command list by Damerau-Levenshtein (strsim), with an anagram tie-break so a transposition (gti -> git) outranks an equidistant substitution (gti -> gci).
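
The ranking step above can be sketched in std-only Rust. This is an illustration, not the code in `command_recall`: the PR uses the `strsim` crate, whereas the sketch hand-rolls the optimal-string-alignment variant of Damerau-Levenshtein so it compiles without dependencies. The names `osa_distance`, `is_anagram`, and `rank_candidates_sketch`, and the `max_distance` and `cap` parameters, are hypothetical.

```rust
/// Optimal-string-alignment variant of Damerau-Levenshtein: insertion,
/// deletion, substitution, and adjacent transposition each cost 1.
/// (Hand-rolled stand-in for the strsim crate the PR actually uses.)
fn osa_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let (n, m) = (a.len(), b.len());
    // d[i][j] = distance between the first i chars of a and the first j of b.
    let mut d = vec![vec![0usize; m + 1]; n + 1];
    for i in 0..=n {
        d[i][0] = i;
    }
    for j in 0..=m {
        d[0][j] = j;
    }
    for i in 1..=n {
        for j in 1..=m {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            let mut best = (d[i - 1][j] + 1)
                .min(d[i][j - 1] + 1)
                .min(d[i - 1][j - 1] + cost);
            // Adjacent transposition, e.g. "ti" <-> "it".
            if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1] {
                best = best.min(d[i - 2][j - 2] + 1);
            }
            d[i][j] = best;
        }
    }
    d[n][m]
}

/// True when both strings contain exactly the same characters.
fn is_anagram(a: &str, b: &str) -> bool {
    let mut x: Vec<char> = a.chars().collect();
    let mut y: Vec<char> = b.chars().collect();
    x.sort_unstable();
    y.sort_unstable();
    x == y
}

/// Hypothetical ranking: sort by distance, then anagrams first, then name.
/// `max_distance` and `cap` are assumed knobs, not values from the PR.
fn rank_candidates_sketch(
    typed: &str,
    candidates: &[&str],
    max_distance: usize,
    cap: usize,
) -> Vec<String> {
    let typed = typed.to_lowercase();
    let mut scored: Vec<(usize, bool, String)> = candidates
        .iter()
        .map(|c| c.to_lowercase())
        .filter(|c| *c != typed) // an exact match is not a "near" match
        .map(|c| (osa_distance(&typed, &c), !is_anagram(&typed, &c), c))
        .filter(|(dist, _, _)| *dist <= max_distance)
        .collect();
    // `false` sorts before `true`, so anagrams win ties on distance.
    scored.sort();
    scored.dedup();
    scored.into_iter().take(cap).map(|(_, _, c)| c).collect()
}
```

With these definitions, `rank_candidates_sketch("gti", &["gci", "git", "ls"], 2, 5)` returns `["git", "gci"]`: both candidates are at distance 1, the anagram `git` wins the tie, and `ls` (distance 3) is rejected by the threshold.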

auto-fix.md consumes ### Near Matches and states plainly when a command isn't on PATH.

Why the split (delivery reality)

The agent pane can't call a wta "tool": there is no MCP server in the codebase (the wta mcp mode in AGENTS.md is stale — integration is ACP-only), and wta.exe is not on PATH (only wtcli and wtai=Windows Terminal have AppExecutionAliases). Exposing a CLI/MCP resolver to the agent would need disproportionate C++/packaging work, so Part 1 stays prompt-only (LLM over real shell output) and the strsim engine lives in Part 2, where wta runs it itself.

strsim is promoted to a direct dep; it was already in the lockfile via clap, so the resolved graph (and cgmanifest.json/NOTICE.md) is unchanged.

Known blind spot (v1): the autofix enumerate runs -NoProfile, so it sees PATH executables/scripts (the issue's concern) but not profile-only interactive functions/aliases.

Test

  • cargo build ✅ (only pre-existing warnings)
  • cargo test ✅ 914 passed / 0 failed, including 11 command_recall unit tests (token extraction, extension stripping, existence gate, transposition/anagram ranking, threshold rejection, dedup/cap).

The terminal agent routed "how do I use X" / "what is X" questions to
Chat mode, which answers from general knowledge with "No tool calls" — so
it gave the same "use help / Get-Command" boilerplate whether or not X
actually exists on the user's machine, ignoring local PowerShell scripts
and programs on PATH (issue #287).

Reframe these as a chat answer that requires read-only investigation
first, taught via a single worked sample rather than a rigid per-shell
rule table: look at the PATH (Get-Command), try the command's own help to
learn its usage, then answer from evidence. When nothing is found, say so
plainly and offer real near-matches from the PATH via Get-Command
-UseFuzzyMatching (reusing PowerShell's own resolver/fuzzy engine instead
of inventing spelling correction). Prompt-only change to the embedded
terminal-agent.md; seed_prompt_files migrates it to users on next start.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 16, 2026 23:56

Copilot AI left a comment


Pull request overview

This PR updates the embedded Terminal Agent system prompt to ensure that “how do I use X” / “what is X” questions trigger a quick read-only investigation of whether X exists on the user’s machine, rather than defaulting to generic guidance.

Changes:

  • Refines Chat mode guidance to allow an exception where command-specific questions require investigation before answering.
  • Adds a new “Chat answers that need investigation” section with a worked example showing how to resolve a command, query its help/usage, and answer from evidence.
  • Updates the response-format section to reiterate that local-command chat questions should be investigated first.

Comment thread tools/wta/prompts/terminal-agent.md Outdated
Comment thread tools/wta/prompts/terminal-agent.md Outdated
Comment thread tools/wta/prompts/terminal-agent.md Outdated
@autom8edIT


Problem

Fixes #287. Asking the terminal agent "How do I use X" gave the same generic "use help / Get-Command" advice regardless of whether X actually exists on the machine — local PowerShell scripts and programs on PATH were ignored, and a non-existent command was never flagged.

Root cause: the system prompt routes "how do I use X" / "what is X" into Chat mode, which is documented as "No tool calls" → the agent answers from general knowledge and never checks the live environment.

Change

Prompt-only edit to the embedded tools/wta/prompts/terminal-agent.md (no code). These questions are reframed as a chat answer that requires a quick read-only investigation first, taught via one worked sample rather than a rigid per-shell rule table:

  1. Look at the PATH — Get-Command X — to see whether it exists and what it is.
  2. Try its own help to learn real usage (Get-Help / -? / --help), don't guess.
  3. Answer from evidence (real kind + path).

When nothing is found: say so plainly and offer real near-matches from the PATH via Get-Command -UseFuzzyMatching — reusing PowerShell's own resolver and fuzzy engine instead of inventing spelling correction.

This stays a chat answer (no JSON card) and adapts to the pane's shell. It's the embedded default, so seed_prompt_files migrates it to users' on-disk default (and unedited user prompt) on next start.

Why a sample, not a rule table

An earlier draft prescribed exact commands and strict found/not-found branches per shell — too rigid. A single canonical example teaches the shape (recognize you don't know → investigate → try usage → answer from evidence) and lets the agent generalize.

Test

  • cargo build ✅ (only pre-existing warnings)
  • cargo test ✅ 903 passed / 0 failed (incl. prompt.rs merge_runtime_sections tests; the embedded prompt is include_str!'d so it must still compile)

Just want to say I really appreciate the work you put in! Since I need Windows, Linux and macOS, I need a nice terminal, and conhost, even if it's a legend, lacks an agent feature!

God Bless! 👌👑👑👑

The autofix prompt could correct typos it knew from training data
(`dotent`->`dotnet`) but never suggested the *local* PowerShell scripts
and programs on the user's PATH that they most likely mistyped
(`deploit`->`deploy-it`), because the LLM has no knowledge of the user's
environment.

Add `command_recall`, which grounds "did you mean" suggestions in the
user's real PATH. In the autofix prompt-assembly path (PowerShell only in
v1):

- Extract the failing command token from the captured [command + output]
  buffer (ControlCore::ReadLastPrompt starts at the FTCS command mark, so
  there is no prompt prefix to strip).
- Gate locale-independently on existence, not on the (localized) error
  text: a cheap in-process `which` pre-gate short-circuits the common
  case (failed build/test/git where the program exists) with no
  subprocess; only a genuine not-found pays one `Get-Command` enumerate.
- Rank the real command list by Damerau-Levenshtein (strsim), with an
  anagram tie-break so a transposition (`gti`->`git`) outranks an
  equidistant substitution (`gti`->`gci`). Inject the top matches as a
  `### Near Matches` section.
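
For illustration, the in-process pre-gate described above could look roughly like the following std-only sketch. It assumes the gate is a plain scan of the PATH directories; the function name `find_on_path`, its signature, and the extension list passed by the caller are hypothetical and not taken from `command_recall`.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Hypothetical pre-gate: return the first regular file named `name`
/// (bare, or with one of `exts` appended) in the directories of `path_var`.
/// No subprocess is spawned; only file-system metadata is read.
fn find_on_path(name: &str, path_var: &OsStr, exts: &[&str]) -> Option<PathBuf> {
    for dir in env::split_paths(path_var) {
        // Try the bare name first, then each assumed executable extension.
        let bare = dir.join(name);
        if bare.is_file() {
            return Some(bare);
        }
        for ext in exts {
            let candidate = dir.join(format!("{name}{ext}"));
            if candidate.is_file() {
                return Some(candidate);
            }
        }
    }
    None
}
```

A caller would pass the process PATH, for example `find_on_path("git", &env::var_os("PATH").unwrap_or_default(), &[".exe", ".cmd", ".ps1"])`. A `Some` result means the failure was not a not-found error and the near-match step can be skipped. Only a `None` result would go on to the single `Get-Command` enumerate, which also sees aliases, functions, and cmdlets that a PATH scan cannot.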

`auto-fix.md` is updated to consume `### Near Matches` and to state
plainly when a command isn't on PATH instead of giving generic
"check spelling / use help" advice.

strsim is promoted to a direct dependency; it was already in the
lockfile via clap, so the resolved graph (and cgmanifest/NOTICE) is
unchanged.

Known blind spot (accepted for v1): the enumerate runs `-NoProfile`, so
it sees PATH executables/scripts (the issue's concern) but not
profile-only interactive functions/aliases.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@vanzue vanzue changed the title Investigate a named command before answering "how do I use X" (#287) Investigate commands on PATH before answering / autofixing (#287) Jun 17, 2026
…hing

The investigation sample told the agent to spelling-correct an unknown
command with `Get-Command -UseFuzzyMatching`, but on PS 7.6.2 that
ranking buries PATH applications and external scripts (git ranked ~300th
for `gti`; `deploy-it` not returned for `deploit`) — exactly the local
scripts/programs issue #287 is about.

Replace it with a PATH-grounded approach that needs no fixed matcher:
the agent searches the real command list with a wildcard stem
(`Get-Command -Name "*depl*"` / `compgen -c | grep`) and picks the
closest itself — the LLM is a better judge of a likely typo than the
built-in fuzzy ranker. Existence is still confirmed with `Get-Command`,
so the issue's minimum ("say there is no such command") holds in any
locale.

(The strsim-backed near-match engine stays in the autofix path, where
wta assembles the prompt in-process and doesn't depend on any
agent-callable tool. There is no MCP server and `wta` isn't on PATH, so
exposing a CLI/MCP tool to the agent pane isn't viable without
disproportionate C++/packaging work.)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 17, 2026 03:31

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 5 comments.

Comment thread tools/wta/src/command_recall.rs
Comment thread tools/wta/src/command_recall.rs
Comment thread tools/wta/src/command_recall.rs Outdated
Comment thread tools/wta/src/protocol/acp/client.rs
Comment thread tools/wta/prompts/auto-fix.md Outdated
vanzue and others added 2 commits June 18, 2026 10:41
…, wording

Address Copilot review feedback on PR #306:

- strip_exe_ext: guard the slice with str::get so a non-ASCII command
  name (Unicode function/alias) whose byte boundary lands mid-char no
  longer panics and crashes autofix prompt assembly.
- command_exists / rank_near_matches: strip the user-typed token's own
  executable extension too, so an explicit 'deploy-it.ps1' matches the
  stripped candidate instead of being misreported as not-found.
- enumerate_powershell_commands: include Cmdlet in the Get-Command
  existence gate so a failing cmdlet invocation (e.g. Get-Item with a
  bad path) isn't misclassified as a not-found command.
- Reword the injected '### Near Matches' section and auto-fix.md to say
  'not found as a command in this shell' instead of 'on PATH', since the
  list comes from Get-Command (aliases/functions/cmdlets), not just PATH.

Adds 3 regression unit tests (non-ASCII no-panic, token-with-extension
existence, token-extension ranking). cargo test: 917 passed / 0 failed.
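
The `str::get` guard from the first bullet can be sketched as follows. This is an assumed shape, not the actual `strip_exe_ext`: the function name `strip_exe_ext_sketch` and the extension list are hypothetical. The point is that `name.get(split..)` returns `None` when `split` falls inside a multi-byte character, where `&name[split..]` would panic.

```rust
/// Extensions treated as executable; this list is an assumption.
const EXE_EXTS: [&str; 4] = [".exe", ".cmd", ".bat", ".ps1"];

/// Strip one trailing executable extension, case-insensitively.
/// Returns the input unchanged when nothing matches.
fn strip_exe_ext_sketch(name: &str) -> &str {
    for ext in EXE_EXTS {
        if name.len() > ext.len() {
            let split = name.len() - ext.len();
            // `get` yields None if `split` is not a char boundary, so a
            // non-ASCII name cannot trigger a slicing panic here.
            if let Some(tail) = name.get(split..) {
                if tail.eq_ignore_ascii_case(ext) {
                    // Safe: `get` succeeded, so `split` is a boundary.
                    return &name[..split];
                }
            }
        }
    }
    name
}
```

For example, `strip_exe_ext_sketch("deploy-it.ps1")` returns `"deploy-it"`, while a name such as `"日本語"` (nine bytes, with the candidate split point at byte 5, mid-character) is returned unchanged instead of panicking.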

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… help

Address remaining Copilot review feedback on PR #306:

- Chat-mode investigation: make explicit that the agent runs the
  read-only probe via its OWN tools (execute_command / view /
  read_text_file) and must NOT emit a recommendation card asking the
  user to run it. Resolves the concern that Chat mode (No JSON) had no
  stated path to actually execute the investigation.
- Usage discovery: prefer Get-Help and reading the script's param block
  (non-executing) as the source of truth; demote the command's own help
  flag (-? / --help) to a side-effect-aware last resort, since a plain
  script may run its body before any help check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 18, 2026 08:34

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.

Comment thread tools/wta/prompts/terminal-agent.md Outdated
Comment thread tools/wta/src/protocol/acp/client.rs
Two unresolved Copilot review threads:

- client.rs: wrap each `### Near Matches` candidate in inline-code so command names containing Markdown-significant chars (_, *) are read as literals, not formatting.

- terminal-agent.md: Get-Command resolves more than PATH (cmdlets/functions/aliases), so reword the sample to describe the command by its actual type and avoid claiming it isn't 'on PATH' when it may be a function/alias.
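
A minimal sketch of the inline-code wrapping from the first bullet, assuming the section is rendered as a Markdown bullet list. The real layout in `client.rs` is not shown in this thread, so the function name `render_near_matches_sketch` and the exact output format are hypothetical.

```rust
/// Hypothetical renderer: one bullet per candidate, each wrapped in
/// backticks so characters like `_` and `*` are read literally.
fn render_near_matches_sketch(candidates: &[&str]) -> String {
    let mut out = String::from("### Near Matches\n");
    for c in candidates {
        out.push_str(&format!("- `{c}`\n"));
    }
    out
}
```

This sketch does not handle a command name that itself contains a backtick; such a name would need a longer backtick fence around it.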

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Doesn't know about scripts on path.

3 participants