From 7465f2b8521afd87b6a36f22723554d2e68ded1a Mon Sep 17 00:00:00 2001 From: "Kai Tao (from Dev Box)" Date: Fri, 3 Jul 2026 10:47:39 +0800 Subject: [PATCH 1/5] docs(release-check-list): sync with v0.1.2 features Reviewed the release checklist against the v0.1.2 release notes and fixed two outdated statements plus added seven missing test cases for new features. Outdated statements corrected: - Session view refresh: now F5 on-demand re-scan (incl. WSL distros started after launch), not gated on hooks (#344). - Historical state: history is now sourced from ACP session/list, not on-disk file parsing (#365). Missing test cases added: - Alt+V clipboard image paste into agent chat (#354) - GitHub Enterprise Copilot sign-in (#362) - Bash/WSL shell integration incl. set -u safety (#340) - Shells self-report identity via OSC 9001;ShellType + nested-shell autofix targeting (#345) - Environment-aware answers/fixes: investigate PATH before answering/autofixing (#306) - WSL distro sessions visible and resumable in the correct distro (#323) - Agent panes not persisted into saved window layout (#360/#275) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- doc/release-check-list.md | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/doc/release-check-list.md b/doc/release-check-list.md index 906826441..8210fb150 100644 --- a/doc/release-check-list.md +++ b/doc/release-check-list.md @@ -120,6 +120,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[E2E]` **Different positions work:** Open/hide/focus works for bottom, right, left, and top pane positions. - [ ] `[E2E]` **Stash preserves chat:** Hiding and restoring the pane preserves helper process, connection state, and chat history. - [ ] `[E2E]` **Tab close cleans up:** Closing the owning tab cleans up the helper and does not leave a broken pane. +- [ ] `[E2E]` **Agent panes are not persisted into saved layout:** Saving and restoring a window layout does not resurrect a previously-open agent pane; restored windows come back without an unexpected agent pane. _(#360/#275.)_ ### Built-in agent chat matrix @@ -127,6 +128,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[UT~]` `[E2E]` **Copilot missing CLI path works:** Missing Copilot shows actionable setup/auth guidance, not a silent failure. _(UT: registry install hint.)_ - [ ] `[E2E]` **Non-Copilot agents chat works:** Each installed+authenticated non-Copilot built-in agent (Claude/Codex/Gemini) connects through its ACP adapter and answers a prompt. _(One consolidated matrix case — all built-in agents share the same agent-pane/ACP path, so per-agent behavioural depth is covered by the Copilot suites.)_ - [ ] `[UT~]` `[E2E]` **Agent auth failure works:** Unauthenticated agents show clear login guidance and can recover after sign-in. _(UT: `AgentFailure::AuthRequired` classification; in-pane auth screen render via `render_auth_screen_shows_agent_name` / `render_auth_sign_in_card` / `render_auth_checking_with_status_message`.)_ +- [ ] `[E2E]` **GitHub Enterprise Copilot sign-in works:** On the auth screen, pressing **E** lets the user enter a GHE domain (e.g. `*.ghe.com`) and sign in; the last-used host is remembered and the device-verification URL targets that host. _(#362.)_ - [ ] `[E2E]` **Agent restart after settings change works:** Changing the selected agent or model restarts/reconnects cleanly. ### Input and rendering @@ -135,6 +137,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[E2E]` **Prompt out-of-focus appearance is correct:** Input box looks correct when focus leaves the agent pane. - [ ] `[E2E]` **Typing works:** User can type, edit, and submit prompt text correctly. - [ ] `[E2E]` **Paste works:** Pasted multi-line text is handled correctly. +- [ ] `[UT✓]` `[E2E]` **Image paste (Alt+V) works:** A copied screenshot (`CF_DIB`/`CF_DIBV5`) or image file is sent to the agent as an ACP image content block; the action is gated on the agent advertising image support and is a no-op otherwise. _(UT: `clipboard_image` + `mock_agent_tests` `seen_images` side-channel; #354.)_ - [ ] `[E2E]` **Keyboard navigation works:** Arrow keys, Tab completion, Ctrl combinations, and Esc behave correctly. - [ ] `[E2E]` `[MANUAL]` **IME/non-ASCII input works:** IME and non-ASCII input are usable if the release supports localized typing. - [ ] `[UT✓]` `[E2E]` **Streaming output renders correctly:** Agent response chunks, tool calls, plans, and status lines render without corruption. _(UT: `streaming_two_chunks_coalesce_in_app_chat`, `tool_call_surfaces_card_in_chat`, `tool_call_completion_updates_card_status` (in-place, no dup), `plan_surfaces_card_in_chat`, `render_chat_all_message_variants`; streaming-JSON unwrap incl. emoji/surrogate pairs in `ui::chat::tests`.)_ @@ -170,6 +173,8 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not ### Shell integration and detection - [ ] `[E2E]` **PowerShell shell integration installed:** Supported PowerShell profiles emit command-finished events. +- [ ] `[E2E]` **Bash / WSL shell integration installed:** Supported bash and WSL-bash profiles emit command-finished events, and the injected `PROMPT_COMMAND` is safe under `set -u` (no errors in strict-mode shells). _(#340.)_ +- [ ] `[E2E]` **Shells self-report identity (`OSC 9001;ShellType`):** The terminal knows which shell owns a pane — including after a nested shell (`pwsh` → `wsl` → `exit`) returns — so autofix suggests commands for the *current* shell (no PowerShell suggestions inside a WSL/bash pane). `wtcli list-panes` exposes the live shell + version per pane. _(#345.)_ - [ ] `[E2E]` **Missing shell integration is safe:** Without shell integration, failures do not crash or produce broken UI. - [x] `[UT✓]` **Failure detection works:** A failing command emits an event and is detected by Intelligent Terminal. _(UT: `classify_wt_event`.)_ - [x] `[UT✓]` **Successful commands ignored:** Successful commands do not trigger autofix. _(UT: `classify_wt_event` + `success_exit_code_does_not_arm_autofix`.)_ @@ -190,6 +195,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[UT✓]` `[E2E]` **Autofix target pane is correct:** Failure in one pane does not offer/run a fix in the wrong pane. _(UT: target-tab routing — busy-pane tests + `autofix_still_triggers_for_non_agent_pane`.)_ - [ ] `[E2E]` `[MANUAL]` **Autofix with Copilot works:** Copilot returns a useful suggestion. - [ ] `[E2E]` **Autofix with non-Copilot agents works:** Autofix produces a usable suggestion with a non-Copilot built-in agent (Claude/Codex/Gemini) and a custom ACP agent — same path as Copilot, covered once across the available agents. +- [ ] `[E2E]` `[MANUAL]` **Environment-aware answers/fixes:** For a failed or "how do I use X" prompt, the agent investigates the live environment first — checks whether the command actually exists on PATH and surfaces local scripts / near-matches for a mistyped command — instead of giving generic advice or fixing a non-existent command. _(#306.)_ ### Autofix across layout changes @@ -209,7 +215,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[UT✓]` `[E2E]` **Slash command works:** `/sessions` opens the session view. _(UT: `/sessions` classify.)_ - [ ] `[UT✓]` `[E2E]` **Command action works:** The `openAgentSessions` action opens the session view. _(UT: `AgentActionsParse` verifies the action parses; opening the view is E2E.)_ - [ ] `[UT✓]` `[E2E]` **Session view empty state works:** Empty/no-session state is useful and not visually broken. _(UT: `render_sessions_view_shows_footer_hint` paints the agents-view chrome/footer with an empty registry; live data still E2E.)_ -- [ ] `[E2E]` **Session view refresh works:** Newly created sessions appear without restarting Terminal when hooks are active. +- [ ] `[E2E]` **Session view refresh works:** Pressing **F5** re-scans history on demand so sessions that appeared after launch show up without restarting Terminal — including a WSL distro started after Intelligent Terminal booted or a CLI session started in another shell. Works independently of whether session hooks are active. _(#344.)_ ### Session states @@ -218,7 +224,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[UT✓]` `[E2E]` **Waiting-for-input state is correct:** A session waiting for user input/attention shows the waiting/attention state. _(UT: Attention activity.)_ - [ ] `[UT✓]` `[E2E]` **Idle state is correct:** A live session waiting for the next prompt shows idle/ready state. - [ ] `[UT✓]` `[E2E]` **Ended state is correct:** A session whose pane was closed becomes ended and does not stay falsely live. _(UT: PaneClosed tombstone.)_ -- [ ] `[UT✓]` `[E2E]` **Historical state is correct:** On-disk sessions show as historical when not live. +- [ ] `[UT✓]` `[E2E]` **Historical state is correct:** Historical sessions (now sourced from the agent's ACP `session/list`, not on-disk file parsing) show as historical when not live. _(#365.)_ - [ ] `[UT✓]` `[E2E]` **State transitions are correct:** Live -> ended, historical -> live, and working -> idle transitions update without duplicate/stale rows. _(UT: `apply_alive_session_join` / `apply_master_session_ended`.)_ ### Focus and restore @@ -227,6 +233,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[UT✓]` `[E2E]` **Focus active stashed agent pane:** Selecting an active stashed agent-pane session restores/focuses the pane if applicable. - [ ] `[UT✓]` `[E2E]` **Restore old session:** Selecting a supported old session resumes it successfully. - [ ] `[UT✓]` `[E2E]` **Restore old shell-pane session:** Supported shell-pane sessions resume through the CLI resume path. _(UT: `ResumeCliFlag` decision.)_ +- [ ] `[UT~]` `[E2E]` **WSL distro sessions are visible and resumable:** Agent-CLI sessions that were run *inside* a WSL distro appear in the session list tagged `[WSL-]`, and Enter resumes the session in that distro (via the wsl.exe ACP bridge). _(#323; UT: WSL session sourcing/classification.)_ - [ ] `[UT✓]` `[E2E]` **Restore old agent-pane session:** Supported agent-pane sessions resume through agent-pane/session-load path when enabled. _(UT: `ResumeInAgentPane` decision.)_ - [ ] `[UT✓]` `[E2E]` **Unsupported restore is clear:** Unknown CLI, missing resume support, or missing on-disk session shows a clear not-resumable message. _(UT: `NotResumable` reasons.)_ - [ ] `[UT✓]` `[E2E]` **Enter behavior works:** Enter performs the expected focus/resume action. From 62206f387be58c8248c790ba7444c918be9d13d5 Mon Sep 17 00:00:00 2001 From: "Kai Tao (from Dev Box)" Date: Fri, 3 Jul 2026 10:55:14 +0800 Subject: [PATCH 2/5] docs(release-check-list): tag new v0.1.2 cases with [new] + add 5 more MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Second review pass over the release checklist against the v0.1.2 release notes: - Added a [new] coverage marker to the legend and tagged all 12 genuinely-new v0.1.2 test cases with it (the two reworded existing cases — F5 refresh, ACP-sourced historical — are corrections, not new cases, so they stay untagged). - Added 5 more previously-missing cases surfaced by a closer read: - FRE execution-policy detection is correct / no false-block on probe timeout (#336/#338/#309) - wta-master death is a consistent degraded state requiring /restart, no split-brain (#329) - Session titles are clean (no bare '# AGENTS.md instructions' heading) (#355) - Focus brings the target window to the foreground (#353) - Agent-created terminals inherit the active pane profile (#366, closes #351) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- doc/release-check-list.md | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/doc/release-check-list.md b/doc/release-check-list.md index 8210fb150..6d69d4e6f 100644 --- a/doc/release-check-list.md +++ b/doc/release-check-list.md @@ -9,6 +9,7 @@ Use this checklist to validate and sign off an Intelligent Terminal release. Eac - `[UT~]` — partially UT-coverable: decision/logic core can be unit-tested, full behavior still needs E2E/UI. - `[E2E]` — needs mock-ACP end-to-end or UI automation; not a UT. - `[MANUAL]` — human judgment (visual polish, real LLM quality, install/auth UX). +- `[new]` — test case newly added for this release (v0.1.2); not exercised in a prior sign-off. Orthogonal to the coverage markers above — read it alongside the `[UT*]`/`[E2E]`/`[MANUAL]` marker. > **Checkbox semantics:** a ticked `- [x]` box means the item is fully verified by an automated unit test (pure `[UT✓]` items). Items tagged `[UT✓]` *and* `[E2E]`/`[MANUAL]` keep the `[UT✓]` marker to show the logic core is unit-tested, but stay unchecked because release sign-off still needs the E2E / manual portion. @@ -41,6 +42,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[E2E]` **FRE privacy / help links work:** Links open the browser and do not block completion. - [ ] `[E2E]` **FRE save progress works:** The progress UI appears while setup/install work is running and returns to a usable state. - [ ] `[UT~]` `[E2E]` **FRE error messages are actionable:** Install/auth/setup failures show a useful message instead of a silent failure or raw OS error. _(UT: WTA `classify_acp_error`.)_ +- [ ] `[new]` `[UT~]` `[E2E]` **FRE execution-policy detection is correct:** FRE flags a genuinely blocking PowerShell execution policy but does **not** false-block when a load-induced execution-policy probe merely times out; an unknown/unreadable policy is treated conservatively (as blocking). _(#336/#338/#309; UT: execution-policy gate.)_ - [ ] `[UT~]` `[E2E]` **FRE respects policy locks:** If agent, autofix, or session-management policy is locked, affected controls are disabled and explain why. _(UT: `IsAgentPolicyLocked`, Effective* gates.)_ - [ ] `[UT~]` `[MANUAL]` **FRE RTL/localized layout is usable:** Layout mirrors correctly for RTL locales and text is not clipped in localized builds. _(UT: `IsRtlLocale`.)_ @@ -120,7 +122,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[E2E]` **Different positions work:** Open/hide/focus works for bottom, right, left, and top pane positions. - [ ] `[E2E]` **Stash preserves chat:** Hiding and restoring the pane preserves helper process, connection state, and chat history. - [ ] `[E2E]` **Tab close cleans up:** Closing the owning tab cleans up the helper and does not leave a broken pane. -- [ ] `[E2E]` **Agent panes are not persisted into saved layout:** Saving and restoring a window layout does not resurrect a previously-open agent pane; restored windows come back without an unexpected agent pane. _(#360/#275.)_ +- [ ] `[new]` `[E2E]` **Agent panes are not persisted into saved layout:** Saving and restoring a window layout does not resurrect a previously-open agent pane; restored windows come back without an unexpected agent pane. _(#360/#275.)_ ### Built-in agent chat matrix @@ -128,8 +130,9 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[UT~]` `[E2E]` **Copilot missing CLI path works:** Missing Copilot shows actionable setup/auth guidance, not a silent failure. _(UT: registry install hint.)_ - [ ] `[E2E]` **Non-Copilot agents chat works:** Each installed+authenticated non-Copilot built-in agent (Claude/Codex/Gemini) connects through its ACP adapter and answers a prompt. _(One consolidated matrix case — all built-in agents share the same agent-pane/ACP path, so per-agent behavioural depth is covered by the Copilot suites.)_ - [ ] `[UT~]` `[E2E]` **Agent auth failure works:** Unauthenticated agents show clear login guidance and can recover after sign-in. _(UT: `AgentFailure::AuthRequired` classification; in-pane auth screen render via `render_auth_screen_shows_agent_name` / `render_auth_sign_in_card` / `render_auth_checking_with_status_message`.)_ -- [ ] `[E2E]` **GitHub Enterprise Copilot sign-in works:** On the auth screen, pressing **E** lets the user enter a GHE domain (e.g. `*.ghe.com`) and sign in; the last-used host is remembered and the device-verification URL targets that host. _(#362.)_ +- [ ] `[new]` `[E2E]` **GitHub Enterprise Copilot sign-in works:** On the auth screen, pressing **E** lets the user enter a GHE domain (e.g. `*.ghe.com`) and sign in; the last-used host is remembered and the device-verification URL targets that host. _(#362.)_ - [ ] `[E2E]` **Agent restart after settings change works:** Changing the selected agent or model restarts/reconnects cleanly. +- [ ] `[new]` `[UT~]` `[E2E]` **Master death is a consistent degraded state:** If `wta-master` exits, the agent pane shows a single consistent degraded state and requires `/restart` to recover — no silent "split-brain" where it looks half-alive. _(#329.)_ ### Input and rendering @@ -137,7 +140,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[E2E]` **Prompt out-of-focus appearance is correct:** Input box looks correct when focus leaves the agent pane. - [ ] `[E2E]` **Typing works:** User can type, edit, and submit prompt text correctly. - [ ] `[E2E]` **Paste works:** Pasted multi-line text is handled correctly. -- [ ] `[UT✓]` `[E2E]` **Image paste (Alt+V) works:** A copied screenshot (`CF_DIB`/`CF_DIBV5`) or image file is sent to the agent as an ACP image content block; the action is gated on the agent advertising image support and is a no-op otherwise. _(UT: `clipboard_image` + `mock_agent_tests` `seen_images` side-channel; #354.)_ +- [ ] `[new]` `[UT✓]` `[E2E]` **Image paste (Alt+V) works:** A copied screenshot (`CF_DIB`/`CF_DIBV5`) or image file is sent to the agent as an ACP image content block; the action is gated on the agent advertising image support and is a no-op otherwise. _(UT: `clipboard_image` + `mock_agent_tests` `seen_images` side-channel; #354.)_ - [ ] `[E2E]` **Keyboard navigation works:** Arrow keys, Tab completion, Ctrl combinations, and Esc behave correctly. - [ ] `[E2E]` `[MANUAL]` **IME/non-ASCII input works:** IME and non-ASCII input are usable if the release supports localized typing. - [ ] `[UT✓]` `[E2E]` **Streaming output renders correctly:** Agent response chunks, tool calls, plans, and status lines render without corruption. _(UT: `streaming_two_chunks_coalesce_in_app_chat`, `tool_call_surfaces_card_in_chat`, `tool_call_completion_updates_card_status` (in-place, no dup), `plan_surfaces_card_in_chat`, `render_chat_all_message_variants`; streaming-JSON unwrap incl. emoji/surrogate pairs in `ui::chat::tests`.)_ @@ -173,8 +176,8 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not ### Shell integration and detection - [ ] `[E2E]` **PowerShell shell integration installed:** Supported PowerShell profiles emit command-finished events. -- [ ] `[E2E]` **Bash / WSL shell integration installed:** Supported bash and WSL-bash profiles emit command-finished events, and the injected `PROMPT_COMMAND` is safe under `set -u` (no errors in strict-mode shells). _(#340.)_ -- [ ] `[E2E]` **Shells self-report identity (`OSC 9001;ShellType`):** The terminal knows which shell owns a pane — including after a nested shell (`pwsh` → `wsl` → `exit`) returns — so autofix suggests commands for the *current* shell (no PowerShell suggestions inside a WSL/bash pane). `wtcli list-panes` exposes the live shell + version per pane. _(#345.)_ +- [ ] `[new]` `[E2E]` **Bash / WSL shell integration installed:** Supported bash and WSL-bash profiles emit command-finished events, and the injected `PROMPT_COMMAND` is safe under `set -u` (no errors in strict-mode shells). _(#340.)_ +- [ ] `[new]` `[E2E]` **Shells self-report identity (`OSC 9001;ShellType`):** The terminal knows which shell owns a pane — including after a nested shell (`pwsh` → `wsl` → `exit`) returns — so autofix suggests commands for the *current* shell (no PowerShell suggestions inside a WSL/bash pane). `wtcli list-panes` exposes the live shell + version per pane. _(#345.)_ - [ ] `[E2E]` **Missing shell integration is safe:** Without shell integration, failures do not crash or produce broken UI. - [x] `[UT✓]` **Failure detection works:** A failing command emits an event and is detected by Intelligent Terminal. _(UT: `classify_wt_event`.)_ - [x] `[UT✓]` **Successful commands ignored:** Successful commands do not trigger autofix. _(UT: `classify_wt_event` + `success_exit_code_does_not_arm_autofix`.)_ @@ -195,7 +198,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[UT✓]` `[E2E]` **Autofix target pane is correct:** Failure in one pane does not offer/run a fix in the wrong pane. _(UT: target-tab routing — busy-pane tests + `autofix_still_triggers_for_non_agent_pane`.)_ - [ ] `[E2E]` `[MANUAL]` **Autofix with Copilot works:** Copilot returns a useful suggestion. - [ ] `[E2E]` **Autofix with non-Copilot agents works:** Autofix produces a usable suggestion with a non-Copilot built-in agent (Claude/Codex/Gemini) and a custom ACP agent — same path as Copilot, covered once across the available agents. -- [ ] `[E2E]` `[MANUAL]` **Environment-aware answers/fixes:** For a failed or "how do I use X" prompt, the agent investigates the live environment first — checks whether the command actually exists on PATH and surfaces local scripts / near-matches for a mistyped command — instead of giving generic advice or fixing a non-existent command. _(#306.)_ +- [ ] `[new]` `[E2E]` `[MANUAL]` **Environment-aware answers/fixes:** For a failed or "how do I use X" prompt, the agent investigates the live environment first — checks whether the command actually exists on PATH and surfaces local scripts / near-matches for a mistyped command — instead of giving generic advice or fixing a non-existent command. _(#306.)_ ### Autofix across layout changes @@ -216,6 +219,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[UT✓]` `[E2E]` **Command action works:** The `openAgentSessions` action opens the session view. _(UT: `AgentActionsParse` verifies the action parses; opening the view is E2E.)_ - [ ] `[UT✓]` `[E2E]` **Session view empty state works:** Empty/no-session state is useful and not visually broken. _(UT: `render_sessions_view_shows_footer_hint` paints the agents-view chrome/footer with an empty registry; live data still E2E.)_ - [ ] `[E2E]` **Session view refresh works:** Pressing **F5** re-scans history on demand so sessions that appeared after launch show up without restarting Terminal — including a WSL distro started after Intelligent Terminal booted or a CLI session started in another shell. Works independently of whether session hooks are active. _(#344.)_ +- [ ] `[new]` `[E2E]` **Session titles are clean:** Session rows show a meaningful title and never a bare "# AGENTS.md instructions" Codex heading or raw markdown artifact. _(#355.)_ ### Session states @@ -230,10 +234,11 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not ### Focus and restore - [ ] `[UT✓]` `[E2E]` **Focus active session:** Selecting an active session navigates/focuses the existing pane. _(UT: `decide_enter_action` Focus.)_ +- [ ] `[new]` `[E2E]` **Focus brings the target window forward:** Focusing a session/pane that lives in another (background) window brings that window to the foreground, not just the pane within it. _(#353.)_ - [ ] `[UT✓]` `[E2E]` **Focus active stashed agent pane:** Selecting an active stashed agent-pane session restores/focuses the pane if applicable. - [ ] `[UT✓]` `[E2E]` **Restore old session:** Selecting a supported old session resumes it successfully. - [ ] `[UT✓]` `[E2E]` **Restore old shell-pane session:** Supported shell-pane sessions resume through the CLI resume path. _(UT: `ResumeCliFlag` decision.)_ -- [ ] `[UT~]` `[E2E]` **WSL distro sessions are visible and resumable:** Agent-CLI sessions that were run *inside* a WSL distro appear in the session list tagged `[WSL-]`, and Enter resumes the session in that distro (via the wsl.exe ACP bridge). _(#323; UT: WSL session sourcing/classification.)_ +- [ ] `[new]` `[UT~]` `[E2E]` **WSL distro sessions are visible and resumable:** Agent-CLI sessions that were run *inside* a WSL distro appear in the session list tagged `[WSL-]`, and Enter resumes the session in that distro (via the wsl.exe ACP bridge). _(#323; UT: WSL session sourcing/classification.)_ - [ ] `[UT✓]` `[E2E]` **Restore old agent-pane session:** Supported agent-pane sessions resume through agent-pane/session-load path when enabled. _(UT: `ResumeInAgentPane` decision.)_ - [ ] `[UT✓]` `[E2E]` **Unsupported restore is clear:** Unknown CLI, missing resume support, or missing on-disk session shows a clear not-resumable message. _(UT: `NotResumable` reasons.)_ - [ ] `[UT✓]` `[E2E]` **Enter behavior works:** Enter performs the expected focus/resume action. @@ -295,6 +300,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[UT~]` `[E2E]` **Multiple tabs work:** Each tab has its own agent pane/session state. _(UT: per-tab state.)_ - [ ] `[E2E]` **Multiple agent panes work:** Opening agent panes in multiple tabs does not mix conversations. - [ ] `[E2E]` **Move tab to new window preserves chat:** Dragging/tearing a tab to another window preserves agent pane state. +- [ ] `[new]` `[E2E]` **Agent-created terminals inherit the active profile:** A terminal/tab the agent opens inherits the active pane's profile (e.g. an agent working in an Ubuntu session spawns new tabs in Ubuntu, not the default PowerShell profile). _(#366, closes #351.)_ - [ ] `[UT~]` `[E2E]` **Move tab to new window preserves session routing:** Session events remain associated with the moved tab. _(UT: tab_id routing.)_ - [ ] `[UT~]` `[E2E]` **Move tab to new window preserves autofix:** Autofix still routes to the moved tab/pane. - [ ] `[UT~]` `[E2E]` **Multiple windows do not cross-route:** Events from one window do not mutate another window's agent pane/session UI. _(UT: window_id filter.)_ From 44039b5d14aff1f2314b682fa1194c5117b8f981 Mon Sep 17 00:00:00 2001 From: "Kai Tao (from Dev Box)" Date: Fri, 3 Jul 2026 11:48:51 +0800 Subject: [PATCH 3/5] test(e2e): add master-death degraded-state integration test (#329); drop rolled-back WSL session checklist item MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds Feature.AgentMasterDeath.Tests.ps1: kills this app's wta-master out from under a live helper and asserts the full #329 contract end-to-end — the pane leaves Connected, the / popup is filtered to only /restart, NO master is silently respawned while degraded (anti-split-brain), and /restart brings up exactly one fresh master and reconnects. Verified passing live against the deployed dev package. Also removes the '[new] WSL distro sessions are visible and resumable' checklist item (#323) since WSL in-distro session support is being rolled back for v0.1.2, and drops the WSL example from the F5 refresh item to match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- doc/release-check-list.md | 3 +- .../tests/Feature.AgentMasterDeath.Tests.ps1 | 109 ++++++++++++++++++ 2 files changed, 110 insertions(+), 2 deletions(-) create mode 100644 test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 diff --git a/doc/release-check-list.md b/doc/release-check-list.md index 6d69d4e6f..54ccaf764 100644 --- a/doc/release-check-list.md +++ b/doc/release-check-list.md @@ -218,7 +218,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[UT✓]` `[E2E]` **Slash command works:** `/sessions` opens the session view. _(UT: `/sessions` classify.)_ - [ ] `[UT✓]` `[E2E]` **Command action works:** The `openAgentSessions` action opens the session view. _(UT: `AgentActionsParse` verifies the action parses; opening the view is E2E.)_ - [ ] `[UT✓]` `[E2E]` **Session view empty state works:** Empty/no-session state is useful and not visually broken. _(UT: `render_sessions_view_shows_footer_hint` paints the agents-view chrome/footer with an empty registry; live data still E2E.)_ -- [ ] `[E2E]` **Session view refresh works:** Pressing **F5** re-scans history on demand so sessions that appeared after launch show up without restarting Terminal — including a WSL distro started after Intelligent Terminal booted or a CLI session started in another shell. Works independently of whether session hooks are active. _(#344.)_ +- [ ] `[E2E]` **Session view refresh works:** Pressing **F5** re-scans history on demand so sessions that appeared after launch show up without restarting Terminal — e.g. a CLI session started in another shell. Works independently of whether session hooks are active. _(#344.)_ - [ ] `[new]` `[E2E]` **Session titles are clean:** Session rows show a meaningful title and never a bare "# AGENTS.md instructions" Codex heading or raw markdown artifact. _(#355.)_ ### Session states @@ -238,7 +238,6 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[UT✓]` `[E2E]` **Focus active stashed agent pane:** Selecting an active stashed agent-pane session restores/focuses the pane if applicable. - [ ] `[UT✓]` `[E2E]` **Restore old session:** Selecting a supported old session resumes it successfully. - [ ] `[UT✓]` `[E2E]` **Restore old shell-pane session:** Supported shell-pane sessions resume through the CLI resume path. _(UT: `ResumeCliFlag` decision.)_ -- [ ] `[new]` `[UT~]` `[E2E]` **WSL distro sessions are visible and resumable:** Agent-CLI sessions that were run *inside* a WSL distro appear in the session list tagged `[WSL-]`, and Enter resumes the session in that distro (via the wsl.exe ACP bridge). _(#323; UT: WSL session sourcing/classification.)_ - [ ] `[UT✓]` `[E2E]` **Restore old agent-pane session:** Supported agent-pane sessions resume through agent-pane/session-load path when enabled. _(UT: `ResumeInAgentPane` decision.)_ - [ ] `[UT✓]` `[E2E]` **Unsupported restore is clear:** Unknown CLI, missing resume support, or missing on-disk session shows a clear not-resumable message. _(UT: `NotResumable` reasons.)_ - [ ] `[UT✓]` `[E2E]` **Enter behavior works:** Enter performs the expected focus/resume action. diff --git a/test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 b/test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 new file mode 100644 index 000000000..a638e6faf --- /dev/null +++ b/test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 @@ -0,0 +1,109 @@ +#Requires -Modules @{ ModuleName='Pester'; ModuleVersion='5.0.0' } +# Release checklist §2 "Master death is a consistent degraded state" (#329): +# If wta-master exits, the agent pane shows a single consistent degraded state and requires +# /restart to recover — no silent "split-brain" where one pane silently gets a fresh master +# while orphaned panes stay dead. +# +# This suite exercises the REAL failure by killing wta-master out from under a live helper and +# asserting the whole contract end-to-end (not just the Rust unit-tested latch): +# 1. the pane leaves Connected (the degraded state is entered), +# 2. the `/` popup is filtered to ONLY /restart (the locale-independent degraded signal), +# 3. NO master is silently respawned while degraded (the anti-split-brain guarantee), +# 4. /restart brings up exactly one fresh master and the pane reconnects. +# +# Architecture facts this relies on (verified live + tools/wta/AGENTS.md): +# * master = `wta.exe --master ` (spawned once by C++ SharedWta) +# * helper = `wta.exe --connect-master ` (one per agent pane) +# * both are children of the WindowsTerminal.exe process (this app's Pid), so we kill ONLY +# this app's master and never another instance's. +# * the helper detects master death proactively (client.rs: "pipe closed (master gone)" -> +# AgentFailure::TransportLost -> connection.lost), so no prompt is needed to trigger it. +# +# Invoke-Pester test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 -Tag Feature + +BeforeDiscovery { $script:Ready = [bool]((Get-AppxPackage | Where-Object { $_.Name -like '*IntelligentTerminal*' }) -and (Get-Command copilot -ErrorAction SilentlyContinue) -and (Get-Command winapp -ErrorAction SilentlyContinue)) } + +Describe 'Feature §2 master death is a consistent degraded state (#329)' -Tag 'Feature' -Skip:(-not $script:Ready) { + BeforeAll { + Import-Module (Join-Path $PSScriptRoot '..\ItE2E\ItE2E.psd1') -Force + $script:app = Start-Terminal -Package (Get-ItTestPackage) -PassFre $true -Settings @{ acpAgent = 'copilot' } + Open-AgentPane -App $script:app | Out-Null + Wait-AgentReady -App $script:app -TimeoutSec 90 | Should -BeTrue -Because 'the agent pane must be connected before we can test losing the connection' + + # This app's wta-master(s): a wta.exe child of THIS WindowsTerminal.exe launched with + # `--master` (not `--connect-master`). Scoping to the app's Pid guarantees we never touch + # another instance's master. + $script:GetMasters = { + @(Get-CimInstance Win32_Process -Filter "Name='wta.exe'" -ErrorAction SilentlyContinue | + Where-Object { + $_.ParentProcessId -eq $script:app.Pid -and + $_.CommandLine -match '--master(\s|$|")' -and + $_.CommandLine -notmatch '--connect-master' + }) + } + # The degraded hint reuses the localized `connection.lost` string; match it across every + # bundled locale so the assertion is locale-robust (the running build renders one locale). + $script:LostRe = Get-WtaLocalizedTextRegex -Key 'connection.lost' + if (-not $script:LostRe) { $script:LostRe = '(?i)connection.*lost|/restart' } + } + AfterAll { if ($script:app) { Stop-Terminal -App $script:app } } + + It 'killing wta-master enters a degraded state, blocks all slash commands but /restart, does not silently respawn, and recovers via /restart' { + # --- there is exactly one live master while connected --- + $masters = & $script:GetMasters + $masters.Count | Should -BeGreaterThan 0 -Because 'a connected agent pane implies a running wta-master' + $killedPids = @($masters.ProcessId) + + # --- kill the master out from under the live helper --- + foreach ($mp in $killedPids) { Stop-Process -Id $mp -Force -ErrorAction SilentlyContinue } + Wait-Until -TimeoutSec 15 -Because 'the killed wta-master process(es) to be gone' -Condition { + @(& $script:GetMasters | Where-Object { $killedPids -contains $_.ProcessId }).Count -eq 0 + } | Out-Null + + # --- (1) the pane leaves Connected: the connected input placeholder disappears --- + # Wait-AgentReady returns false once the pane is no longer user-visibly connected. A short + # timeout is enough because the helper detects the dead pipe proactively. + $stillReady = Wait-AgentReady -App $script:app -TimeoutSec 20 + $stillReady | Should -BeFalse -Because 'after master dies the pane must drop out of the Connected state, not appear half-alive' + + # --- degraded hint is surfaced somewhere in the pane (best-effort, locale-robust) --- + Test-AgentPopupShown -App $script:app -Pattern $script:LostRe -TimeoutSec 20 | + Should -BeTrue -Because 'the degraded pane must tell the user the connection was lost and to /restart' + + # --- (2) the `/` popup is filtered to ONLY /restart --- + # Type '/' and read the rendered TUI menu. In transport_lost, command_popup_state() is + # filtered to /restart only (slash_command_tests.rs), so no other command may appear. + Clear-AgentInput -App $script:app | Out-Null + Send-AgentPrompt -App $script:app -Text '/' -NoSubmit | Out-Null + Wait-Until -TimeoutSec 12 -Because 'the degraded /restart popup to render' -Condition { + (Get-AgentPaneText -App $script:app -MaxLines 40) -match '/restart' + } | Out-Null + $menu = Get-AgentPaneText -App $script:app -MaxLines 40 + $menu | Should -Match '/restart' -Because 'the degraded popup must still offer /restart (the one recovery command)' + # No other slash command may be offered while degraded — each would hit the dead pipe. + foreach ($blocked in '/help', '/clear', '/new', '/sessions', '/model', '/fix') { + $menu | Should -Not -Match ([regex]::Escape($blocked)) -Because "while degraded the popup must hide $blocked (only /restart runs)" + } + + # --- (3) NO silent respawn: while degraded, master count must stay 0 --- + # PR #329's core anti-split-brain guarantee: opening/using a degraded pane must NOT lazily + # respawn a fresh master. Poll for a few seconds to catch a late respawn. + for ($i = 0; $i -lt 6; $i++) { + @(& $script:GetMasters).Count | Should -Be 0 -Because 'no wta-master may be silently respawned while the stack is degraded (that would be split-brain)' + Start-Sleep -Milliseconds 500 + } + + # --- (4) /restart recovers: one fresh master, pane reconnects --- + Clear-AgentInput -App $script:app | Out-Null + Send-AgentPrompt -App $script:app -Text '/restart' | Out-Null # type it and submit + Wait-AgentReady -App $script:app -TimeoutSec 90 | Should -BeTrue -Because '/restart must respawn the stack and reconnect the pane' + + $recovered = & $script:GetMasters + $recovered.Count | Should -Be 1 -Because '/restart must bring up exactly one master (not zero, not a split-brain duplicate)' + ($killedPids -contains $recovered[0].ProcessId) | Should -BeFalse -Because 'the recovered master must be a genuinely fresh process, not the killed one' + + # The reconnected pane must actually work again. + Send-AgentPrompt -App $script:app -Text 'What is 7 plus 7? Reply with only the number.' | Out-Null + Assert-AgentPaneText -App $script:app -Pattern '\b14\b' -TimeoutSec 50 + } +} From bca88d67656942619b24cc78d6f5c73afd299b5e Mon Sep 17 00:00:00 2001 From: "Kai Tao (from Dev Box)" Date: Fri, 3 Jul 2026 12:50:13 +0800 Subject: [PATCH 4/5] Address Copilot review: tighten master-count assertion and clarify Alt+V behavior MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Feature.AgentMasterDeath.Tests.ps1: assert exactly one wta-master while connected (-Be 1, was -BeGreaterThan 0) so the connected-state gate itself catches a split-brain regression. Re-verified passing live. - release-check-list.md: correct the Alt+V item — Alt+V queues the image until the next prompt, and an unsupported agent / empty clipboard surfaces a clear system message rather than being a silent no-op. The [WSL-] raw-angle-bracket comment is already resolved: that checklist item was removed with the WSL session rollback. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- doc/release-check-list.md | 2 +- test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 | 6 +++++- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/doc/release-check-list.md b/doc/release-check-list.md index 54ccaf764..cdd726647 100644 --- a/doc/release-check-list.md +++ b/doc/release-check-list.md @@ -140,7 +140,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `[E2E]` **Prompt out-of-focus appearance is correct:** Input box looks correct when focus leaves the agent pane. - [ ] `[E2E]` **Typing works:** User can type, edit, and submit prompt text correctly. - [ ] `[E2E]` **Paste works:** Pasted multi-line text is handled correctly. -- [ ] `[new]` `[UT✓]` `[E2E]` **Image paste (Alt+V) works:** A copied screenshot (`CF_DIB`/`CF_DIBV5`) or image file is sent to the agent as an ACP image content block; the action is gated on the agent advertising image support and is a no-op otherwise. _(UT: `clipboard_image` + `mock_agent_tests` `seen_images` side-channel; #354.)_ +- [ ] `[new]` `[UT✓]` `[E2E]` **Image paste (Alt+V) works:** A copied screenshot (`CF_DIB`/`CF_DIBV5`) or image file is queued and sent to the agent as an ACP image content block on the next prompt; the action is gated on the agent advertising image support. When the agent does not support images, or the clipboard has no image, it does not paste but surfaces a clear system message (e.g. "image not supported" / "clipboard empty") rather than silently ignoring the keypress. _(UT: `clipboard_image` + `mock_agent_tests` `seen_images` side-channel; #354.)_ - [ ] `[E2E]` **Keyboard navigation works:** Arrow keys, Tab completion, Ctrl combinations, and Esc behave correctly. - [ ] `[E2E]` `[MANUAL]` **IME/non-ASCII input works:** IME and non-ASCII input are usable if the release supports localized typing. - [ ] `[UT✓]` `[E2E]` **Streaming output renders correctly:** Agent response chunks, tool calls, plans, and status lines render without corruption. _(UT: `streaming_two_chunks_coalesce_in_app_chat`, `tool_call_surfaces_card_in_chat`, `tool_call_completion_updates_card_status` (in-place, no dup), `plan_surfaces_card_in_chat`, `render_chat_all_message_variants`; streaming-JSON unwrap incl. emoji/surrogate pairs in `ui::chat::tests`.)_ diff --git a/test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 b/test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 index a638e6faf..381521cce 100644 --- a/test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 +++ b/test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 @@ -51,7 +51,11 @@ Describe 'Feature §2 master death is a consistent degraded state (#329)' -Tag ' It 'killing wta-master enters a degraded state, blocks all slash commands but /restart, does not silently respawn, and recovers via /restart' { # --- there is exactly one live master while connected --- $masters = & $script:GetMasters - $masters.Count | Should -BeGreaterThan 0 -Because 'a connected agent pane implies a running wta-master' + # A connected agent pane runs EXACTLY ONE wta-master per WindowsTerminal process — the + # C++ SharedWta singleton owns a single master and fans every pane onto it. Asserting + # `-Be 1` (not just `> 0`) makes this gate itself catch a split-brain regression where a + # second master already exists while connected. + $masters.Count | Should -Be 1 -Because 'a connected agent pane implies exactly one shared wta-master (SharedWta is a singleton)' $killedPids = @($masters.ProcessId) # --- kill the master out from under the live helper --- From 88f6d64366529deb31523477cba4aff9e75c4bed Mon Sep 17 00:00:00 2001 From: "Kai Tao (from Dev Box)" Date: Fri, 3 Jul 2026 13:13:57 +0800 Subject: [PATCH 5/5] Address Copilot review round 2: block /stop in degraded popup test, release-agnostic [new] marker, spelling fix - Feature.AgentMasterDeath.Tests.ps1: add /stop to the blocked-slash-command list so a regression that offers /stop while transport-lost is caught (it's a real CommandKind::Stop). Re-verified passing live. - release-check-list.md: reword the [new] marker in release-agnostic terms and say when to clear it (was hard-coded to v0.1.2); fix forbidden spelling non-existent -> nonexistent (check-spelling CI). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- doc/release-check-list.md | 4 ++-- test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/release-check-list.md b/doc/release-check-list.md index 3cd7f41bb..30491c395 100644 --- a/doc/release-check-list.md +++ b/doc/release-check-list.md @@ -9,7 +9,7 @@ Use this checklist to validate and sign off an Intelligent Terminal release. Eac - `[UT~]` — partially UT-coverable: decision/logic core can be unit-tested, full behavior still needs E2E/UI. - `[E2E]` — needs mock-ACP end-to-end or UI automation; not a UT. - `[MANUAL]` — human judgment (visual polish, real LLM quality, install/auth UX). -- `[new]` — test case newly added for this release (v0.1.2); not exercised in a prior sign-off. Orthogonal to the coverage markers above — read it alongside the `[UT*]`/`[E2E]`/`[MANUAL]` marker. +- `[new]` — test case added for the current release cycle and not yet exercised in a prior sign-off. Orthogonal to the coverage markers above — read it alongside the `[UT*]`/`[E2E]`/`[MANUAL]` marker. Clear the `[new]` tag once the item has been through a release sign-off (it then becomes an ordinary tracked item). > **Checkbox semantics:** a ticked `- [x]` box means the item is fully verified by an automated unit test (pure `[UT✓]` items). Items tagged `[UT✓]` *and* `[E2E]`/`[MANUAL]` keep the `[UT✓]` marker to show the logic core is unit-tested, but stay unchecked because release sign-off still needs the E2E / manual portion. @@ -200,7 +200,7 @@ Net effect: UT shrinks the manual matrix to "did the wiring and UI connect", not - [ ] `C102` `[UT✓]` `[E2E]` **Autofix target pane is correct:** Failure in one pane does not offer/run a fix in the wrong pane. _(UT: target-tab routing — busy-pane tests + `autofix_still_triggers_for_non_agent_pane`.)_ - [ ] `C103` `[E2E]` `[MANUAL]` **Autofix with Copilot works:** Copilot returns a useful suggestion. - [ ] `C104` `[E2E]` **Autofix with non-Copilot agents works:** Autofix produces a usable suggestion with a non-Copilot built-in agent (Claude/Codex/Gemini) and a custom ACP agent — same path as Copilot, covered once across the available agents. -- [ ] `C221` `[new]` `[E2E]` `[MANUAL]` **Environment-aware answers/fixes:** For a failed or "how do I use X" prompt, the agent investigates the live environment first — checks whether the command actually exists on PATH and surfaces local scripts / near-matches for a mistyped command — instead of giving generic advice or fixing a non-existent command. _(#306.)_ +- [ ] `C221` `[new]` `[E2E]` `[MANUAL]` **Environment-aware answers/fixes:** For a failed or "how do I use X" prompt, the agent investigates the live environment first — checks whether the command actually exists on PATH and surfaces local scripts / near-matches for a mistyped command — instead of giving generic advice or fixing a nonexistent command. _(#306.)_ ### Autofix across layout changes diff --git a/test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 b/test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 index 381521cce..da31c59c3 100644 --- a/test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 +++ b/test/e2e/tests/Feature.AgentMasterDeath.Tests.ps1 @@ -85,7 +85,7 @@ Describe 'Feature §2 master death is a consistent degraded state (#329)' -Tag ' $menu = Get-AgentPaneText -App $script:app -MaxLines 40 $menu | Should -Match '/restart' -Because 'the degraded popup must still offer /restart (the one recovery command)' # No other slash command may be offered while degraded — each would hit the dead pipe. - foreach ($blocked in '/help', '/clear', '/new', '/sessions', '/model', '/fix') { + foreach ($blocked in '/help', '/clear', '/new', '/sessions', '/model', '/fix', '/stop') { $menu | Should -Not -Match ([regex]::Escape($blocked)) -Because "while degraded the popup must hide $blocked (only /restart runs)" }