feat(windows): add temporary launch support for cc-switch start claude/codex #135
Open
AloneAtWar wants to merge 16 commits into SaladDay:main from
CreateProcessW does not search PATH when lpApplicationName is non-NULL, so launching codex.cmd through a relative `cmd.exe` failed for shim-installed CLIs. Mirror the Claude branch by passing NULL for the application name on the .cmd/.bat path and only passing the resolved binary path for the direct-binary case.
Previously every native arg ending with `\` was rejected on the cmd.exe /c shim path, blocking benign Windows paths like `C:\work\` or `--project-dir=C:\tmp\`. A trailing `\` only escapes a closing `"`, so the hazard is real only when the arg also forces cmd quoting. Extract `is_cmd_shim` (case-insensitive `.cmd`/`.bat`) and `arg_requires_cmd_quote` helpers in both claude and codex temp_launch.rs. Use the helper for application_name selection, and reject trailing `\` only when an arg also requires cmd quoting. Add 12 Windows-only parity tests covering helper behavior, the plain-trailing-backslash accept path, the unsafe-quote+trailing-backslash reject path, and direct-binary passthrough.
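The helper split described in this commit can be sketched roughly as follows. This is an illustration only, not the PR's code: the exact set of characters that forces cmd quoting is an assumption, and `trailing_backslash_is_unsafe` is a hypothetical name for the combined check.

```rust
use std::path::Path;

/// True when the target is a `.cmd` or `.bat` shim; the extension is
/// compared case-insensitively, so `CODEX.CMD` also matches.
pub fn is_cmd_shim(program: &str) -> bool {
    Path::new(program)
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| ext.eq_ignore_ascii_case("cmd") || ext.eq_ignore_ascii_case("bat"))
        .unwrap_or(false)
}

/// Assumed rule: an argument must be quoted for cmd.exe when it is empty or
/// contains whitespace or a cmd metacharacter.
pub fn arg_requires_cmd_quote(arg: &str) -> bool {
    arg.is_empty()
        || arg
            .chars()
            .any(|c| c.is_whitespace() || "&|<>^()\"".contains(c))
}

/// A trailing backslash is only a hazard when the argument will also be
/// wrapped in quotes, because it would then escape the closing quote.
pub fn trailing_backslash_is_unsafe(arg: &str) -> bool {
    arg.ends_with('\\') && arg_requires_cmd_quote(arg)
}
```

Under these assumptions `C:\work\` and `--project-dir=C:\tmp\` pass, while `C:\my work\` is rejected because the space forces quoting.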
Drop the now-unused `pub(crate) fn build_command_windows` from
claude_temp_launch.rs. The live Windows path goes through
`build_windows_cmdline` + `is_cmd_shim`, so the case-sensitive
`ends_with(".cmd")` helper was just a parity hazard for future readers.
Gate `use std::time::{SystemTime, UNIX_EPOCH};` with `#[cfg(not(windows))]`
in both temp_launch.rs files. On Windows the timestamp comes from
`current_process_creation_time_nanos()`, so the imports were unused.
On Windows the temp filename used `process_creation_time` as the timestamp
component, which is constant for the same process. Same-provider launches
within one cc-switch process therefore always collided on the temp file or
codex_home directory name.
Insert an 8-hex `LAUNCH_SEQ` atomic counter between provider and pid in the
filename / dirname:
cc-switch-claude-{provider}-{seq}-{pid}-{timestamp}.json
cc-switch-codex-{provider}-{seq}-{pid}-{timestamp}
Pid and timestamp remain the last two `-`-separated segments, so the
existing `orphan_scan::parse_cc_switch_name` parser keeps working without
changes.
Add tests:
- `write_temp_settings_file_uses_unique_filename_per_call` (claude) verifies
two consecutive calls produce different paths.
- `parse_*_with_launch_seq_segment` (orphan_scan) verify the parser still
extracts pid + nanos from the longer format.
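A minimal sketch of why the longer name stays parseable, assuming the parser simply takes the last two `-`-separated segments. `temp_settings_name` and `parse_pid_and_nanos` are hypothetical names used for illustration; the real parser is `orphan_scan::parse_cc_switch_name`, whose signature is not shown here.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Process-wide launch counter; each call to `temp_settings_name` takes the next value.
static LAUNCH_SEQ: AtomicU32 = AtomicU32::new(0);

/// Builds `cc-switch-claude-{provider}-{seq}-{pid}-{timestamp}.json`,
/// with `seq` rendered as 8 hex digits.
pub fn temp_settings_name(provider: &str, pid: u32, timestamp_nanos: u128) -> String {
    let seq = LAUNCH_SEQ.fetch_add(1, Ordering::Relaxed);
    format!("cc-switch-claude-{provider}-{seq:08x}-{pid}-{timestamp_nanos}.json")
}

/// Reads pid and timestamp from the last two `-`-separated segments, so any
/// extra segments in front of them (such as the sequence) are ignored.
pub fn parse_pid_and_nanos(name: &str) -> Option<(u32, u128)> {
    let stem = name.strip_suffix(".json").unwrap_or(name);
    let mut parts = stem.rsplitn(3, '-');
    let nanos = parts.next()?.parse::<u128>().ok()?;
    let pid = parts.next()?.parse::<u32>().ok()?;
    parts.next()?; // a non-empty prefix must remain in front of pid and timestamp
    Some((pid, nanos))
}
```

Two calls with the same provider, pid, and timestamp yield different names because only the sequence segment changes, and the parser still recovers the pid and timestamp from the tail.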
If `Job::create_with_kill_on_close()` failed, the suspended child process leaked (open handles plus a zombie suspended process). Mirror the defensive cleanup pattern already used in claude_temp_launch.rs: TerminateProcess + CloseHandle on both handles before returning the error.
… path When `ResumeThread` failed, `job.terminate()` was used to kill the suspended child. But `try_assign` earlier only warns-and-continues on failure; if the process never made it into the job, `TerminateJobObject` would do nothing and the suspended child would leak. Replace with explicit `TerminateProcess(h_process, 1)` so the cleanup is correct in both single- and double-failure paths, matching the pattern in codex_temp_launch.rs. Also drop the now-unused `Job::terminate` helper.
Extract duplicated Windows logic from claude_temp_launch.rs and codex_temp_launch.rs into a new windows_temp_launch.rs module:
- is_cmd_shim, arg_requires_cmd_quote
- quote_windows_arg, quote_windows_arg_for_cmd
- build_windows_command_line, build_env_block_with_override
- ScopedConsoleCtrlHandler, Job
- spawn_suspended_createprocessw, wait_for_child
- restrict_to_owner, create_secret_temp_file
Also add AppError constructors for Windows Job Object failures.
Add an automated Windows smoke test covering:
- spawn suspended child via CreateProcessW
- Job Object creation and assignment
- ResumeThread + wait + exit code propagation
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Sort env block alphabetically in build_env_block_with_override per CreateProcessW docs requirement.
- Add validate_cmd_arg helper with visible stderr warnings for %/! and hard rejections for quotes and unsafe trailing backslashes. Validate both user native args and internally-constructed args (executable path, settings path, codex_home) in cmd shim mode.
- Extract shared run_suspended_child helper to eliminate drift between claude and codex Windows exec paths.
- Implement atomic file/dir creation with owner-only ACL via CreateFileW/CreateDirectoryW + SECURITY_DESCRIPTOR, eliminating the TOCTOU window identified by codex review.
- Add Win32_Storage_FileSystem feature to windows-sys for CreateFileW/CreateDirectoryW.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Make cmd.exe % and ! expansion hard errors instead of warnings. Adding CmdArgError::Percent and CmdArgError::Exclamation variants; validate_cmd_arg now rejects these characters to prevent real command-injection paths through cmd.exe /c (reproduced by codex).
- Set SE_DACL_PROTECTED on security descriptors created by create_secret_file_with_acl and create_secret_dir_with_acl. This blocks inheritable ACEs from the parent directory, eliminating the TOCTOU window where inherited permissions could read secret temp files before restrict_to_owner was called.
- Add automated test create_secret_file_with_acl_has_protected_dacl that reads the security descriptor back and verifies the DACL is protected and present.
- Add automated test validate_cmd_arg_rejects_percent_and_exclamation.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
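The validation rules from this commit and the previous one can be illustrated with a small sketch. Only `CmdArgError::Percent` and `CmdArgError::Exclamation` are named in the commit text; the `Quote` and `TrailingBackslash` variant names, the order of the checks, and the set of characters that forces quoting are assumptions made for this example.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum CmdArgError {
    Percent,
    Exclamation,
    Quote,             // hypothetical variant name
    TrailingBackslash, // hypothetical variant name
}

/// Rejects arguments that cmd.exe could expand or mis-parse on the
/// `cmd.exe /c` shim path.
pub fn validate_cmd_arg(arg: &str) -> Result<(), CmdArgError> {
    if arg.contains('%') {
        return Err(CmdArgError::Percent); // %VAR% environment expansion
    }
    if arg.contains('!') {
        return Err(CmdArgError::Exclamation); // !VAR! delayed expansion
    }
    if arg.contains('"') {
        return Err(CmdArgError::Quote);
    }
    // Assumed trigger set for quoting: empty, whitespace, or cmd metacharacters.
    let needs_quote = arg.is_empty()
        || arg.chars().any(|c| c.is_whitespace() || "&|<>^()".contains(c));
    if needs_quote && arg.ends_with('\\') {
        // The backslash would escape the closing quote added by the quoter.
        return Err(CmdArgError::TrailingBackslash);
    }
    Ok(())
}
```

Returning a typed error rather than printing a warning lets the caller refuse the launch before anything reaches cmd.exe.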
The previous commit removed OpenOptions from the top-level imports, breaking Unix compilation because create_secret_temp_file on Unix still uses OpenOptions::new(). Gate the import behind #[cfg(unix)] to avoid Windows unused-import warnings while keeping Unix builds working. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When launching .cmd/.bat shims, both Claude and Codex wrappers were
passing unqualified 'cmd.exe' as lpApplicationName (or NULL), which
lets CreateProcessW search the current directory first. A rogue cmd.exe
in the workspace could be executed instead of the system binary.
Add resolve_system_cmd_exe() helper that uses which::which('cmd.exe')
with a ComSpec fallback, and pass the absolute path as
lpApplicationName while keeping 'cmd.exe' in the command line string
so build_windows_command_line still recognizes it for proper quoting.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
which::which and ComSpec are both environment-influenceable, so a hijacked PATH or ComSpec could still redirect .cmd/.bat launches to a rogue binary. Use GetSystemDirectoryW to ask the OS directly for the system directory, then append cmd.exe. This is the trusted path. Also avoid unconditionally resolving cmd.exe for direct .exe launches in the Codex wrapper; only resolve it when is_cmd_shim is true. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… block CreateProcessW docs require callers to explicitly preserve =X: per-drive current-directory entries when supplying a custom env block. Update build_env_block_with_override to separate drive vars from regular vars, keep drive vars in original order, sort regular vars alphabetically, and place drive vars first in the output block. Add automated test verifying sorting, override replacement, and double-null termination. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
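The block layout this commit describes might look like the sketch below. It is not the PR's `build_env_block_with_override`: the function name and signature are assumptions, and sorting by upper-cased name is one plausible reading of "alphabetically".

```rust
/// Builds a UTF-16 environment block: `=X:` drive entries first (original
/// order), then regular variables sorted by name, each entry NUL-terminated,
/// with one extra NUL closing the block.
pub fn env_block_sketch(
    vars: &[(String, String)],
    override_key: &str,
    override_val: &str,
) -> Vec<u16> {
    let mut drive: Vec<(String, String)> = Vec::new();
    let mut regular: Vec<(String, String)> = Vec::new();

    for (k, v) in vars {
        if k.starts_with('=') {
            drive.push((k.clone(), v.clone()));
        } else if !k.eq_ignore_ascii_case(override_key) {
            regular.push((k.clone(), v.clone()));
        }
    }
    // The override replaces any existing variable with the same name.
    regular.push((override_key.to_string(), override_val.to_string()));
    regular.sort_by_key(|(k, _)| k.to_uppercase());

    let mut block = Vec::new();
    for (k, v) in drive.iter().chain(regular.iter()) {
        block.extend(format!("{k}={v}").encode_utf16());
        block.push(0);
    }
    block.push(0); // second NUL terminates the whole block
    block
}
```

A block built this way would be passed to `CreateProcessW` together with the `CREATE_UNICODE_ENVIRONMENT` flag, since the entries are UTF-16.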
AssignProcessToJobObject can fail with ERROR_ACCESS_DENIED when the parent is already inside a job that prohibits nesting. This is an expected graceful degradation, but the previous code used log::warn! which is invisible at the default error log level.
- Check the raw OS error code: ACCESS_DENIED → visible eprintln! warning so users know KILL_ON_JOB_CLOSE was lost.
- Any other error code → unexpected failure: terminate the child, clean up handles, and return a hard AppError.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Write .child-meta sidecar with actual child PID and creation time so
orphan_scan judges by child alive state instead of launcher PID.
This prevents nested-job fallback from deleting a still-running
CODEX_HOME when the launcher dies first. [windows_temp_launch.rs]
- Add Linux /proc/{pid}/stat starttime validation to detect PID reuse
in orphan_scan Unix branch. [orphan_scan.rs]
- Fix windows-start-qa.ps1 M2 to recursively detect descendants (e.g.
node.exe from npm .cmd shims) via CIM instead of Get-Process -Name.
- Also reap orphaned .child-meta.tmp crash residuals.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
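The Linux start-time check mentioned in this commit can be sketched with pure functions. All names here are hypothetical; the clock tick rate, normally obtained from `sysconf(_SC_CLK_TCK)`, is passed in as a parameter to keep the example self-contained; and the 2-second tolerance is the one stated in the PR description.

```rust
/// Extracts field 22 (`starttime`, in clock ticks since boot) from one line
/// of `/proc/{pid}/stat`.
pub fn parse_starttime_ticks(stat_line: &str) -> Option<u64> {
    // Field 2 (comm) is parenthesized and may itself contain spaces or
    // parentheses, so resume parsing after the last ')'.
    let after_comm = &stat_line[stat_line.rfind(')')? + 1..];
    // `after_comm` starts at field 3, so field 22 is at index 19.
    after_comm.split_whitespace().nth(19)?.parse().ok()
}

/// Absolute start time in nanoseconds since the Unix epoch, from the
/// `btime` value in `/proc/stat` (boot time in seconds) plus `starttime`.
pub fn start_time_nanos(btime_secs: u64, starttime_ticks: u64, ticks_per_sec: u64) -> u128 {
    let boot = btime_secs as u128 * 1_000_000_000;
    boot + starttime_ticks as u128 * 1_000_000_000 / ticks_per_sec as u128
}

/// The PID counts as reused when the live process started more than two
/// seconds after the timestamp recorded in the temp entry's name.
pub fn pid_was_reused(recorded_nanos: u128, actual_start_nanos: u128) -> bool {
    const TOLERANCE_NANOS: u128 = 2_000_000_000;
    actual_start_nanos > recorded_nanos + TOLERANCE_NANOS
}
```

Splitting on the last `)` matters because a process name such as `my (weird) proc` would otherwise shift every later field.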
Closes #134
Summary
This PR adds full Windows support for `cc-switch start claude` and `cc-switch start codex`, implementing the complete lifecycle from process spawning through secure temp file creation to reliable child-process cleanup. It also hardens the cross-platform orphan scan so it can accurately distinguish living children from stale temp entries even under PID reuse.

What Changed
1. Core Windows Temp Launch Module (`windows_temp_launch.rs`)

Extracted duplicated Windows logic from both Claude and Codex temp launch paths into a shared module. Key responsibilities:
- Job Object with `KILL_ON_JOB_CLOSE`. If the child is successfully assigned, the OS guarantees termination when the launcher exits. If assignment fails with `ERROR_ACCESS_DENIED` (parent already in a nested job), we degrade gracefully with a visible warning and continue, relying on orphan scan for eventual cleanup.
- `cc-switch start` behaves like a transparent wrapper.

2. `.cmd`/`.bat` Shim Handling

npm-installed CLIs (e.g. `claude`, `codex`) are `.cmd` shims that must be launched via `cmd.exe /c`. This path is inherently more complex than direct binary execution:
- For `.cmd`/`.bat` targets, `lpApplicationName` must be `NULL` so CreateProcessW searches PATH; for direct `.exe` targets, we pass the resolved binary path.
- `cmd.exe` is resolved via `GetSystemDirectoryW` (trusted OS API) rather than `which::which` or `ComSpec`, both of which are influenceable by environment variables.
- `build_windows_command_line` recognizes the `cmd.exe /c` prefix and applies cmd-specific quoting rules (e.g. doubling internal quotes). User native args are validated: trailing backslashes combined with quotes are rejected because they escape the closing quote in cmd.exe; plain trailing backslashes (e.g. `C:\work\`) are allowed when no cmd quoting is required.
- `%` and `!` are rejected entirely because cmd.exe expands them as environment variables and delayed expansion tokens, creating command-injection paths.

3. Security Hardening
- `create_secret_temp_file` and `create_secret_dir_with_acl` use `CreateFileW`/`CreateDirectoryW` with an explicit `SECURITY_DESCRIPTOR` that sets a DACL granting `GENERIC_ALL` only to the current user. `SE_DACL_PROTECTED` is also set to block inheritable ACEs from the parent directory, eliminating a TOCTOU window.
- Internally constructed args (executable path, settings path, `codex_home`) are validated before reaching cmd.exe.

4. Orphan Scan Overhaul (`orphan_scan.rs`)

The orphan scan is responsible for cleaning up temp files/directories left behind by crashed or force-killed launches. Three major improvements:
- A `.child-meta` sidecar file is written atomically (tmp+rename) containing `{child_pid}:{creation_time_nanos}`. The scanner now checks the sidecar first; if present, it verifies whether the child process is alive using `OpenProcess` + `GetProcessTimes` creation-time comparison. This fixes the nested-job fallback scenario where the launcher dies but the child survives; previously the scanner would see the dead launcher PID and delete the still-in-use `CODEX_HOME`.
- `kill(pid, 0)` alone cannot distinguish a reused PID. We now read `/proc/{pid}/stat` field 22 (`starttime`) and the `btime` value from `/proc/stat` to compute the absolute process start time in nanoseconds. If the on-disk start time is more than 2 seconds later than the file's recorded nanos, the PID is considered reused and the entry is cleaned.
- `.child-meta` files whose main temp entry no longer exists are cleaned up, bounding long-term accumulation. `.child-meta.tmp` crash residuals are also cleaned.

5. Environment Block Handling
Windows `CreateProcessW` requires per-drive current-directory variables (`=C:`, `=D:`, etc.) to appear first in a custom environment block, followed by regular variables in alphabetical order. `build_env_block_with_override` now separates drive vars from regular vars, preserves drive vars in original order, sorts regular vars alphabetically, and concatenates them with correct double-null termination.

6. Filename Collision Avoidance
The original temp filename used `process_creation_time` as the timestamp, which is constant within a single cc-switch process. Same-provider launches therefore collided. An atomic 8-hex `LAUNCH_SEQ` counter is now inserted into the filename, producing unique paths for every launch while keeping the existing parser compatible.

Decisions
Testing

- 18 unit tests in `orphan_scan` covering filename parsing, sidecar logic, PID reuse detection, and atomic write behavior.
- QA script (`scripts/windows-start-qa.ps1`) covering M1–M5 scenarios: normal start, parent taskkill, orphaned file cleanup, Ctrl+C via .cmd shim, and nested Job Object fallback.
- `cargo test orphan_scan --lib`, `cargo test temp_launch --lib`, and `cargo test windows_smoke_test_spawn_job_wait_exit_code --lib` all pass on Windows.

Verification Notes
- … `#[cfg(windows)]` or in new Windows-only modules.