fix(docker): resolve docker by absolute path on macOS spawn; honest docker_available (#696)#697
Merged
Merged
Conversation
…ocker_available (#696) On macOS, Docker Desktop installed the default way (without the optional, admin-gated "install CLI tools" step) leaves the docker CLI only inside the app bundle at /Applications/Docker.app/Contents/Resources/bin/docker, which is not on any standard PATH dir. mcpproxy spawned isolated servers with bare `docker` on a sanitized PATH that omits the bundle dir, so docker was unresolvable at spawn (~20x `command not found: docker`), while `upstream_servers list` misreported `docker_available: true`. - Spawn (root cause): setupDockerIsolation resolves docker to an absolute path via shellwrap.ResolveDockerPath (mirroring newDockerCmd) before shell-wrapping; falls back to bare "docker" only when resolution fails. - Enhanced spawn PATH (defense in depth): add the Docker Desktop bundle bin dir on darwin; split the candidate list into a pure, unit-testable function. - Honest availability: MCPProxyServer.resolveDockerStatus and Runtime.IsDockerAvailable report available only when the CLI is resolvable AND `docker info` succeeds — no bare-docker fallback that would misreport. Surface the resolved binary as docker_status.docker_path; diagnostics.checkDockerDaemon reports unavailable with an actionable error when unresolvable. - Diagnostics: formatRecentStderr leads a single actionable "Docker CLI not found" hint on `command not found` and collapses identical consecutive lines into "… (repeated N×)", replacing the 20-line wall. macOS-specific paths guarded behind runtime.GOOS=="darwin"; Linux/Windows unaffected. Docs updated for docker_path, telemetry counter semantics, and a new troubleshooting entry. Related #696
Deploying mcpproxy-docs with
|
| Latest commit: |
6fa8ffa
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://ba4c1d5a.mcpproxy-docs.pages.dev |
| Branch Preview URL: | https://fix-mcp-2715-docker-spawn-pa.mcpproxy-docs.pages.dev |
|
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
…n test
insertCidfileIntoShellDockerCommand checked shellArgs[n-2]=="-c" to
distinguish a shell-wrapped command string from an unrecognised format.
On Windows with cmd.exe (no bash), WrapWithUserShell returns ["/c", cmdStr]
so the flag is "/c", not "-c"; the guard rejected it and silently skipped
cidfile insertion, breaking TestSetupDockerIsolationCidfileInsertionWithAbsolutePath
on windows-latest CI.
Accept both flags ("-c" and "/c") so cidfile insertion works on every
platform. Add TestInsertCidfileWindowsCmdFormat (pure unit, no real Docker)
to pin the Windows cmd format.
📦 Build ArtifactsWorkflow Run: View Run Available Artifacts
How to DownloadOption 1: GitHub Web UI (easiest)
Option 2: GitHub CLI gh run download 27625094138 --repo smart-mcp-proxy/mcpproxy-go
|
Dumbris
added a commit
that referenced
this pull request
Jun 17, 2026
…rPath (#696) Minimal #696 fix per board decision (MCP-2744): drop the direct-exec spawn migration and all DOCKER_* env experiments; keep only the negative-cache fix. Root cause of 'status shows docker_path but spawn used bare docker': under a LaunchAgent/sandbox the first resolution can fail because only the restricted login-shell leg failed, caching a negative for dockerPathNegativeTTL=60s. Early upstream spawns hit that window and fall back to bare 'docker'. The well-known -path probe is a pure os.Stat (never sandbox- or login-shell-restricted), so a live cached negative must not shadow a docker binary present at a well-known path right now. ResolveDockerPath now re-runs the cheap os.Stat probe before honoring a still-live negative and, on success, upgrades the cache to a permanent success. With the cache fixed, the existing post-#697 shell wrap runs the resolved ABSOLUTE docker path as the command token, so login-shell PATH is irrelevant and #696 is fixed with no Colima regression and no env handling change. The direct-exec migration + login-shell env hydration are deferred to a separate systematic task. TDD: TestResolveDockerPath_StatProbeOverridesLiveNegative asserts a successful os.Stat well-known probe wins over a still-live cached negative (no TTL expiry). Related #696 Co-Authored-By: Paperclip <noreply@paperclip.ing>
Dumbris
added a commit
that referenced
this pull request
Jun 17, 2026
…rPath (#696) (#699) Minimal #696 fix per board decision (MCP-2744): drop the direct-exec spawn migration and all DOCKER_* env experiments; keep only the negative-cache fix. Root cause of 'status shows docker_path but spawn used bare docker': under a LaunchAgent/sandbox the first resolution can fail because only the restricted login-shell leg failed, caching a negative for dockerPathNegativeTTL=60s. Early upstream spawns hit that window and fall back to bare 'docker'. The well-known -path probe is a pure os.Stat (never sandbox- or login-shell-restricted), so a live cached negative must not shadow a docker binary present at a well-known path right now. ResolveDockerPath now re-runs the cheap os.Stat probe before honoring a still-live negative and, on success, upgrades the cache to a permanent success. With the cache fixed, the existing post-#697 shell wrap runs the resolved ABSOLUTE docker path as the command token, so login-shell PATH is irrelevant and #696 is fixed with no Colima regression and no env handling change. The direct-exec migration + login-shell env hydration are deferred to a separate systematic task. TDD: TestResolveDockerPath_StatProbeOverridesLiveNegative asserts a successful os.Stat well-known probe wins over a still-live cached negative (no TTL expiry). Related #696 Co-authored-by: Paperclip <noreply@paperclip.ing>
This was referenced Jun 17, 2026
Dumbris
added a commit
that referenced
this pull request
Jun 17, 2026
…aunchd) — MCP-2751 (#701) * feat(shellwrap): hydrate login-shell env once at startup (macOS GUI/launchd) When mcpproxy is launched from a macOS GUI/launchd context (Launchpad, the SMAppService login item, or the tray spawning the core), the Go core inherits a launchd-minimal environment: it never sources the user's login shell, so it lacks Homebrew/Docker PATH entries and exported vars like DOCKER_HOST. This is the root-cause class behind the recent docker/PATH point-fixes (#697, #699). Add a one-time, allow-listed login-shell environment hydration at startup so every spawn path (docker lifecycle, stdio servers, uvx/npx, ResolveDockerPath, secureenv.BuildSecureEnvironment) inherits a correct environment with no call-site changes: - internal/shellwrap/hydrate.go: captureLoginShellEnv() sources `$SHELL -l -c 'env -0'` once (NUL-delimited, marker-bracketed, 5s timeout, sync.Once) and caches the parsed env. LoginShellPATH now reads PATH from this shared capture — one shell fork, not two. - HydrateFromLoginShell() gates on macOS + launchd-minimal PATH (no-op on terminal launches and non-macOS), merges PATH login-first, and applies a curated allow-list (DOCKER_*, HTTP(S)/NO_PROXY, NVM_DIR/ASDF_DIR/PYENV_ROOT/ VOLTA_HOME/HOMEBREW_PREFIX/COLIMA_HOME) set-if-empty via os.Setenv. Secrets are never imported; HOME/USER/SHELL are never touched; values are never logged (key names + lengths only). - secureenv: extend the default allow-list with the curated DOCKER_*/proxy/ tool-home keys so the hydrated vars survive the filter into upstream spawns. Scanner MinimalEnv stays narrow (credential-free). - cmd/mcpproxy: call HydrateFromLoginShell after logger setup, before any manager reads os.Environ(). Supersedes the per-spawn DOCKER_* env capture; the absolute-path docker probe (#697) and negative-cache fix (#699) remain as complementary fallbacks. Related #699 * fix(shellwrap): preserve intentionally-empty operator vars during hydration Codex review (PR #701, P2): the curated-key set-if-empty guard used `os.Getenv(key) != ""`, which treats an explicitly set-empty value the same as unset. An operator who sets `HTTPS_PROXY=` or `DOCKER_HOST=` to DISABLE an inherited value would have it overwritten from the login shell — violating the never-clobber-operator-values contract. Use os.LookupEnv to distinguish unset from intentionally-empty: any present key (even empty) is operator intent and is preserved. launchd never sets these to empty, so present-but-empty is always deliberate. Adds TestHydrate_PreservesIntentionallyEmptyVar. Co-Authored-By: Paperclip <noreply@paperclip.ing> * fix(shellwrap): alias-aware set-if-unset for proxy var spellings Codex round-2 (PR #701, P2): proxy vars (HTTP_PROXY/HTTPS_PROXY/NO_PROXY) are case-insensitive to HTTP clients, so the per-key LookupEnv guard could still clobber operator intent — e.g. operator sets `https_proxy=` (lowercase, to disable) but the login shell exports `HTTPS_PROXY=...`; the exact-key check on the uppercase spelling missed the lowercase presence and imported it anyway. Split the curated allow-list into single-spelling keys (DOCKER_*, tool-home) and an alias-aware proxy trio: for each logical proxy var, if EITHER spelling is already present in the process env (including intentionally empty), skip hydrating BOTH spellings. Adds TestHydrate_ProxyCaseAliasingPreservesOperatorIntent (daemon https_proxy='' + login HTTPS_PROXY=http://x → neither imported). Co-Authored-By: Paperclip <noreply@paperclip.ing> * fix(shellwrap): drop proxy vars, fix launch gate for pre-seeded PATH Codex round-3 scope reduction: - Remove HTTP_PROXY/HTTPS_PROXY/NO_PROXY from hydration and from secureenv DefaultEnvConfig allow-list. Proxy forwarding to every stdio upstream is out of scope; filed as a separate opt-in follow-up. - Fix the launch gate: hydration now triggers when PATH looks launchd- minimal OR any DOCKER_* curated var is absent. A GUI launcher can pre-seed PATH via /etc/paths yet still not export DOCKER_HOST from rc files, so PATH-comprehensiveness alone is insufficient. - PATH-merge stays gated on pathIsMinimal only — a comprehensive PATH is not widened even when the DOCKER_* gate fires. Tests: - Rename GateNoOpOnComprehensivePath → GateNoOpOnComprehensivePathAndAllDockerPresent (now requires both conditions, so all 5 DOCKER_* vars are pre-set) - Add ComprehensivePathStillHydratesDockerIfMissing — verifies the new secondary gate: comprehensive PATH, missing DOCKER_HOST → hydrates curated vars but leaves PATH unmodified - Drop PreservesIntentionallyEmptyVar and ProxyCaseAliasingPreservesOperatorIntent (proxy vars are no longer hydrated) - Update MinimalPathHydratesCuratedSet to assert proxy vars are NOT imported Co-Authored-By: Paperclip <noreply@paperclip.ing> --------- Co-authored-by: Paperclip <noreply@paperclip.ing>
Dumbris
added a commit
that referenced
this pull request
Jun 17, 2026
…CP-2753) (#703) * fix(docker): direct-exec resolved docker binary for isolated spawn (MCP-2753) PR #697 resolved the docker CLI to an absolute path but still routed the isolated spawn through `$SHELL -l -c "<docker> run …"`, where the login shell re-derives $PATH from rc files and can drop the Docker Desktop bundle dir. The absolute path was only a token in the -c string, so docker-isolated servers still failed with `command not found: docker` when the CLI was not on the LaunchAgent/tray spawn PATH — even though `upstream_servers list` reported a good docker_path. On successful resolution to a VERIFIED absolute executable, setupDockerIsolation now returns the resolved binary as the exec command with raw docker argv (no shell wrap), mirroring the management path's newDockerCmd. It returns a shellWrapped flag so callers pick the matching cidfile helper: the new args-based insertCidfileIntoDockerArgs inserts `--cidfile` after the `run` token (platform-agnostic, sidestepping the -c vs /c quirk), while the login-shell fallback keeps the string-based insertCidfileIntoShellDockerCommand. The direct-exec branch is gated by isDirectExecutable (filepath.IsAbs + os.Stat + exec-bit). ResolveDockerPath's last resort runs `command -v docker` in the login shell, which can emit a shell function/alias/bare name; such a value is rejected and shell-wrapped instead of failing a direct exec. Login-shell wrap of bare `docker` is also kept on the resolution-failure fallback. The DOCKER_* daemon-env half of MCP-2753 is already satisfied by the startup login-shell hydration shipped in MCP-2751 (#701): DOCKER_* are hydrated into os.Environ() and carried through BuildSecureEnvironment's default allow-list, so the direct-exec'd docker process still reaches Colima/rootless/remote daemons without a per-spawn shell capture. Related #699 Related #696 * fix(docker): gate direct-exec on guaranteed daemon env (non-Darwin rootless regression) Codex round-4 (PR #703 REQUEST_CHANGES, P2): the direct-exec branch regressed non-Darwin rootless/remote Docker. Dropping the `$SHELL -l` wrap also drops rc-file env inheritance, and HydrateFromLoginShell (MCP-2751) is Darwin-only while BuildSecureEnvironment only forwards DOCKER_* already in os.Environ(). So on Linux a daemon whose DOCKER_HOST/DOCKER_CONTEXT lives only in .profile would be lost by direct-exec — the same DOCKER_* loss that made #699 keep the shell wrap. Add a second precondition (dockerDaemonEnvGuaranteed) alongside the verified- absolute-executable check: direct-exec only when the daemon env is guaranteed without the login shell — macOS (startup hydration captured DOCKER_* into os.Environ()) OR DOCKER_HOST/DOCKER_CONTEXT already present in the process env on any platform. Otherwise keep the `$SHELL -l` wrap so `docker run` inherits the rc-file daemon config; the wrap now prefers the resolved absolute path (still sidestepping the login shell's PATH re-derivation) and drops to bare "docker" only when resolution failed. dockerDaemonEnvGOOS is a package var so the Darwin branch is testable on a non-Darwin CI host. New tests: non-Darwin + DOCKER_HOST-only-in-rc stays shell-wrapped with the absolute path; non-Darwin + DOCKER_HOST in env direct- execs; and a table test for the gate. Related #703 Related #699 Co-Authored-By: Paperclip <noreply@paperclip.ing> --------- Co-authored-by: Paperclip <noreply@paperclip.ing>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem (GitHub #696)
On macOS with Docker Desktop installed the default way (without the optional, admin-gated "install CLI tools" step), the
dockerCLI lives only at/Applications/Docker.app/Contents/Resources/bin/docker— not on any standardPATHdir. mcpproxy spawned Docker-isolated upstream servers with baredockeron a sanitized PATH that omits the bundle dir, sodockerwas unresolvable at spawn:…while
upstream_servers listmisleadingly reporteddocker_status.available: true/ per-serverdocker_available: true.Fixes (TDD — failing test first for each)
internal/upstream/core/connection_docker.go:setupDockerIsolationresolvesdockerviashellwrap.ResolveDockerPath(mirroringnewDockerCmd) and spawns the absolute path before shell-wrapping; falls back to baredockeronly when resolution fails. Indirected via a package var so tests stub resolution with no real Docker install.internal/secureenv/manager.go: add/Applications/Docker.app/Contents/Resources/binon darwin (existence-filtered). Candidate list split into a pure, unit-testableunixCandidatePaths().docker_available—internal/server/mcp.go(resolveDockerStatus) +internal/runtime/runtime.go(IsDockerAvailable) +internal/management/diagnostics.go: report available only when the CLI is resolvable anddocker infosucceeds — no bare-dockerfallback that misreports. Newdocker_status.docker_pathexposes the resolved binary;available:falsewithdocker_path:""means unresolvable.internal/upstream/core/monitoring.go:formatRecentStderrleads a single actionable "Docker CLI not found" hint oncommand not found(zsh + bash forms) and collapses identical consecutive lines into… (repeated N×), replacing the 20-line wall.Cross-platform
macOS-specific paths guarded behind
runtime.GOOS=="darwin"; Linux/Windows unaffected. Shell-script fake-docker tests skip on Windows.Docs (ENG-9)
docs/docker-isolation.md: newdocker_pathfield, corrected telemetry-counter semantics, and acommand not found: dockertroubleshooting entry.Verification
go build ./cmd/mcpproxy(personal) andgo build -tags server ./cmd/mcpproxy— both pass.internal/secureenv(bundle-path inclusion),internal/upstream/core(absolute-path spawn, cidfile insertion, bare fallback, hint extraction, dedup, wall→hint),internal/server(unavailable/working/daemon-down).golangci-lint(CI v2 config) — 0 issues.-raceover touched packages running in CI.Out of scope
MaxConnectionRetriesunchanged.Related #696