|
| 1 | +# Investigation: Device snapshot regressions |
| 2 | + |
| 3 | +## Summary |
| 4 | +The strongest current evidence points to an Apple tooling issue in the post-test device diagnostics path, not a deterministic regression in XcodeBuildMCP’s shared device workflow code. After a failing physical-device `xcodebuild test`, Xcode launches `devicectl diagnose`; when that runs on a TTY it prints `Password:` and blocks on interactive input, and when Xcode launches it from a background TTY process group the subprocess gets wedged. In non-TTY mode, the same path does not hang and instead exits with a concrete `CoreDeviceCLISupport.DiagnoseError` / `No provider was found` failure. |
| 5 | + |
| 6 | +## Symptoms |
| 7 | +- CLI `device build-and-run` success case returned `isError === true` in the snapshot suite. |
| 8 | +- CLI `device test` intentional failure snapshot was truncated after discovery/build-stage output and did not include the expected failure footer. |
| 9 | +- MCP `device test` intentional failure timed out after 300s in the suite. |
| 10 | + |
| 11 | +## Investigation Log |
| 12 | + |
| 13 | +### Initial assessment |
| 14 | +**Hypothesis:** A refactor changed shared device workflow behavior, likely in long-running command/result streaming or failure parsing. |
| 15 | +**Findings:** Targeted device test still passes, which suggests device discovery/build bootstrapping still works. Failures are concentrated in `build-and-run` and full-suite `test` failure handling. |
| 16 | +**Evidence:** `XcodeBuildMCP/src/snapshot-tests/suites/device-suite.ts:13-211` |
| 17 | +**Conclusion:** Needed broader context and direct reproductions. |
| 18 | + |
| 19 | +### Broad code-path review |
| 20 | +**Hypothesis:** Shared test/build code or transport handling regressed. |
| 21 | +**Findings:** The device suite and entrypoint are effectively unchanged from `main`; the meaningful snapshot-harness changes vs `main` are timeout increases and normalization/MCP envelope handling, not device workflow logic. |
| 22 | +**Evidence:** |
| 23 | +- `src/snapshot-tests/suites/device-suite.ts` is unchanged in substance versus `main`. |
| 24 | +- `src/snapshot-tests/__tests__/device.snapshot.test.ts` is unchanged versus `main`. |
| 25 | +- `src/snapshot-tests/harness.ts:8-36` increased CLI snapshot timeout from `120000` to `300000`. |
| 26 | +- `src/snapshot-tests/mcp-harness.ts:9-13,93-105` increased MCP timeout from `120000` to `300000` and now prefers `structuredEnvelope.didError`. |
| 27 | +- `src/snapshot-tests/normalize.ts` changes are normalization-only. |
| 28 | +**Conclusion:** The reported failures are not explained by a direct diff in the device suite itself. |
| 29 | + |
| 30 | +### CLI `build-and-run` direct reproduction |
| 31 | +**Hypothesis:** CLI is falsely classifying a successful build-and-run as an error because of render-session error latching. |
| 32 | +**Findings:** Isolated `device build-and-run` really failed on the first run. The exit code was `1`, and the output contained a real `devicectl` install failure: |
| 33 | +- `Unable to Install “Calculator”` |
| 34 | +- `ApplicationVerificationFailed` |
| 35 | +- `No code signature found` |
| 36 | +A second immediate rerun succeeded with exit code `0`. |
| 37 | +**Evidence:** |
| 38 | +- Direct run exit code capture: `/tmp/build-run.rc` contained `1`. |
| 39 | +- Direct run output: `/tmp/build-run.out` contained the `devicectl` install failure text. |
| 40 | +- Immediate rerun: `/tmp/build-run-2.rc` contained `0` and `/tmp/build-run-2.out` showed a complete success transcript. |
| 41 | +- The failed build log still ended with successful signing and `** BUILD SUCCEEDED **`: `/Users/cameroncooke/Library/Developer/XcodeBuildMCP/logs/build_run_device_2026-04-16T19-59-14-443Z_pid38742.log`. |
| 42 | +**Conclusion:** This is not primarily a false CLI `isError` classification bug. The combined flow hit a genuine but flaky device-side install failure. |
| 43 | + |
| 44 | +### Build artifact validation after failed `build-and-run` |
| 45 | +**Hypothesis:** The refactor produced an actually unsigned app artifact. |
| 46 | +**Findings:** The app at the resolved path was signed correctly, and a direct `device install` of that same app path succeeded. |
| 47 | +**Evidence:** |
| 48 | +- `device get-app-path` succeeded and pointed to `~/Library/Developer/XcodeBuildMCP/DerivedData/Build/Products/Debug-iphoneos/CalculatorApp.app`. |
| 49 | +- `xcrun codesign -dvvv` against that app showed a valid Apple Development signature, team identifier, `_CodeSignature`, and `embedded.mobileprovision`. |
| 50 | +- Direct install succeeded with exit code `0` and output `✅ App installed successfully.` in `/tmp/install-device.out`. |
| 51 | +**Conclusion:** The artifact itself was valid. The first `build-and-run` failure is best explained as transient `devicectl` / physical-device install flakiness, not a deterministic signing regression in the build step. |
| 52 | + |
| 53 | +### Fresh DerivedData isolation |
| 54 | +**Hypothesis:** Shared default DerivedData corruption is causing the `build-and-run` failure deterministically. |
| 55 | +**Findings:** Running `device build-and-run` with a fresh temporary `derivedDataPath` succeeded on the first try. |
| 56 | +**Evidence:** `/tmp/fresh-build-run.rc` contained `0`; `/tmp/fresh-build-run.out` contained a full success transcript. |
| 57 | +**Conclusion:** There is no evidence of a deterministic combined-flow failure tied solely to the current implementation. The failure is transient/stateful. |
| 58 | + |
| 59 | +### Direct device-test reproductions |
| 60 | +**Hypothesis:** `test-common.ts`, parser/finalization code, or CLI/MCP transport handling regressed for failing device tests. |
| 61 | +**Findings:** The failing full-device test works normally in isolation and in focused back-to-back runs. |
| 62 | +**Evidence:** |
| 63 | +- Direct CLI full failing test exited `1` and completed in about 10s with the full footer in `/tmp/device-test.out`. |
| 64 | +- Back-to-back direct CLI runs (targeted pass, then full fail) completed normally: `targeted rc=0 dur=8.7s`, `fullfail rc=1 dur=6.8s`. |
| 65 | +- Back-to-back snapshot MCP harness calls also completed normally: targeted `isError:false` in ~9.2s, full fail `isError:true` in ~6.1s. |
| 66 | +**Conclusion:** The underlying device-test code path is not deterministically broken. The suite-only failure requires a broader state/order interaction. |
| 67 | + |
| 68 | +### Post-test diagnostics reverse engineering |
| 69 | +**Hypothesis:** The hang happens after test execution, in Apple’s diagnostics collection path rather than in XcodeBuildMCP parsing/rendering. |
| 70 | +**Findings:** That hypothesis is confirmed. |
| 71 | +**Evidence:** |
| 72 | +- Raw PTY-backed `xcodebuild test` reproduced the exact symptom and captured `Password:` immediately after the final XCTest summary in `/tmp/pty-xcodebuild-test.typescript`. |
| 73 | +- During that stall, the live child process was: |
| 74 | + - `/Library/Developer/PrivateFrameworks/CoreDevice.framework/Versions/A/Resources/bin/devicectl diagnose --devices 00008140-000278A438E3C01C --no-finder --archive-destination ... --timeout 600` |
| 75 | +- Process state showed `devicectl` was running in its own process group on the same TTY, but **not** the foreground TTY process group: |
| 76 | + - `pgid=15787`, `tpgid=15723`, `state=T` |
| 77 | +- That means the subprocess was stopped after attempting terminal interaction from a background TTY process group. |
| 78 | +- Running the same `xcodebuild test` command **without** a TTY did not hang. It completed and printed Xcode’s own diagnostic failure text: |
| 79 | + - `Failure collecting diagnostics from devices` |
| 80 | + - `No provider was found` |
| 81 | + - `CoreDeviceCLISupport.DiagnoseError error 0` |
| 82 | +- Running `devicectl diagnose` directly under a PTY, with no `xcodebuild` involved, reproduced the same interactive path in `/tmp/pty-devicectl-diagnose.typescript`: |
| 83 | + - provisioning/provider error text |
| 84 | + - Apple privacy notice |
| 85 | + - `Password:` |
| 86 | + - sending a blank line produced `Sorry, try again.` and another `Password:` prompt |
| 87 | +- Running `devicectl diagnose` directly **without** a TTY did not prompt; it failed fast and wrote a partial bundle. |
| 88 | +**Conclusion:** The bad path is in Apple’s `devicectl diagnose` TTY behavior. `xcodebuild` makes it worse by launching that interactive subprocess from a background terminal process group, which wedges the run. |
| 89 | + |
| 90 | +## Root Cause |
| 91 | +The root cause of the hang is Apple’s post-test diagnostics flow for failing physical-device tests: |
| 92 | + |
| 93 | +1. After the failing test summary, `xcodebuild test` launches `devicectl diagnose` to collect device diagnostics. |
| 94 | +2. On a TTY, `devicectl diagnose` enters an interactive authorization path and prints `Password:`. |
| 95 | +3. When launched by `xcodebuild`, that subprocess is in a background TTY process group, so terminal input cannot be handled normally and the subprocess gets wedged. |
| 96 | +4. In non-TTY mode, the same path does not block on a password prompt; it exits with the real underlying diagnostics failure (`No provider was found`, `CoreDeviceCLISupport.DiagnoseError error 0`). |
| 97 | + |
| 98 | +This means the snapshot hang is not primarily caused by: |
| 99 | +- `src/utils/test-common.ts` |
| 100 | +- `src/utils/xcresult-test-failures.ts` |
| 101 | +- `src/utils/xcodebuild-event-parser.ts` |
| 102 | +- `src/utils/renderers/cli-text-renderer.ts` |
| 103 | +- `src/utils/command.ts` |
| 104 | +- `src/snapshot-tests/mcp-harness.ts` |
| 105 | + |
| 106 | +The earlier transient `build-and-run` install failure is real, but it is a separate flaky physical-device issue, not the cause of the `Password:` hang. |
| 107 | + |
| 108 | +## Eliminated hypotheses |
| 109 | +- **Deterministic CLI error-latch bug for `build-and-run`** — ruled out by the captured real install failure text and exit code `1`. |
| 110 | +- **Deterministic parser/renderer regression dropping the test footer** — ruled out by successful isolated CLI and MCP failing-test runs. |
| 111 | +- **Deterministic MCP transport deadlock** — ruled out by successful focused MCP harness pass→fail reproduction. |
| 112 | +- **Deterministic signing regression in the built app artifact** — ruled out by successful codesign inspection and direct `device install` of the same app path. |
| 113 | + |
| 114 | +## Recommendations |
| 115 | +1. Treat the `Password:` hang as an Apple `devicectl diagnose` / Xcode physical-device diagnostics problem, not as evidence of a deterministic XcodeBuildMCP refactor regression. |
| 116 | +2. For XcodeBuildMCP’s automated/device-test paths, prefer **non-interactive process mode** and preserve full stderr/stdout so Xcode’s explicit diagnostics failure is surfaced instead of a hung terminal prompt. |
| 117 | +3. Add timeout diagnostics around failing physical-device test runs that explicitly note whether the process appears to be stuck in post-test diagnostics collection. |
| 118 | +4. Keep the timeout at `120_000`; increasing it just makes this Apple diagnostics wedge slower to fail. |
| 119 | +5. Separately from the hang, keep the earlier `build-and-run` flake in mind as a real but distinct physical-device reliability issue. |
| 120 | + |
| 121 | +## Preventive Measures |
| 122 | +- Keep physical-device snapshot tests minimal and isolated. |
| 123 | +- Avoid chaining many mutating device operations in one snapshot file. |
| 124 | +- Capture direct per-step logs/artifacts for device tests so transient `devicectl` failures are visible without rerunning the whole file. |
| 125 | +- Be careful about interpreting longer timeouts as fixes; here they mainly make the suite slower when the device gets wedged. |
0 commit comments