fix: skip pasta port probe during snapshot restore #555

Merged
ejc3 merged 13 commits into main from fix-pasta-restore-probe
Mar 4, 2026

ejc3 (Owner) commented Mar 4, 2026

Summary

Fix pasta port forwarding reliability for snapshot-restored clones. After snapshot restore, the previous wait_for_port_forwarding() probe ran before the VM was loaded, poisoning pasta's state. Now we skip the probe during restore and instead verify forwarding after restore completes using an active ping + port probe.

Stacked on: main

Changes

  1. Skip pasta port probe during snapshot restore: start_pasta() accepts restore_mode flag; when true, skips wait_for_port_forwarding() since the guest doesn't exist yet
  2. Post-restore verification: verify_port_forwarding() pings guest from namespace to trigger ARP resolution, then probes each forwarded port to confirm end-to-end forwarding
  3. Active ping instead of passive ARP check: Uses ping -c 1 -W 0.2 to the guest instead of passive ip neigh show ARP table checks. With arp_accept=0 (Linux default), gratuitous ARP only updates existing entries and never creates new ones; the active ping forces a proper ARP exchange
  4. Hard error on verification failure: Changed from warn-only to .context()? so clones fail immediately if port forwarding can't be verified
  5. Propagate ping spawn errors: .context("running ping via nsenter in namespace")? instead of silently ignoring command failures
  6. Add iputils-ping to nested container: Containerfile.nested was missing the ping binary needed by the new verification
  7. Stress test: test_clone_port_forward_stress_rootless: 3 clones × 20 concurrent HTTP requests, checking for 0-byte responses (pasta poisoning pattern)
  8. Diagnostics for CI failures: On first curl error per clone: verbose curl -v, ss -tlnp (listening), ss -tnp (connections), pgrep -a pasta (process alive check)
  9. Test helpers: Extract get_loopback_ip() and curl_check() to common/mod.rs, concurrent clone spawning via JoinSet
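
A minimal sketch of the restore-mode gate described in items 1 and 2, assuming a simplified synchronous stand-in for PastaNetwork. The field layout and the should_probe_on_start() helper are hypothetical and exist only for illustration; the with_restore_mode() name and the skip condition follow the commit message and the reviewed diff.

```rust
/// Simplified stand-in for the PR's PastaNetwork; fields are illustrative.
#[derive(Debug)]
struct PastaNetwork {
    restore_mode: bool,
    /// (host_port, guest_port) pairs; hypothetical shape.
    port_mappings: Vec<(u16, u16)>,
}

impl PastaNetwork {
    fn new(port_mappings: Vec<(u16, u16)>) -> Self {
        Self { restore_mode: false, port_mappings }
    }

    /// Builder-style setter, mirroring the with_restore_mode() name from the PR.
    fn with_restore_mode(mut self, restore_mode: bool) -> Self {
        self.restore_mode = restore_mode;
        self
    }

    /// Hypothetical helper: true when the start-up path should probe
    /// forwarded ports (fresh boot with at least one published port).
    fn should_probe_on_start(&self) -> bool {
        !self.restore_mode && !self.port_mappings.is_empty()
    }
}

fn main() {
    let fresh = PastaNetwork::new(vec![(8080, 80)]);
    let restored = PastaNetwork::new(vec![(8080, 80)]).with_restore_mode(true);
    println!("fresh boot probes on start: {}", fresh.should_probe_on_start());
    println!("restore probes on start: {}", restored.should_probe_on_start());
}
```

In restore mode the start-up probe is skipped, so the post-restore verification becomes the only readiness gate.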

Test Results

Local stress test passes (60/60 OK, 0 errors):

$ sudo make test-root FILTER=clone_port_forward_stress STREAM=1
Clone pf-stress-clone-0: 20/20 OK, 0 zero-byte, 0 errors
Clone pf-stress-clone-1: 20/20 OK, 0 zero-byte, 0 errors
Clone pf-stress-clone-2: 20/20 OK, 0 zero-byte, 0 errors

ejc3 added 2 commits March 4, 2026 01:04
During snapshot restore, post_start() runs BEFORE the VM snapshot is
loaded into Firecracker. wait_for_port_forwarding() probes ports via
TcpStream::connect(), which pasta accepts (it's listening), but pasta
can't forward to the non-existent guest. This forces pasta to attempt
L2 forwarding that fails, potentially poisoning its internal connection
tracking state and causing subsequent data connections to return 0 bytes.

Add with_restore_mode() to PastaNetwork that skips the premature port
probe in post_start(). Port forwarding is verified later via
verify_port_forwarding() which runs after the VM is resumed and
fc-agent has sent its gratuitous ARP.

Stress test that spawns 3 clones from a snapshot with port forwarding
and makes 20 rapid HTTP requests to each clone. Catches the "connect
succeeded but 0 bytes" pattern seen in CI bench-vm failures.

Specifically detects pasta connection tracking poisoning where TCP
connect succeeds but no data flows through the forwarded port.

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ffa6facb6b

Comment on lines +628 to 629
if !self.restore_mode && !self.port_mappings.is_empty() {
    self.wait_for_port_forwarding().await?;

P1: Preserve hard failure when restore-mode probe is skipped

When restore_mode is enabled, this branch skips the only startup-time port-forward readiness check that previously failed snapshot run on broken forwarding. The later check in cmd_snapshot_run (network.verify_port_forwarding()) only logs a warning on error, so a clone can now come up “healthy” while all published ports are unusable; this is a behavior regression specific to rootless snapshot restores with --publish where pasta never reaches a working forwarding state.


claude-claude bot (Contributor) commented Mar 4, 2026

🔍 Claude Review

SEVERITY: none

Analysis

The core fix is correct and well-targeted. The ordering in common.rs confirms the problem:

  1. post_start() (line 990) — pasta starts, bridge created, port probe ran here
  2. load_snapshot() (line 1027) — VM memory/state loaded
  3. verify_port_forwarding() (line 1089 in snapshot.rs) — correct time for probe

Probing ports in step 1 means pasta tries L2 forwarding to a non-existent guest. The restore_mode flag correctly skips this, deferring to verify_port_forwarding() which runs after the VM is actually alive and fc-agent has sent its gratuitous ARP.

Scope check: Only PastaNetwork needs restore_mode. RoutedNetwork doesn't implement post_start() (inherits the no-op default from the NetworkManager trait), so it has no premature port probe to skip.

Test: Well-structured stress test with proper feature gating (integration-slow), follows existing test patterns, and targets the specific failure mode (0-byte responses from pasta connection tracking poisoning).

Findings

No issues found. The fix is minimal, well-documented in code comments, and addresses a real race condition in the snapshot restore ordering.

Summary

Clean PR — the root cause analysis is thorough, the fix is surgical (one boolean flag + one condition check), and the new stress test validates the specific failure pattern. LGTM.


Review by Claude | Run

…one spawning

- Extract get_loopback_ip() and curl_check()/CurlResult to tests/common
  (was duplicated 4x across test files)
- Replace inline loopback IP parsing in test_snapshot_clone.rs (2 instances)
- Spawn clones concurrently via JoinSet instead of sequential loop
- Send HTTP requests concurrently across all clones (true stress test)
- Remove 50ms sleep between requests and local CurlResult/curl_check

claude-claude bot (Contributor) commented Mar 4, 2026

🔍 Claude Review

SEVERITY: none

Analysis

Core fix is correct. The restore_mode flag in PastaNetwork correctly skips the premature wait_for_port_forwarding() probe in post_start() during snapshot restore. The ordering in snapshot.rs confirms:

  1. post_start() — pasta starts, bridge created (port probe skipped with restore_mode)
  2. load_snapshot() — VM memory/state loaded into Firecracker
  3. verify_port_forwarding() — correct: runs after VM is alive, ARP resolved, still calls wait_for_port_forwarding()

Refactoring is clean. The third commit extracts get_loopback_ip()/curl_check()/CurlResult into tests/common/mod.rs — previously duplicated across test files. The new get_loopback_ip() actually improves error handling: returns Result with a clear error instead of the old unwrap_or_default() that silently returned an empty string.
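
The error-handling difference noted above can be sketched as follows. The parsing logic here is an assumption (the PR does not show how the loopback IP is extracted), and parse_loopback_ip() is a hypothetical name; the point is only the contrast between a silent empty default and an explicit Result.

```rust
use std::net::Ipv4Addr;

/// Hypothetical parser: returns the first whitespace-separated token that is
/// a loopback IPv4 address, or a descriptive error if none is present.
fn parse_loopback_ip(output: &str) -> Result<Ipv4Addr, String> {
    output
        .split_whitespace()
        .filter_map(|tok| tok.parse::<Ipv4Addr>().ok())
        .find(|ip| ip.is_loopback())
        .ok_or_else(|| format!("no loopback IP found in {output:?}"))
}

fn main() {
    // Old style: a missing value silently becomes an empty string.
    let old: &str = "".split_whitespace().next().unwrap_or_default();
    println!("old style on empty input: {old:?}");

    // New style: the caller sees a clear error instead.
    println!("{:?}", parse_loopback_ip("loopback 127.0.0.5 port 8080"));
    println!("{:?}", parse_loopback_ip(""));
}
```

With the Result form, a test fails at the point where the IP is missing rather than later, when an empty string is passed to curl.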

Stress test is well-targeted. Feature-gated (integration-slow), concurrent clone spawning via JoinSet, and specifically checks for the 0-byte response pattern that indicates pasta connection tracking poisoning.

Findings

No issues found. As noted in previous review, the fix is minimal and well-documented. The additional refactoring commit (extracting common test helpers, concurrent spawning) is a good cleanup.

Summary

Clean PR. The fix is surgical (one boolean flag, one condition check), the test targets the specific failure mode, and the refactoring reduces code duplication. LGTM.


Review by Claude | Run

The caller in snapshot.rs wrapped verify_port_forwarding() in
if-let-Err that only logged a warning, allowing clones to come up
with broken port forwarding. Now that post_start() skips the port
probe in restore mode, verify_port_forwarding() is the only gate.
Propagate its error so clones fail fast on broken forwarding.

claude-claude bot (Contributor) commented Mar 4, 2026

🔍 Claude Review

SEVERITY: none

Analysis

Two previous reviews found no issues — the core fix and refactoring remain correct. This review focused on additional verification:

  1. Trait default safety: The verify_port_forwarding() promotion from warn-only to hard error (? propagation) is safe for non-pasta network types because the NetworkManager trait default returns Ok(()). Only PastaNetwork overrides it with actual verification logic.

  2. PR description vs commits: All 4 commits are accurately reflected in the PR description. No stale or missing items.

  3. Documentation: The changes are internal plumbing (restore ordering) — no user-facing CLI, flags, or behavior changes that require README/DESIGN.md updates.

  4. [LOW] Test box-drawing alignment: The stress test's decorative output boxes (lines 40, 42, 260, 263) have minor column misalignment — content lines are 1-4 characters narrower than the 63-column borders. Purely cosmetic in test output, does not affect test correctness.
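
The trait-default argument in item 1 can be illustrated with a small synchronous sketch. The real trait is presumably async and returns an anyhow-style result; the String error type, the guest_reachable field, and the restore() function below are assumptions made to keep the example self-contained.

```rust
/// Simplified stand-in for the NetworkManager trait: the default
/// verification is a no-op that always succeeds.
trait NetworkManager {
    fn verify_port_forwarding(&self) -> Result<(), String> {
        Ok(())
    }
}

/// Uses the default, so promoting the call site to `?` cannot fail here.
struct RoutedNetwork;
impl NetworkManager for RoutedNetwork {}

/// Overrides the default with a (stubbed) real check.
struct PastaNetwork {
    guest_reachable: bool, // hypothetical stand-in for the ping result
}

impl NetworkManager for PastaNetwork {
    fn verify_port_forwarding(&self) -> Result<(), String> {
        if self.guest_reachable {
            Ok(())
        } else {
            Err("guest not reachable".to_string())
        }
    }
}

/// Hypothetical caller: a verification error now aborts the restore
/// instead of being logged as a warning.
fn restore(network: &dyn NetworkManager) -> Result<&'static str, String> {
    network.verify_port_forwarding()?;
    Ok("clone ready")
}

fn main() {
    println!("{:?}", restore(&RoutedNetwork));
    println!("{:?}", restore(&PastaNetwork { guest_reachable: true }));
    println!("{:?}", restore(&PastaNetwork { guest_reachable: false }));
}
```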

Summary

Clean PR. The fix is correct, the error promotion is safe across all network backends, and the description matches the commits. LGTM.


Review by Claude | Run

…arding

With arp_accept=0 (Linux default), gratuitous ARP only updates existing
neighbor entries — never creates new ones. The old code checked `ip neigh
show` which would never find an entry because fc-agent's gratuitous arping
can't create one. Active ping forces a normal ARP exchange (request+reply)
that always creates a REACHABLE entry.

Changes:
- Replace `ip neigh show` with `ping -c 1 -W 0.2` to the guest
- Successful ping proves both ARP resolution and guest reachability
- 200ms ping timeout allows ~16 retries within 5s deadline
- Remove 10ms sleep (ping provides natural 200ms backoff)
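
The retry shape described in these changes can be sketched as a deadline-bounded loop in which each attempt supplies its own wait, as ping -W 0.2 does when the guest is silent. The retry_until() helper and its closure-based probe are hypothetical; the real code runs ping via nsenter and is not reproduced here.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: call `attempt` until it reports success or the
/// deadline passes. No explicit sleep: a failing attempt is assumed to
/// block for its own timeout, which acts as the backoff.
fn retry_until<F>(deadline: Duration, mut attempt: F) -> Result<u32, String>
where
    F: FnMut() -> bool,
{
    let start = Instant::now();
    let mut tries = 0;
    while start.elapsed() < deadline {
        tries += 1;
        if attempt() {
            return Ok(tries);
        }
    }
    Err(format!("not reachable within {deadline:?} after {tries} attempts"))
}

fn main() {
    // Simulated probe: blocks briefly per attempt and succeeds on the third.
    let mut calls = 0;
    let result = retry_until(Duration::from_secs(5), || {
        std::thread::sleep(Duration::from_millis(5));
        calls += 1;
        calls >= 3
    });
    println!("{result:?}");
}
```

This shape relies on every attempt actually blocking. If an attempt returns immediately, the loop spins until the deadline.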

Tested: make test-root FILTER=clone_port_forward_stress — 60/60 OK, 0 errors

claude-claude bot (Contributor) commented Mar 4, 2026

🔍 Claude Review

SEVERITY: medium

Findings

[MEDIUM] PR description missing newest commit: The description doesn't cover commit 139f095 (fix: use active ping instead of passive ARP check in verify_port_forwarding). This commit replaces the ip neigh show ARP table check with an active ping -c 1 -W 0.2 to the guest, which is a significant behavioral change. The description's "Root Cause" section still references the old ARP-checking approach.

[MEDIUM] Swallowed spawn errors in verify_port_forwarding() (pasta.rs:690-696): The new ping code uses if let Ok(ref out) = ping_result which silently discards command spawn failures (e.g., nsenter not found, permission denied). The old code used .context("checking ARP table in namespace")? which propagated spawn errors immediately. If the command fails to execute, the new code:

  1. Loops for 5 seconds with no sleep/backoff (busy loop — the old 10ms sleep and the ping's 200ms timeout only apply when the command actually runs)
  2. Then fails with a misleading "ARP for guest ... not resolved within 5s" error that hides the real problem

Fix: propagate the Err case from ping_result (or at least log + bail on spawn failure), similar to what the old code did.
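
A sketch of the difference between swallowing and propagating a spawn failure, using std::process::Command directly. The function names are hypothetical, and a plain String error stands in for the .context()? chain used in the PR.

```rust
use std::process::Command;

/// Swallowing variant: a spawn failure (binary missing, permission denied)
/// is indistinguishable from "command ran and reported failure".
fn run_swallowing(program: &str) -> bool {
    if let Ok(out) = Command::new(program).output() {
        out.status.success()
    } else {
        false
    }
}

/// Propagating variant: a spawn failure surfaces immediately with context,
/// while Ok(false) still means "ran, but did not succeed".
fn run_propagating(program: &str) -> Result<bool, String> {
    let out = Command::new(program)
        .output()
        .map_err(|e| format!("running {program}: {e}"))?;
    Ok(out.status.success())
}

fn main() {
    // Assumed not to exist on PATH; chosen only to force a spawn error.
    let missing = "fcvm-sketch-no-such-binary";
    println!("swallowing:  {}", run_swallowing(missing));
    println!("propagating: {:?}", run_propagating(missing));
}
```

Inside a retry loop, the propagating form exits on the first spawn error instead of spinning until the deadline and reporting an unrelated timeout.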

[LOW] Stale comment in snapshot.rs:1085-1088: Says "probes each forwarded port to trigger and verify ARP resolution" but after the ping change, ARP is triggered by ping, and port probing only verifies forwarding. Minor inaccuracy.

Summary

The core fix (skip port probe during restore, use ping for ARP) is correct. The main issue is that command spawn errors in the new ping loop are silently swallowed, which would cause a busy-loop and misleading error message if nsenter/ping is unavailable.


Review by Claude | Run

claude-claude bot (Contributor) commented Mar 4, 2026

🔧 Auto-Fix Created

I found issues and created a fix PR: #556

Please review and merge the fix PR first, then this PR.

View Claude Run

The new ping-based ARP check used `if let Ok(...)` which silently
swallowed command spawn failures (nsenter/ping not found, permission
denied). This could cause a 5-second busy loop with a misleading
"ARP not resolved" error. Propagate spawn errors immediately with
`.context()?` to match the old code's behavior.

Also update stale comment in snapshot.rs to reflect the ping-based
ARP resolution approach.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

claude-claude bot (Contributor) commented Mar 4, 2026

🔍 Claude Review

SEVERITY: medium

Findings

[MEDIUM] PR description still missing two commits — As noted in previous review, the description doesn't cover commit 139f095 (use active ping instead of passive ARP check) or commit 3a8fbd1 (propagate ping spawn errors). The "Root Cause" section still references the old ARP-checking approach. The description should be updated to reflect the full set of changes, especially the ping-based ARP resolution which is a significant behavioral change.

Previously reported issues — now fixed:

  • ✅ Swallowed spawn errors in verify_port_forwarding(): fixed by commit 3a8fbd1 (merged from PR #556, "fix: propagate ping spawn errors in verify_port_forwarding"). Spawn errors now properly propagate via .context("running ping via nsenter in namespace")?.
  • ✅ Stale comment in snapshot.rs: fixed by commit 3a8fbd1. Comment now accurately describes the ping → port probe two-step verification.

Summary

The code changes are correct and complete. The only remaining issue is the PR description being out of date with the last two commits — the description should mention the switch from passive ip neigh show ARP checking to active ping and the spawn error propagation fix.


Review by Claude | Run

- Fix curl_check to use -sS (show errors) instead of -s (silent)
- Add pre-storm verification curl for each clone before stress test
- On first curl error per clone, dump ss -tlnp and verbose curl
  to capture the actual error (connection refused, timeout, etc.)

claude-claude bot (Contributor) commented Mar 4, 2026

🔍 Claude Review

SEVERITY: medium

Findings

[MEDIUM] PR description still out of date with commits — As noted in previous review, the description doesn't cover commits 139f095 (use active ping instead of passive ARP check), 3a8fbd1 (propagate ping spawn errors), or the newest 651e73e (add diagnostics for clone port forward stress failures). The "Root Cause" section still references the old ARP-checking approach, and the "Changes" section only lists the original 5 bullets. Three commits' worth of behavioral changes (ping-based ARP resolution, spawn error propagation, pre-storm verification + diagnostic dumps) are undocumented in the PR description.

Summary

The code changes are all correct — the newest commit (651e73e) adds useful diagnostics (pre-storm verification, ss/verbose-curl dumps on first error, -sS for curl stderr). The only remaining issue is the PR description being stale — it should be updated to reflect all 7 commits before merge.


Review by Claude | Run

- Add iputils-ping to Containerfile.nested so ping-based ARP probe
  works in container CI (was missing, causing all container clones to
  fail with "ARP not resolved within 5s")
- Move loop comment above the loop in verify_port_forwarding
- Trim snapshot.rs comment to match actual behavior

claude-claude bot (Contributor) commented Mar 4, 2026

🔍 Claude Review

SEVERITY: medium

Findings

[MEDIUM] PR description still out of date: As noted in previous review, the description covers only the original 5 commits. It is now missing 4 additional commits: 139f095 (ping-based ARP), 3a8fbd1 (spawn error propagation), 651e73e (stress test diagnostics), and the newest 874f891 (add iputils-ping to nested container, simplify comments). The "Root Cause" section still references the old ip neigh show ARP-checking approach. Should be updated before merge.

Newest commit (874f891) is correct:

  • Adding iputils-ping to Containerfile.nested is necessary since verify_port_forwarding() now runs ping via nsenter — without this package, ping would be missing in nested container tests.
  • Comment simplifications in snapshot.rs and pasta.rs are clean and accurate.

Summary

No new code issues. The only remaining finding is the stale PR description (previously reported). All code changes across all 8 commits are correct and well-structured.


Review by Claude | Run

claude-claude bot (Contributor) commented Mar 4, 2026

🔍 Claude Review

SEVERITY: medium

Findings

[MEDIUM] PR description still out of date — As noted in previous review, the description covers only the original 5 commits but there are now 9 commits. The newest commit 59cce3f (style: fix cargo fmt formatting) is trivial, but the description still omits the 4 substantive commits added earlier: ping-based ARP (139f095), spawn error propagation (3a8fbd1), stress test diagnostics (651e73e), and iputils-ping + comment cleanup (874f891). The "Root Cause" section's step 5 still says "ARP check" but the code now uses active ping. Should be updated before merge.

No new code issues. The newest commit is purely cargo fmt formatting fixes — no behavioral changes.

Summary

All code is correct. The only remaining issue is the stale PR description (reported in 4 previous reviews). No fix PR needed — this requires an author update to the PR description.


Review by Claude | Run

Add more diagnostic dumps on first curl error per clone:
- pgrep -a pasta: check if pasta is alive, find matching process
- ss -tnp: show active connections (not just listening sockets)
- All pasta processes listed for cross-clone comparison

claude-claude bot (Contributor) commented Mar 4, 2026

🔍 Claude Review

SEVERITY: none

Analysis

Reviewed the newest commit 17b796b (add pasta process and connection diagnostics for stress failures) on top of the 9 previously reviewed commits. This commit adds two more diagnostic checks to the stress test's first-error handler:

  • pgrep -a pasta: verifies whether the pasta process is still alive, filtering for the clone's IP and port
  • ss -tnp: shows established connections (not just listening sockets) to the clone's IP

The additions are well-structured — numbered comments (1-4) make the diagnostic sequence clear, and each diagnostic silently skips on failure (if let Ok) which is correct for optional debugging output.

Previously reported issues — all resolved:

  • ✅ Swallowed spawn errors in verify_port_forwarding() — fixed in commit 3a8fbd1
  • ✅ PR description now covers all 9 substantive commits (items 1-9 in description match the commit history)

No new issues found. The code changes are correct, the test diagnostics are useful for CI debugging, and the PR description is now in sync with the commits.

Summary

Clean PR. The newest commit adds useful diagnostic output for debugging CI failures — no code correctness issues. All previously reported issues have been resolved. LGTM.


Review by Claude | Run

claude-claude bot (Contributor) commented Mar 4, 2026

🔧 CI Auto-Fix

Created fix PR: #557

Root cause: iputils-ping was added to Containerfile.nested (commit 874f891) but not to the main Containerfile used for Container-x64/arm64 CI tests. The rust:1.83-bookworm base image doesn't include ping, so verify_port_forwarding() always timed out in container environments.

Note: Host-Root-arm64 test_clone_port_forward_stress_rootless failures ("Connection reset by peer") are the pre-existing pasta poisoning issue, not related to this fix.

View Claude Run

- Add curl_check_retry() to tests/common: retries HTTP with 500ms
  backoff and dumps network diagnostics on failure (ARP cache,
  namespace sockets, bridge links, VM listening sockets, curl -v)
- Extract dump_clone_network_diagnostics() as reusable test helper
- Capture ping stderr in verify_port_forwarding for better error
  messages when ARP resolution fails
- Use curl_check_retry in clone port forward tests (rootless, routed,
  stress) instead of single-shot curl
- Revert wait_for_port_forwarding to TCP-connect-only (appropriate
  for initial boot when guest app isn't ready yet)

claude-claude bot (Contributor) commented Mar 4, 2026

🔍 Claude Review

SEVERITY: none

Analysis

Reviewed the newest commit 829d04e (add curl retry with diagnostics, capture ping stderr) on top of the 10 previously reviewed commits.

Newest commit changes are correct:

  1. curl_check_retry() — well-structured retry loop with 500ms backoff and comprehensive diagnostics on failure (verbose curl, ARP cache, namespace sockets, bridge links, VM listening sockets). Correctly uses curl_check() with 1s per-attempt timeout inside the overall deadline.
  2. dump_clone_network_diagnostics() — good extraction as reusable helper, properly handles failures silently (appropriate for diagnostic output).
  3. Ping stderr capture (pasta.rs:694) — captures stderr on timeout for better error messages. Correct: piped stderr + String::from_utf8_lossy with "(empty)" fallback.
  4. Test migration to curl_check_retry: test_clone_port_forward_rootless, test_clone_port_forward_routed, and stress test pre-storm verification all correctly switched from single-shot curl to retry with 10s deadline. This addresses the race between L2 readiness and guest application readiness.
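
The stderr handling mentioned in item 3 might look roughly like the sketch below. The stderr_or_empty() name is hypothetical, and trimming whitespace is an assumption; only the String::from_utf8_lossy call and the "(empty)" fallback come from the review text.

```rust
/// Hypothetical helper: turn captured stderr bytes into a message fragment,
/// falling back to "(empty)" so the final error never ends in a blank.
fn stderr_or_empty(stderr: &[u8]) -> String {
    // from_utf8_lossy replaces invalid UTF-8 with U+FFFD instead of failing.
    let text = String::from_utf8_lossy(stderr);
    let trimmed = text.trim();
    if trimmed.is_empty() {
        "(empty)".to_string()
    } else {
        trimmed.to_string()
    }
}

fn main() {
    println!("ping stderr: {}", stderr_or_empty(b""));
    println!(
        "ping stderr: {}",
        stderr_or_empty(b"ping: connect: Network is unreachable\n")
    );
}
```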

PR description vs commits: The description now covers items 1-9, which maps to all 11 commits. The newest commit (829d04e) is adequately covered by existing items 8 (diagnostics) and 9 (test helpers), though curl_check_retry() specifically isn't named. Minor — no action needed.

Previously reported issues — all resolved.

Summary

Clean PR. All 11 commits are correct. The newest additions (curl retry with diagnostics, ping stderr capture) improve test reliability and debugging. LGTM.


Review by Claude | Run

verify_port_forwarding uses nsenter + ping to check ARP resolution
after snapshot restore. In container tests, nsenter runs in the
container's mount namespace, so ping must be installed there.

Was added to Containerfile.nested but not Containerfile, causing:
  nsenter: failed to execute ping: No such file or directory

claude-claude bot (Contributor) commented Mar 4, 2026

🔍 Claude Review

SEVERITY: none

Analysis

Reviewed the newest commit 2113b9d (add iputils-ping to Containerfile for container tests) on top of the 11 previously reviewed commits.

Newest commit is correct and necessary. The main Containerfile (used for Container-x64/arm64 CI tests) was missing the iputils-ping package that verify_port_forwarding() needs when running nsenter + ping inside the container's mount namespace. This was the root cause of the CI failures in the container test environments — nsenter: failed to execute ping: No such file or directory.

PR description vs commits: Item 6 says "Add iputils-ping to nested container" but the newest commit adds it to the main Containerfile as well. Very minor description gap — doesn't warrant a [MEDIUM] since the change is self-explanatory from the commit message.

All previously reported issues are resolved:

  • ✅ Swallowed spawn errors — fixed in 3a8fbd1
  • ✅ PR description coverage — updated to cover all substantive commits
  • ✅ Missing ping in container envs — fixed by this commit

Summary

Clean PR. The newest commit completes the iputils-ping dependency addition across all container environments. All 12 commits are correct and the fix is ready. LGTM.


Review by Claude | Run

claude-claude bot (Contributor) commented Mar 4, 2026

🔧 CI Auto-Fix

Created fix PR: #558

Problem: verify_port_forwarding() called wait_for_port_forwarding() after the ping-based ARP verification, which did a bare TcpStream::connect() + immediate drop through pasta's L2 forwarding. This poisoned pasta's connection tracking table, causing subsequent HTTP connections to get 0-byte responses — failing the clone_http/rootless/nginx benchmark on both x64 and arm64.

Fix: Removed the TCP port probe from verify_port_forwarding(). The ping already confirms L2/L3 reachability; pasta is listening on the loopback ports. No TCP probe needed.

View Claude Run

- Dump full clone log instead of last 30 lines on clone_http failure
- Add namespace diagnostics: ARP cache, namespace sockets, bridge links
  via nsenter using holder PID from fcvm ls
- Add VM listening sockets via fcvm exec ss -tnl
- Dump full serve log instead of last 30 lines on clone_exec failure

These diagnostics match what tests/common/mod.rs already captures,
giving visibility into pasta splice state when bench fails.

claude-claude bot (Contributor) commented Mar 4, 2026

🔍 Claude Review

SEVERITY: none

Analysis

Reviewed all 13 commits on this PR. The core fix and all supporting changes are correct.

Core fix verification: The restore_mode flag correctly skips wait_for_port_forwarding() in post_start() during snapshot restore, when the guest doesn't exist yet. The deferred verify_port_forwarding() in snapshot.rs (line 1089) correctly runs after VM resume and ARP resolution.

PR description vs commits: All 9 description items map to the 13 commits. Accurate and complete.

Documentation check: No doc updates needed — changes are internal plumbing (restore ordering, ARP resolution method, test helpers). No user-facing CLI, flags, or behavior changes.

Code correctness:

  • PastaNetwork::with_restore_mode() builder pattern is clean and well-documented
  • Error promotion from warn-only to .context()? is safe (trait default returns Ok(()))
  • Ping-based ARP resolution with -W 0.2 timeout gives ~16 retries within 5s deadline
  • Spawn errors properly propagated via .context("running ping via nsenter in namespace")?
  • iputils-ping added to both Containerfile and Containerfile.nested
  • Test helpers (get_loopback_ip, curl_check, curl_check_retry, dump_clone_network_diagnostics) are well-extracted with proper error handling
  • Bench diagnostics in exec.rs correctly use sync std::process::Command (benches are sync context)
  • Stress test is properly feature-gated (integration-slow), uses concurrent spawning via JoinSet

Note on PR #558: That open PR proposes removing wait_for_port_forwarding() from verify_port_forwarding(), claiming the TCP probe poisons pasta. However, the same wait_for_port_forwarding() runs in post_start() for every non-restore VM (line 628) when no guest exists at all, and those VMs work fine. This suggests the TCP probe in verify_port_forwarding() (where the guest IS alive) is not the issue — the original restore_mode fix was the correct solution.

All previously reported issues are resolved.

Summary

Clean PR. The fix is correct and well-targeted, the refactoring reduces duplication, and the stress test validates the specific failure pattern. LGTM.


Review by Claude | Run

ejc3 merged commit 601bfec into main Mar 4, 2026
18 checks passed
ejc3 deleted the fix-pasta-restore-probe branch March 4, 2026 16:28