fix: run SSE notification keepalive on a dedicated thread (survives GameThread stalls)#491
fix: run SSE notification keepalive on a dedicated thread (survives GameThread stalls)#491vladSirin wants to merge 1 commit into
Conversation
…ameThread stalls) The notification-stream SSE keepalive was driven by CleanupStaleRequests, which runs on the subsystem ticker (the GameThread). When the editor GameThread stalls longer than KeepaliveIntervalSeconds (recompile, PIE enter/exit, modal dialog, blocking asset import), no keepalive is sent, the client's SSE stream idles out, and the MCP session is re-initialized -- visible as repeated reconnects with no editor restart. Move the keepalive onto its own thread: - RunKeepaliveLoop() launched in Start() via Async(EAsyncExecution::Thread), stored as TFuture<void> KeepaliveLoopFuture. It waits on the existing StopEvent (5s tick) and calls SweepNotificationKeepalives(). - SweepNotificationKeepalives() is the keepalive sweep lifted out of step 4 of CleanupStaleRequests() (same snapshot-under-lock then write-outside-lock; keeps TouchSession() on success). - Shutdown() joins the loop after the accept thread is killed and Stop() has signaled bStopping/StopEvent, but before StopEvent is returned to the pool and before notification streams are closed. - CleanupStaleRequests() keeps the request-timeout / session-expiry / dead-stream reaping on the GameThread. No new lock nesting (per-stream WriteMutex is only taken outside NotificationStreamsMutex); the manual-reset StopEvent gives prompt, busy-spin-free shutdown. KeepaliveIntervalSeconds is unchanged. Verified in a production UE 5.7.1 project: a 90s total GameThread freeze left the notification stream + session intact (no "stream closed", no re-"initialize"), and a post-freeze call worked on the same session. Before the change the same stall dropped the stream. Closes ChiR24#488 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
👋 Thanks for your first Pull Request! We love contributions. Please ensure you have signed off your commits and followed the contribution guidelines. |
📝 WalkthroughSummary by CodeRabbit
WalkthroughThe SSE notification-stream keepalive is extracted from the game-thread ChangesDedicated Keepalive Thread for SSE Notification Streams
Sequence Diagram(s)sequenceDiagram
actor GameThread
participant CleanupStaleRequests
participant KeepaliveThread as RunKeepaliveLoop (background)
participant SweepNotificationKeepalives
participant NotificationStream
rect rgba(200, 100, 100, 0.5)
Note over GameThread,CleanupStaleRequests: Game thread (may stall)
GameThread->>CleanupStaleRequests: Tick (reap expired/dead streams only)
end
rect rgba(100, 150, 200, 0.5)
Note over KeepaliveThread,NotificationStream: Dedicated keepalive thread (unaffected by game-thread stalls)
loop Every ~5s until bStopping
KeepaliveThread->>SweepNotificationKeepalives: call
SweepNotificationKeepalives->>NotificationStream: snapshot eligible streams (under map lock)
SweepNotificationKeepalives->>NotificationStream: write `:keepalive` (outside map lock)
alt success
SweepNotificationKeepalives->>SweepNotificationKeepalives: update LastKeepaliveTime, TouchSession()
else failure
SweepNotificationKeepalives->>SweepNotificationKeepalives: mark stream for removal
end
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
plugins/McpAutomationBridge/Source/McpAutomationBridge/Private/MCP/Transport/McpNativeTransport.h (1)
79-81: ⚡ Quick winKeep the keepalive helpers private.
These helpers are currently public, but
SweepNotificationKeepalives()mutatesLastKeepaliveTime, whose contract says it is accessed only by the dedicated keepalive thread. Move these declarations belowprivate:so external callers cannot accidentally violate that single-writer assumption.Proposed visibility fix
- // Dedicated-thread keepalive (immune to GameThread stalls). - void RunKeepaliveLoop(); - void SweepNotificationKeepalives(); - // FRunnable interface virtual bool Init() override { return true; } virtual uint32 Run() override; virtual void Stop() override; private: + // Dedicated-thread keepalive (immune to GameThread stalls). + void RunKeepaliveLoop(); + void SweepNotificationKeepalives(); +🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/McpAutomationBridge/Source/McpAutomationBridge/Private/MCP/Transport/McpNativeTransport.h` around lines 79 - 81, The methods RunKeepaliveLoop() and SweepNotificationKeepalives() are currently declared in the public section of McpNativeTransport class, but since SweepNotificationKeepalives() mutates LastKeepaliveTime which should only be accessed by the dedicated keepalive thread, these methods must be moved to the private section. Relocate both method declarations from their current public location to below a private: access specifier to enforce this single-writer contract and prevent external callers from accidentally violating it.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In
`@plugins/McpAutomationBridge/Source/McpAutomationBridge/Private/MCP/Transport/McpNativeTransport.h`:
- Around line 79-81: The methods RunKeepaliveLoop() and
SweepNotificationKeepalives() are currently declared in the public section of
McpNativeTransport class, but since SweepNotificationKeepalives() mutates
LastKeepaliveTime which should only be accessed by the dedicated keepalive
thread, these methods must be moved to the private section. Relocate both method
declarations from their current public location to below a private: access
specifier to enforce this single-writer contract and prevent external callers
from accidentally violating it.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 72157f34-7674-4ede-b949-5b1945b60e4b
📒 Files selected for processing (3)
plugins/McpAutomationBridge/Source/McpAutomationBridge/Private/MCP/Transport/McpNativeTransport.hplugins/McpAutomationBridge/Source/McpAutomationBridge/Private/MCP/Transport/McpNativeTransportCleanup.cppplugins/McpAutomationBridge/Source/McpAutomationBridge/Private/MCP/Transport/McpNativeTransportLifecycle.cpp
Problem
The notification-stream SSE keepalive is driven by
CleanupStaleRequests, which runs on the subsystem ticker (the GameThread). When the editor GameThread stalls longer thanKeepaliveIntervalSeconds— recompile, PIE enter/exit, a modal dialog, a blocking asset import — no keepalive is sent, the client's SSE stream idles out, and the MCP session is re-initialized. It shows up as repeated reconnects with no editor restart.Fixes #488.
Change
Move the keepalive onto its own thread so it keeps firing through GameThread stalls:
RunKeepaliveLoop()launched inStart()viaAsync(EAsyncExecution::Thread, …), stored asTFuture<void> KeepaliveLoopFuture. It waits on the existingStopEvent(5s tick) and callsSweepNotificationKeepalives().SweepNotificationKeepalives()is the sweep lifted out of step 4 ofCleanupStaleRequests()— same snapshot-under-NotificationStreamsMutexthen write-outside-lock, keepsTouchSession()on success.Shutdown()joins the loop after the accept thread is killed andStop()has signaledbStopping/StopEvent, but beforeStopEventis returned to the pool and notification streams are closed.CleanupStaleRequests()keeps the request-timeout / session-expiry / dead-stream reaping on the GameThread.No new lock nesting (per-stream
WriteMutexis only taken outsideNotificationStreamsMutex); the manual-resetStopEventgives prompt, busy-spin-free shutdown.KeepaliveIntervalSecondsis unchanged.Verification
Implemented and verified in a production UE 5.7.1 project: a 90s total GameThread freeze left the notification stream + session intact (no
stream closed, no re-initialize), and a post-freeze call worked on the same session. Before the change, the same stall dropped the stream.3 files changed, +85 / −19.