Conversation

@Edison-A-N
Contributor

Fix Race Condition in StreamableHTTP Transport (Closes #1363)

Motivation and Context

Starting from v1.12.0, MCP servers in HTTP Streamable mode experience a race condition that causes ClosedResourceError exceptions when requests fail validation early (e.g., due to incorrect Accept headers). This issue affects server reliability and can be reproduced consistently with fast-failing requests.

The race condition occurs because:

  1. Message router enters async for write_stream_reader loop
  2. write_stream_reader calls checkpoint() in receive(), yielding control
  3. Request validation fails early and returns immediately
  4. Transport termination closes all streams including write_stream_reader
  5. Message router resumes and encounters closed stream, raising ClosedResourceError

This fix ensures graceful handling of stream closure scenarios without propagating exceptions that could destabilize the server.

How Has This Been Tested?

Test Suite

Added comprehensive test suite in tests/issues/test_1363_race_condition_streamable_http.py that reproduces the race condition:

  1. Invalid Accept Headers Test:

    • Missing application/json in Accept header
    • Missing text/event-stream in Accept header
    • Completely invalid Accept header
  2. Invalid Content-Type Test:

    • Incorrect Content-Type header
  3. Log Analysis:

    • Captures server logs from separate process
    • Verifies no ClosedResourceError exceptions occur
    • Checks for "Error in message router" messages
    • Validates graceful error handling

Test Execution

  • Tests run in isolated processes to capture real server behavior
  • Server runs in stateless mode to trigger race condition
  • Multiple request scenarios tested to ensure comprehensive coverage
  • Log analysis confirms fix prevents exception propagation

Breaking Changes

None. This is a bug fix that maintains full backward compatibility.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

Implementation Details

The fix adds explicit exception handling for anyio.ClosedResourceError in the message router loop:

except anyio.ClosedResourceError:
    if self._terminated:
        logging.debug("Read stream closed by client")
    else:
        logging.exception("Unexpected closure of read stream in message router")

This approach:

  • Graceful Handling: Prevents exception propagation that could crash the server
  • Smart Logging: Distinguishes between expected termination and unexpected closure
  • Minimal Impact: No performance overhead or behavioral changes
  • Robust: Handles the race condition without complex synchronization

Related Issues

@Edison-A-N Edison-A-N requested review from a team and felixweinberger September 21, 2025 06:12
@maxisbey maxisbey added the bug (Something isn't working) label Sep 22, 2025
@thomasst

This seems to silence the error. Is this the correct approach given that for me (and others in #1219 / #1190) the error happens on every request, so it doesn't appear to just be a race condition?

@Edison-A-N
Contributor Author

Hi,

In anyio's Implementation

1. Conditions for Iteration Termination

Class inheritance:

MemoryObjectReceiveStream -> ObjectReceiveStream -> UnreliableObjectReceiveStream

As we can see in the implementation of UnreliableObjectReceiveStream.__anext__:

async def __anext__(self) -> T_co:
    try:
        return await self.receive()
    except EndOfStream:
        raise StopAsyncIteration from None

That is, the EndOfStream exception will terminate the iteration.
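This can be seen in isolation with a small sketch independent of the SDK: once the sending side closes, the pending EndOfStream is converted into StopAsyncIteration and the loop simply ends:

```python
import anyio

async def drain() -> list[str]:
    send, recv = anyio.create_memory_object_stream(1)
    await send.send("hello")
    await send.aclose()  # close the sending side: the reader will see EndOfStream
    # __anext__ converts EndOfStream into StopAsyncIteration, so the loop just ends
    return [item async for item in recv]

items = anyio.run(drain)
print(items)  # ['hello']
```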

2. When to Raise EndOfStream or ClosedResourceError

MemoryObjectReceiveStream.receive -> receive_nowait:

def receive_nowait(self) -> T_co:
    """
    Receive the next item if it can be done without waiting.

    :return: the received item
    :raises ~anyio.ClosedResourceError: if this send stream has been closed
    :raises ~anyio.EndOfStream: if the buffer is empty and this stream has been
        closed from the sending end
    :raises ~anyio.WouldBlock: if there are no items in the buffer and no tasks
        waiting to send
    """

All ClosedResourceError exceptions are based on this check:

if self._closed:
    raise ClosedResourceError

And self._closed becomes True when close() is called:

class MemoryObjectReceiveStream:
    ...
    ...
    def close(self) -> None:
        """
        Close the stream.

        This works the exact same way as :meth:`aclose`, but is provided as a special
        case for the benefit of synchronous callbacks.

        """
        if not self._closed:
            self._closed = True
            self._state.open_receive_channels -= 1
            if self._state.open_receive_channels == 0:
                send_events = list(self._state.waiting_senders.keys())
                for event in send_events:
                    event.set()

Review of Known Issues

In issue #1219, the debug information clearly shows _closed = True (visible in the second screenshot's debug view).

The traceback in issue #1190 also points to the same root cause: the error occurs when self._closed is True.

In fact, looking at the anyio implementation above, it's very clear that ClosedResourceError is raised because the stream has been closed.

Why This Implementation is Appropriate

This implementation is not "silencing the error". In fact, in scenarios where multiple coroutines operate on the same stream simultaneously, checking whether the stream has been closed is a necessary operation. Since anyio.MemoryObjectReceiveStream chooses to raise ClosedResourceError rather than terminating iteration automatically, we need to handle that closure explicitly during the for loop iteration.

When handling the exception, we also check self._terminated, so callers can tell whether the ClosedResourceError resulted from an intentional shutdown. If it did not, the handler still emits logger.exception to the logs.

@ofek

ofek commented Oct 21, 2025

I describe an easy way to reproduce this issue here #1190 (comment)

Your test case is quite similar to the underlying implementation of the example.

@ofek ofek left a comment

The test file needs a newline at the end.

@Edison-A-N
Contributor Author

Hi, all! I recently revisited how this exception occurs, and here are some additional insights I'd like to share. Any feedback and guidance is welcome!

Core Problem

A synchronous code path runs to completion and closes the stream before the suspended checkpoint() can resume, which leads to ClosedResourceError.

Test Case Analysis

Based on three test scenarios in test_1363_race_condition_streamable_http.py:

  1. Invalid Accept Headers - Missing text/event-stream or application/json
  2. Invalid Content-Type - Not application/json
  3. JSON Response Mode - Specific code path when json_response=True

Execution Flow Analysis

1. System Startup Phase

# streamable_http_manager.py:170-187
async def run_stateless_server():
    async with http_transport.connect() as streams:
        # Start message router task
        tg.start_soon(message_router)
        # Start MCP server
        await self.app.run(streams)

2. Message Router Suspension

# streamable_http.py:831
async def message_router():
    async for session_message in write_stream_reader:  # Key point
        # Process message routing
        # After processing, return to loop start
        # Call checkpoint() again to suspend and wait for next message

Key Mechanism: The async for loop internally calls checkpoint() to yield control and wait for new messages.

3. Request Processing Phase

3.1 Invalid Request Headers Scenario (Fast Failure)

# streamable_http.py:315-323
async def _handle_post_request():
    # Synchronous Accept header validation
    has_json, has_sse = self._check_accept_headers(request)  # Synchronous
    if not (has_json and has_sse):
        response = self._create_error_response(...)  # Synchronous
        await response(scope, receive, send)  # Only yield point
        return  # Immediate return

3.2 JSON Response Mode Scenario (After Processing Response)

# streamable_http.py:397-439
if self.is_json_response_enabled:

    # Wait for response
    async for event_message in request_stream_reader:  # Line 408
        if isinstance(event_message.message.root, JSONRPCResponse | JSONRPCError):
            response_message = event_message.message
            break

    # Send response
    response = self._create_json_response(response_message)
    await response(scope, receive, send)

    # Clean up resources
    await self._clean_up_memory_streams(request_id)  # Synchronous

4. Transport Termination Phase

# streamable_http_manager.py:189-193
# Immediately terminate after processing request
await http_transport.handle_request(scope, receive, send)
await http_transport.terminate()  # Immediately close streams

# streamable_http.py:623-653
async def terminate(self):
    self._terminated = True
    # Close all streams
    if self._write_stream_reader is not None:
        await self._write_stream_reader.aclose()  # Close stream used by message router

Precise Timing of Race Condition

Key Timeline

Timeline:
T1: Message router starts, enters async for loop
T2: write_stream_reader.receive() calls checkpoint() and suspends
T3: Main coroutine processes request, validation fails or completes
T4: Main coroutine sends response (await response) - No cooperation point
T5: Main coroutine executes synchronous cleanup code - No cooperation point
T6: Main coroutine calls terminate() - Synchronous code
T7: terminate() closes write_stream_reader - Synchronous code
T8: Message router resumes, tries to continue iteration
T9: Discovers stream is closed, throws ClosedResourceError

Core Problem

1. When validation fails early (T3), the main coroutine executes synchronous error handling and immediately terminates the transport, causing the same race condition as described in the test cases.

2. After T4, the main coroutine executes all synchronous code with no opportunity to yield control, preventing the message router from completing its current iteration before the stream is closed.

Root Cause

The message router suspends via checkpoint() in the async for loop, while the main coroutine executes synchronous code and quickly closes the stream without giving the message router a chance to complete its current iteration.

Solution

Direct ClosedResourceError Handling

Referencing the handling practice for request_stream, the appropriate solution is to directly catch ClosedResourceError:

# streamable_http.py:862-871 (existing code)
if request_stream_id in self._request_streams:
    try:
        # Send both the message and the event ID
        await self._request_streams[request_stream_id][0].send(EventMessage(message, event_id))
    except (
        anyio.BrokenResourceError,
        anyio.ClosedResourceError,  # Already catching this error
    ):
        # Stream might be closed, remove from registry
        self._request_streams.pop(request_stream_id, None)

Solution: Directly catch ClosedResourceError in the message router's async for loop:

# streamable_http.py:829-887
async def message_router():
    try:
        async for session_message in write_stream_reader:
            # Process message routing logic
            # ...
    except anyio.ClosedResourceError:
        # Stream closed, graceful exit
        if self._terminated:
            logger.debug("Read stream closed by client")
        else:
            logger.debug("Read stream closed unexpectedly")
    except Exception:
        logger.exception("Error in message router")

This way, when the main coroutine closes the stream, the message router will gracefully catch ClosedResourceError and exit, rather than letting the exception propagate.

I would like to emphasize once again that directly catching the error is appropriate and correct. Of course, I also welcome more guidance on this solution, and I'm very happy to learn from everyone. 😊😊

chetan-jarande added a commit to chetan-jarande/openai-batch-tracker that referenced this pull request Oct 30, 2025
Notes:
- Disable returning JSON responses when running in stateless mode. This change prevents the
`anyio.ClosedResourceError` that can occur when the response stream is closed unexpectedly.
- This resolves an observed anyio.ClosedResourceError.
More discussion and context are available in the linked threads.
- An alternative is to catch this error inside message_router, handle it gracefully, and log it
instead of letting it propagate to the top-level; that approach is left for a follow-up.

Github Issue Threads:
- jlowin/fastmcp#2083
- modelcontextprotocol/python-sdk#1384 (comment)
@ofek

ofek commented Dec 2, 2025

Is this blocked on reviews?

Member

@Kludex Kludex left a comment

I think we need to stop using uvicorn in every test in this repository. It kills me (and the test suite...).

We shouldn't rely on network requests in the test suite.


That said, the change seems fine.

@maxisbey Can you please decide what you want to do with the test here?

@maxisbey
Contributor

maxisbey commented Dec 3, 2025

I think we should merge for now since it's a high priority issue and fixes a bug affecting people. Fixing the uvicorn usage in the test suite is something that's been mentioned a few times and we should address in follow-ups.

@maxisbey maxisbey enabled auto-merge (squash) December 3, 2025 19:12
@ofek

ofek commented Dec 3, 2025

It looks like the merge commit that was just introduced impacted coverage and now CI is failing because a handful of lines are no longer covered.

auto-merge was automatically disabled December 4, 2025 09:21

Head branch was pushed to by a user without write access

@Kludex Kludex merged commit 9ed0b93 into modelcontextprotocol:main Dec 4, 2025
18 checks passed
@Edison-A-N Edison-A-N deleted the fix/race-condition-streamable-http-1363 branch December 4, 2025 11:16
Labels

  • bug: Something isn't working
  • needs maintainer action: Potentially serious issue - needs proactive fix and maintainer attention

Development

Successfully merging this pull request may close these issues:

Race Condition in StreamableHTTP Transport Causes ClosedResourceError