fix(tracing): improve error handling in sync generator finalization #506


Open · wants to merge 1 commit into main

Conversation

viniciusdsmello (Contributor)

Fix Context Variable Error in Sync Generator Tracing for Multi-threaded Environments

Problem Description

The @trace decorator for synchronous generators crashes in multi-threaded web frameworks like FastAPI and OpenWebUI with the following error:

ValueError: <Token var=<ContextVar name='current_step' at 0xffff98e7e520> at 0xffff982d7280> was created in a different Context

Root Cause

This error occurs when:

  1. A traced generator function is called in one execution context (main thread)
  2. The generator is consumed in a different execution context (worker thread via starlette.concurrency.iterate_in_threadpool)
  3. The _current_step.reset(token) call in _finalize_sync_generator_step() attempts to reset a context variable token that was created in the original context
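
The contextvars behavior behind step 3 can be reproduced in isolation, with no Openlayer code involved (a minimal sketch; the variable names are illustrative):

```python
# A Token can only be reset in the Context where it was created; resetting it
# from another thread (which runs in its own Context) raises ValueError.
import contextvars
import threading

current_step = contextvars.ContextVar("current_step")
token = current_step.set("step-1")  # token bound to the main thread's context

errors = []

def reset_in_worker():
    try:
        current_step.reset(token)
    except ValueError as exc:
        errors.append(exc)  # "... was created in a different Context"

worker = threading.Thread(target=reset_in_worker)
worker.start()
worker.join()
print(type(errors[0]).__name__)  # ValueError
```

This is exactly the failure mode hit when Starlette moves generator consumption into its thread pool.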

Example Scenario

# FastAPI/OpenWebUI pattern that triggers the bug:
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.post("/stream")
async def stream_endpoint():
    @trace()  # the Openlayer trace decorator
    def my_generator():
        yield "chunk1"
        yield "chunk2"

    # Generator created in the main async context
    gen = my_generator()

    # FastAPI consumes the generator in a thread pool - CRASH!
    return StreamingResponse(gen)

The crash happens because each thread runs in its own contextvars Context, and a Token can only be reset in the Context where it was created, so tokens cannot be reset across thread boundaries.
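
This thread isolation is easy to observe directly (a minimal sketch):

```python
# Each thread runs in its own top-level Context, so a value set in the main
# thread is invisible to a worker thread (threads do not inherit context).
import contextvars
import threading

var = contextvars.ContextVar("var", default="unset")
var.set("main-value")

seen = []
worker = threading.Thread(target=lambda: seen.append(var.get()))
worker.start()
worker.join()
print(seen)  # ['unset'] - the worker does not see the main thread's value
```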

Impact

  • Application Crashes: Complete request failure in web frameworks
  • No Trace Data: Traces are lost due to finalization failure
  • Blocking Adoption: Prevents use of Openlayer tracing in FastAPI/OpenWebUI streaming scenarios

Proposed Solution

Add graceful error handling for context variable operations in multi-threaded scenarios:

Changes Made

  1. Protected Context Variable Reset:

    def _finalize_sync_generator_step(...):
        try:
            _current_step.reset(token)
        except ValueError:
            # Context variable was created in a different context
            logger.debug("Context variable reset failed - generator consumed in different context")
  2. Protected Finalization Calls:

    try:
        _finalize_sync_generator_step(...)
    except Exception as finalize_exc:
        logger.error(f"Error finalizing sync generator step: {finalize_exc}")
    # Generator continues yielding chunks normally
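
Put together, the guarded pattern lets a generator created in one thread finish cleanly in another. The sketch below uses hypothetical, simplified names (`traced`, `inner`); the real decorator is more involved, but the error-handling shape is the same:

```python
import contextvars
import logging
import threading

logger = logging.getLogger("tracing")
_current_step = contextvars.ContextVar("current_step")

def traced(gen_func):
    """Hypothetical stand-in for the @trace() sync-generator path."""
    def wrapper(*args, **kwargs):
        token = _current_step.set("step")  # set in the *calling* context
        def inner():
            try:
                yield from gen_func(*args, **kwargs)
            finally:
                try:
                    _current_step.reset(token)  # may run in a worker thread
                except ValueError:
                    logger.debug("reset failed - consumed in different context")
        return inner()
    return wrapper

@traced
def chunks():
    yield "a"
    yield "b"

gen = chunks()  # token created in the main thread
out = []
worker = threading.Thread(target=lambda: out.extend(gen))
worker.start()
worker.join()
print(out)  # ['a', 'b'] - the ValueError is swallowed, streaming completes
```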

Behavior Changes

Scenario                 | Before                  | After
-------------------------|-------------------------|--------------------------------
Same-thread consumption  | Works normally          | Works normally (no change)
Cross-thread consumption | Crashes with ValueError | Works with debug log
Trace data collection    | Lost on crash           | Preserved and sent to Openlayer
Application streaming    | Broken                  | Continues normally

Why This Solution is Safe

1. Preserves Normal Operation

  • Same-thread scenarios continue to work exactly as before
  • All tracing functionality is maintained
  • No performance impact on normal use cases

2. Automatic Memory Management

  • Context variables are automatically cleaned up when execution contexts end
  • No manual cleanup needed in cross-context scenarios
  • Python's garbage collector handles Step object cleanup

3. Limited Impact Scope

  • Context variable pollution is isolated to individual threads
  • Most application code doesn't call get_current_step() directly
  • Error only affects internal tracing state management

4. Comprehensive Error Handling

  • Finalization errors are logged but don't crash the application
  • Trace data is still collected and sent to Openlayer
  • Streaming behavior is preserved

Alternative Solutions Considered

1. Store Original Context Reference

class TracedSyncGenerator:
    def __init__(self):
        self._original_context = contextvars.copy_context()
    
    def finalize(self):
        self._original_context.run(lambda: _current_step.reset(self._token))

Rejected because:

  • Adds significant complexity
  • Requires storing context references
  • May have unexpected interactions with async frameworks
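
For reference, making this alternative work correctly also requires that the set() happen inside the stored Context: a token created outside `ctx` cannot be reset inside it either, which adds to the complexity noted above (illustrative sketch):

```python
import contextvars

_current_step = contextvars.ContextVar("current_step")

# copy_context() creates a *new* Context object; a token is tied to the
# Context in which set() ran, so both set and reset must run inside ctx.
ctx = contextvars.copy_context()
token = ctx.run(_current_step.set, "step")  # token now belongs to ctx

# Later - even from another thread, as long as ctx is not entered
# concurrently - the reset must also happen inside the same ctx:
ctx.run(_current_step.reset, token)

value_after = ctx.run(lambda: _current_step.get("unset"))
print(value_after)  # 'unset' - reset succeeded inside the stored context
```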

2. Disable Generator Tracing in Multi-threaded Environments

if threading.current_thread() != threading.main_thread():
    # Skip tracing
    return original_function

Rejected because:

  • Reduces observability in production environments
  • Many legitimate use cases would lose tracing
  • Difficult to detect multi-threaded consumption patterns

3. Require Manual Context Management

@trace(context_aware=True)
def my_generator(context=None):
    # User manages context manually

Rejected because:

  • Breaking API change
  • Shifts complexity to end users
  • Incompatible with existing decorator patterns

4. Use Weak References for Memory Safety

import weakref
self._step_ref = weakref.ref(step)

Rejected because:

  • Over-engineering for this specific problem
  • Adds complexity without clear benefit
  • Normal garbage collection is sufficient

Testing

Test Coverage

  1. Same-thread consumption: Verified normal operation continues
  2. Cross-thread consumption: Verified no crashes, traces still created
  3. Multiple concurrent generators: Verified thread isolation
  4. Error scenarios: Verified graceful handling without application impact
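
The thread-isolation check in item 3 can be sketched as follows (without the decorator, since this only exercises the contextvars behavior the fix relies on):

```python
# Each worker task sets its own "current step" and reads back only its own
# value; results come back in submission order via executor.map.
import contextvars
from concurrent.futures import ThreadPoolExecutor

_current_step = contextvars.ContextVar("current_step", default=None)

def run_one(tag):
    _current_step.set(tag)  # visible only in this worker's context
    chunks = [f"{tag}-{i}" for i in range(3)]
    return _current_step.get(), chunks

with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(run_one, ["a", "b", "c", "d"]))

print([tag for tag, _ in results])  # ['a', 'b', 'c', 'd'] - no cross-thread leakage
```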

Validation Script

from concurrent.futures import ThreadPoolExecutor

# Assumes the Openlayer `trace` decorator is already in scope
@trace()
def streaming_generator():
    for i in range(5):
        yield f"chunk {i}"

# Test cross-thread consumption (previously crashed)
with ThreadPoolExecutor() as executor:
    future = executor.submit(list, streaming_generator())
    result = future.result()  # Works without crash
    assert result == [f"chunk {i}" for i in range(5)]

Backward Compatibility

  • API: No changes to public API
  • Behavior: Same-thread scenarios unchanged
  • Dependencies: No new dependencies
  • Configuration: No configuration changes required

Risk Assessment

Low Risk

  • Single-threaded applications: No changes in behavior
  • Same-thread generator consumption: Normal cleanup continues
  • Short-lived generators: Minimal memory impact

Medium Risk

  • High-throughput multi-threaded servers: Context variables may persist longer
  • Long-running worker threads: Potential for context variable accumulation

Mitigation

  • Context variables are automatically cleaned up when threads/contexts end
  • Debug logging provides visibility into cross-context scenarios
  • Error handling prevents application crashes

Conclusion

This change enables Openlayer tracing to work correctly in modern async web frameworks while maintaining full backward compatibility. The solution is conservative, focusing on graceful error handling rather than complex architectural changes.

The fix addresses a critical blocker for adoption in FastAPI, OpenWebUI, and similar frameworks without compromising the reliability or performance of existing use cases.

viniciusdsmello marked this pull request as ready for review August 15, 2025 13:01