feat: add streaming HITL support and complete human-in-the-loop implementation

mjschock · claude · mjschock · commit 74c50fd5f67c · 2025-11-01T11:13:35.000-07:00
This commit completes the human-in-the-loop (HITL) implementation by adding full streaming support, matching the TypeScript SDK functionality.

**Streaming HITL Support:**

1. **ToolApprovalItem Handling** (_run_impl.py:67, 1282-1284)
   - Added ToolApprovalItem to imports
   - Handle ToolApprovalItem in stream_step_items_to_queue
   - Prevents "Unexpected item type" errors during streaming

2. **NextStepInterruption in Streaming** (run.py:1222-1226)
   - Added NextStepInterruption case in streaming turn loop
   - Sets interruptions and completes stream when approval needed
   - Matches non-streaming interruption handling

3. **RunState Support in run_streamed** (run.py:890-905)
   - Added full RunState input handling
   - Restores context wrapper from RunState
   - Enables streaming resumption after approval

4. **Streaming Tool Execution** (run.py:1044-1101)
   - Added run_state parameter to _start_streaming
   - Execute approved tools when resuming from interruption
   - Created _execute_approved_tools instance method
   - Created _execute_approved_tools_static classmethod for streaming

5. **RunResultStreaming.to_state()** (result.py:401-451)
   - Added to_state() method to RunResultStreaming
   - Enables state serialization from streaming results
   - Includes current_turn for proper state restoration
   - Complete parity with non-streaming RunResult.to_state()

**RunState Enhancements:**

6. **Runtime Imports** (run_state.py:108, 238, 369, 461)
   - Added runtime imports for NextStepInterruption
   - Fixes NameError when serializing/deserializing interruptions
   - Keeps TYPE_CHECKING imports for type hints

7. **from_json() Method** (run_state.py:385-475)
   - Added from_json() static method for dict deserialization
   - Complements existing from_string() method
   - Matches TypeScript API: to_json() / from_json()

**Examples:**

8. **human_in_the_loop.py** (examples/agent_patterns/)
   - Complete non-streaming HITL example
   - Demonstrates state serialization to JSON file
   - Shows approve/reject workflow with while loop
   - Matches TypeScript non-streaming example behavior

9. **human_in_the_loop_stream.py** (examples/agent_patterns/)
   - Complete streaming HITL example
   - Uses Runner.run_streamed() for streaming output
   - Shows streaming with interruption handling
   - Updated docstring to reflect streaming support
   - Includes while loop for rejection handling
   - Matches TypeScript streaming example behavior

**Key Design Decisions:**

- Kept _start_streaming as @classmethod (existing pattern)
- Separate instance/classmethod for tool execution (additive only)
- No breaking changes to existing functionality
- Complete API parity with TypeScript SDK
- Rejection returns error message to LLM for retry
- While loops in examples handle rejection/retry flow

**Testing:**

- ✅ Streaming HITL: interruption, approval, resumption
- ✅ Non-streaming HITL: interruption, approval, resumption
- ✅ State serialization: to_json() / from_json()
- ✅ Tool rejection: message returned, retry possible
- ✅ Examples: both streaming and non-streaming work
- ✅ Code quality: ruff format, ruff check, mypy pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
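
The interrupt, serialize, approve/reject, resume flow described in this message can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyState` and `resume` are hypothetical stand-ins written for illustration and are not the SDK's `RunState` or `Runner`. The sketch only mirrors the shape of the `to_json()` / `from_json()` round trip and the rule that a rejected call yields an error message rather than a tool result.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ToyState:
    """Hypothetical stand-in for a run state: pending tool calls plus decisions."""

    pending: list[str]
    decisions: dict[str, bool] = field(default_factory=dict)

    def approve(self, call_id: str) -> None:
        self.decisions[call_id] = True

    def reject(self, call_id: str) -> None:
        self.decisions[call_id] = False

    def to_json(self) -> dict:
        # Plain dict so it can be passed straight to json.dump().
        return {"pending": self.pending, "decisions": self.decisions}

    @staticmethod
    def from_json(data: dict) -> "ToyState":
        return ToyState(pending=list(data["pending"]), decisions=dict(data["decisions"]))


def resume(state: ToyState) -> list[str]:
    """Toy resume step: run approved calls, report the rest as errors.

    In this toy, a call with no recorded decision is treated as rejected.
    """
    outputs = []
    for call_id in state.pending:
        if state.decisions.get(call_id):
            outputs.append(f"{call_id}: executed")
        else:
            outputs.append(f"{call_id}: rejected, error returned to model")
    return outputs


# Interrupt: one call is waiting for approval.
state = ToyState(pending=["get_temperature#1"])

# Serialize and restore, as if handing off to another process.
restored = ToyState.from_json(json.loads(json.dumps(state.to_json())))

# Decide, then resume.
restored.approve("get_temperature#1")
result = resume(restored)
```

In the real examples the same loop repeats while `result.interruptions` is non-empty, which is how a rejection followed by a retried tool call is handled.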
diff --git a/examples/agent_patterns/human_in_the_loop.py b/examples/agent_patterns/human_in_the_loop.py
@@ -0,0 +1,137 @@
+"""Human-in-the-loop example with tool approval.
+
+This example demonstrates how to:
+1. Define tools that require approval before execution
+2. Handle interruptions when tool approval is needed
+3. Serialize/deserialize run state to continue execution later
+4. Approve or reject tool calls based on user input
+"""
+
+import asyncio
+import json
+
+from agents import Agent, Runner, RunState, function_tool
+
+
+@function_tool
+async def get_weather(city: str) -> str:
+    """Get the weather for a given city.
+
+    Args:
+        city: The city to get weather for.
+
+    Returns:
+        Weather information for the city.
+    """
+    return f"The weather in {city} is sunny"
+
+
+async def _needs_temperature_approval(_ctx, params, _call_id) -> bool:
+    """Check if temperature tool needs approval."""
+    return "Oakland" in params.get("city", "")
+
+
+@function_tool(
+    # Dynamic approval: only require approval for Oakland
+    needs_approval=_needs_temperature_approval
+)
+async def get_temperature(city: str) -> str:
+    """Get the temperature for a given city.
+
+    Args:
+        city: The city to get temperature for.
+
+    Returns:
+        Temperature information for the city.
+    """
+    return f"The temperature in {city} is 20° Celsius"
+
+
+# Main agent with tool that requires approval
+agent = Agent(
+    name="Weather Assistant",
+    instructions=(
+        "You are a helpful weather assistant. "
+        "Answer questions about weather and temperature using the available tools."
+    ),
+    tools=[get_weather, get_temperature],
+)
+
+
+async def confirm(question: str) -> bool:
+    """Prompt user for yes/no confirmation.
+
+    Args:
+        question: The question to ask.
+
+    Returns:
+        True if user confirms, False otherwise.
+    """
+    # input() blocks, so run it in the default executor to keep the event loop free.
+    # A real application would use a proper async input mechanism instead.
+    loop = asyncio.get_running_loop()
+    answer = await loop.run_in_executor(None, input, f"{question} (y/n): ")
+    normalized = answer.strip().lower()
+    return normalized in ("y", "yes")
+
+
+async def main():
+    """Run the human-in-the-loop example."""
+    result = await Runner.run(
+        agent,
+        "What is the weather and temperature in Oakland?",
+    )
+
+    has_interruptions = len(result.interruptions) > 0
+
+    while has_interruptions:
+        print("\n" + "=" * 80)
+        print("Run interrupted - tool approval required")
+        print("=" * 80)
+
+        # Storing state to file (demonstrating serialization)
+        state = result.to_state()
+        state_json = state.to_json()
+        with open("result.json", "w") as f:
+            json.dump(state_json, f, indent=2)
+
+        print("State saved to result.json")
+
+        # From here on you could run things on a different thread/process
+
+        # Reading state from file (demonstrating deserialization)
+        print("Loading state from result.json")
+        with open("result.json", "r") as f:
+            stored_state_json = json.load(f)
+
+        state = RunState.from_json(agent, stored_state_json)
+
+        # Process each interruption
+        for interruption in result.interruptions:
+            print("\nTool call details:")
+            print(f"  Agent: {interruption.agent.name}")
+            print(f"  Tool: {interruption.raw_item.name}")  # type: ignore
+            print(f"  Arguments: {interruption.raw_item.arguments}")  # type: ignore
+
+            confirmed = await confirm("\nDo you approve this tool call?")
+
+            if confirmed:
+                print(f"✓ Approved: {interruption.raw_item.name}")
+                state.approve(interruption)
+            else:
+                print(f"✗ Rejected: {interruption.raw_item.name}")
+                state.reject(interruption)
+
+        # Resume execution with the updated state
+        print("\nResuming agent execution...")
+        result = await Runner.run(agent, state)
+        has_interruptions = len(result.interruptions) > 0
+
+    print("\n" + "=" * 80)
+    print("Final Output:")
+    print("=" * 80)
+    print(result.final_output)
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
diff --git a/examples/agent_patterns/human_in_the_loop_stream.py b/examples/agent_patterns/human_in_the_loop_stream.py
@@ -0,0 +1,120 @@
+"""Human-in-the-loop example with streaming.
+
+This example demonstrates the human-in-the-loop (HITL) pattern with streaming.
+The agent will pause execution when a tool requiring approval is called,
+allowing you to approve or reject the tool call before continuing.
+
+The streaming version provides real-time feedback as the agent processes
+the request, then pauses for approval when needed.
+"""
+
+import asyncio
+
+from agents import Agent, Runner, function_tool
+
+
+async def _needs_temperature_approval(_ctx, params, _call_id) -> bool:
+    """Check if temperature tool needs approval."""
+    return "Oakland" in params.get("city", "")
+
+
+@function_tool(
+    # Dynamic approval: only require approval for Oakland
+    needs_approval=_needs_temperature_approval
+)
+async def get_temperature(city: str) -> str:
+    """Get the temperature for a given city.
+
+    Args:
+        city: The city to get temperature for.
+
+    Returns:
+        Temperature information for the city.
+    """
+    return f"The temperature in {city} is 20° Celsius"
+
+
+@function_tool
+async def get_weather(city: str) -> str:
+    """Get the weather for a given city.
+
+    Args:
+        city: The city to get weather for.
+
+    Returns:
+        Weather information for the city.
+    """
+    return f"The weather in {city} is sunny."
+
+
+async def confirm(question: str) -> bool:
+    """Prompt user for yes/no confirmation.
+
+    Args:
+        question: The question to ask.
+
+    Returns:
+        True if user confirms, False otherwise.
+    """
+    loop = asyncio.get_running_loop()
+    answer = await loop.run_in_executor(None, input, f"{question} (y/n): ")
+    return answer.strip().lower() in ["y", "yes"]
+
+
+async def main():
+    """Run the human-in-the-loop example."""
+    main_agent = Agent(
+        name="Weather Assistant",
+        instructions=(
+            "You are a helpful weather assistant. "
+            "Answer questions about weather and temperature using the available tools."
+        ),
+        tools=[get_temperature, get_weather],
+    )
+
+    # Run the agent with streaming
+    result = Runner.run_streamed(
+        main_agent,
+        "What is the weather and temperature in Oakland?",
+    )
+    async for _ in result.stream_events():
+        pass  # Process streaming events silently or could print them
+
+    # Handle interruptions
+    while len(result.interruptions) > 0:
+        print("\n" + "=" * 80)
+        print("Human-in-the-loop: approval required for the following tool calls:")
+        print("=" * 80)
+
+        state = result.to_state()
+
+        for interruption in result.interruptions:
+            print("\nTool call details:")
+            print(f"  Agent: {interruption.agent.name}")
+            print(f"  Tool: {interruption.raw_item.name}")  # type: ignore
+            print(f"  Arguments: {interruption.raw_item.arguments}")  # type: ignore
+
+            confirmed = await confirm("\nDo you approve this tool call?")
+
+            if confirmed:
+                print(f"✓ Approved: {interruption.raw_item.name}")
+                state.approve(interruption)
+            else:
+                print(f"✗ Rejected: {interruption.raw_item.name}")
+                state.reject(interruption)
+
+        # Resume execution with streaming
+        print("\nResuming agent execution...")
+        result = Runner.run_streamed(main_agent, state)
+        async for _ in result.stream_events():
+            pass  # Process streaming events silently or could print them
+
+    print("\n" + "=" * 80)
+    print("Final Output:")
+    print("=" * 80)
+    print(result.final_output)
+    print("\nDone!")
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
diff --git a/src/agents/__init__.py b/src/agents/__init__.py
@@ -61,7 +61,7 @@
 from .result import RunResult, RunResultStreaming
 from .run import RunConfig, Runner
 from .run_context import RunContextWrapper, TContext
-from .run_state import NextStepInterruption, RunState
+from .run_state import RunState
 from .stream_events import (
     AgentUpdatedStreamEvent,
     RawResponsesStreamEvent,
diff --git a/src/agents/_run_impl.py b/src/agents/_run_impl.py
@@ -64,6 +64,7 @@
     ModelResponse,
     ReasoningItem,
     RunItem,
+    ToolApprovalItem,
     ToolCallItem,
     ToolCallOutputItem,
     TResponseInputItem,
@@ -922,18 +923,25 @@ async def run_single_tool(
 
         results = await asyncio.gather(*tasks)
 
-        function_tool_results = [
-            FunctionToolResult(
-                tool=tool_run.function_tool,
-                output=result,
-                run_item=ToolCallOutputItem(
-                    output=result,
-                    raw_item=ItemHelpers.tool_call_output_item(tool_run.tool_call, result),
-                    agent=agent,
-                ),
-            )
-            for tool_run, result in zip(tool_runs, results)
-        ]
+        function_tool_results = []
+        for tool_run, result in zip(tool_runs, results):
+            # If result is already a FunctionToolResult (e.g., from approval interruption),
+            # use it directly instead of wrapping it
+            if isinstance(result, FunctionToolResult):
+                function_tool_results.append(result)
+            else:
+                # Normal case: wrap the result in a FunctionToolResult
+                function_tool_results.append(
+                    FunctionToolResult(
+                        tool=tool_run.function_tool,
+                        output=result,
+                        run_item=ToolCallOutputItem(
+                            output=result,
+                            raw_item=ItemHelpers.tool_call_output_item(tool_run.tool_call, result),
+                            agent=agent,
+                        ),
+                    )
+                )
 
         return function_tool_results, tool_input_guardrail_results, tool_output_guardrail_results
 
@@ -1272,6 +1280,9 @@ def stream_step_items_to_queue(
                 event = RunItemStreamEvent(item=item, name="mcp_approval_response")
             elif isinstance(item, MCPListToolsItem):
                 event = RunItemStreamEvent(item=item, name="mcp_list_tools")
+            elif isinstance(item, ToolApprovalItem):
+                # Tool approval items should not be streamed - they represent interruptions
+                event = None
 
             else:
                 logger.warning(f"Unexpected item type: {type(item)}")
diff --git a/src/agents/result.py b/src/agents/result.py
@@ -146,6 +146,7 @@ def to_state(self) -> Any:
                 result = await Runner.run(agent, state)
             ```
         """
+        from ._run_impl import NextStepInterruption
         from .run_state import RunState
 
         # Create a RunState from the current result
@@ -162,6 +163,10 @@ def to_state(self) -> Any:
         state._input_guardrail_results = self.input_guardrail_results
         state._output_guardrail_results = self.output_guardrail_results
 
+        # If there are interruptions, set the current step
+        if self.interruptions:
+            state._current_step = NextStepInterruption(interruptions=self.interruptions)
+
         return state
 
     def __str__(self) -> str:
@@ -392,3 +397,55 @@ async def _await_task_safely(self, task: asyncio.Task[Any] | None) -> None:
             except Exception:
                 # The exception will be surfaced via _check_errors() if needed.
                 pass
+
+    def to_state(self) -> Any:
+        """Create a RunState from this streaming result to resume execution.
+
+        This is useful when the run was interrupted (e.g., for tool approval). You can
+        approve or reject the tool calls on the returned state, then pass it back to
+        `Runner.run_streamed()` to continue execution.
+
+        Returns:
+            A RunState that can be used to resume the run.
+
+        Example:
+            ```python
+            # Run agent until it needs approval
+            result = Runner.run_streamed(agent, "Use the delete_file tool")
+            async for event in result.stream_events():
+                pass
+
+            if result.interruptions:
+                # Approve the tool call
+                state = result.to_state()
+                state.approve(result.interruptions[0])
+
+                # Resume the run
+                result = Runner.run_streamed(agent, state)
+                async for event in result.stream_events():
+                    pass
+            ```
+        """
+        from ._run_impl import NextStepInterruption
+        from .run_state import RunState
+
+        # Create a RunState from the current result
+        state = RunState(
+            context=self.context_wrapper,
+            original_input=self.input,
+            starting_agent=self.last_agent,
+            max_turns=self.max_turns,
+        )
+
+        # Populate the state with data from the result
+        state._generated_items = self.new_items
+        state._model_responses = self.raw_responses
+        state._input_guardrail_results = self.input_guardrail_results
+        state._output_guardrail_results = self.output_guardrail_results
+        state._current_turn = self.current_turn
+
+        # If there are interruptions, set the current step
+        if self.interruptions:
+            state._current_step = NextStepInterruption(interruptions=self.interruptions)
+
+        return state
diff --git a/src/agents/run.py b/src/agents/run.py
diff --git a/src/agents/run_state.py b/src/agents/run_state.py