[responsesAPI][5] ResponsesParser with tools for full MCP python loop #29798
Conversation
Documentation preview: https://vllm--29798.org.readthedocs.build/en/29798/
cc @chaunceyjiang @yeqcharlotte ready for review (:
💡 Codex Review

vllm/vllm/entrypoints/openai/serving_responses.py, lines 312 to 314 in a57e4d8:

create_responses now immediately imports fbvscode and calls …

vllm/vllm/entrypoints/context.py, lines 281 to 285 in a57e4d8:

call_python_tool creates the ResponseFunctionToolCallOutputItem with a new random call_id instead of reusing the call_id from the preceding function_call. When the next turn is rendered, construct_input_messages uses the output item’s call_id as the tool_call_id (see vllm/entrypoints/responses_utils.py:141-146), so the tool result is associated with an id that does not match the assistant’s tool call. This breaks multi-turn python/mcp tool conversations under ParsableContext because the model cannot link tool output to the original call. (vllm/entrypoints/context.py:281-285)
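For illustration, a minimal sketch of the fix this finding implies: reuse the call_id of the preceding function_call when constructing the tool output item, rather than minting a fresh one. The dataclasses below are stand-ins for vLLM's actual item types; their fields are assumptions, not copied from the PR.

```python
from dataclasses import dataclass


@dataclass
class FunctionCall:
    """Stand-in for the assistant's function_call item."""

    call_id: str
    name: str
    arguments: str


@dataclass
class FunctionCallOutput:
    """Stand-in for ResponseFunctionToolCallOutputItem."""

    call_id: str
    output: str


def build_tool_output(call: FunctionCall, result: str) -> FunctionCallOutput:
    # Propagate call.call_id instead of generating a random id, so that
    # construct_input_messages renders a tool message whose tool_call_id
    # matches the assistant's original tool call on the next turn.
    return FunctionCallOutput(call_id=call.call_id, output=result)
```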
```python
reasoning_parser_cls: Callable[[AnyTokenizer], ReasoningParser],
response_messages: list[ResponseInputOutputItem],
request: ResponsesRequest,
tool_parser_cls,
```
type?
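One way to answer that, mirroring the reasoning_parser_cls parameter above; the import paths for ToolParser and AnyTokenizer are assumptions, not taken from this diff:

```python
from collections.abc import Callable

from vllm.entrypoints.openai.tool_parsers import ToolParser  # assumed path
from vllm.transformers_utils.tokenizer import AnyTokenizer  # assumed path

tool_parser_cls: Callable[[AnyTokenizer], ToolParser]
```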
```python
    This function converts parsable context types to harmony and
    back so we can use GPTOSS demo python tool
    """
    from vllm.entrypoints.context import ParsableContext
```
Why not import it at the top?
it was similar for HarmonyContext. I think if we move it to the top we get a circular import
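A minimal sketch of the pattern being discussed, with illustrative module layout: deferring the import to call time breaks the cycle because both module bodies have finished loading before the function ever runs.

```python
# responses_utils.py (illustrative) -- imports context.py lazily, while
# context.py imports this module at the top level. Top-level imports in
# both directions would fail with an ImportError during startup.


def convert_context(ctx):
    # Deferred import: resolved on first call, after both modules are
    # fully initialized, so the import cycle never triggers.
    from vllm.entrypoints.context import ParsableContext

    if isinstance(ctx, ParsableContext):
        return ctx  # conversion elided in this sketch
    raise TypeError(f"unexpected context type: {type(ctx)!r}")
```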
chaunceyjiang left a comment:
Thanks~
yeqcharlotte left a comment:
for all the oai entrypoint logic we add, can we introduce some unit tests? also, how adaptable is it for the anthropic apis?
```diff
 VLLM_USE_EXPERIMENTAL_PARSER_CONTEXT="1",
-# uncomment for tool calling
-# PYTHON_EXECUTION_BACKEND="dangerously_use_uv",
+PYTHON_EXECUTION_BACKEND="dangerously_use_uv",
```
oh why was this commented before? did it have issues with ci?
I left it there in a previous PR because we didn't have tool calling yet, so it wasn't necessary. There weren't any CI issues
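For anyone reproducing this outside CI, a sketch of launching a server with both settings; the env var names come from the hunk above, and the model name is a placeholder:

```python
import os
import subprocess

env = dict(os.environ)
env["VLLM_USE_EXPERIMENTAL_PARSER_CONTEXT"] = "1"
env["PYTHON_EXECUTION_BACKEND"] = "dangerously_use_uv"

# Start a vLLM OpenAI-compatible server with the experimental parser
# context and the uv-based python execution backend enabled.
subprocess.run(["vllm", "serve", "openai/gpt-oss-20b"], env=env, check=True)
```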
```python
def need_builtin_tool_call(self) -> bool:
    """Return true if the last message is a MCP tool call"""
    last_message = self.parser.response_messages[-1]
    # TODO: figure out which tools are MCP tools
    if (  # noqa: SIM103
        last_message.type == "function_call"
        and last_message.name in ("code_interpreter", "python")
    ):
        return True

    return False
```
this format is quite bad lol. let's directly check the condition. also, should we hardcode "code_interpreter" and "python" here? i remember @alecsolder made the changes to centralize all tools to go through the mcp tool type.

```python
if xxxx:
    return True
return False
```
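The direct-condition version being asked for would look roughly like this (keeping the hardcoded tool names until the follow-up cleanup centralizes them):

```python
def need_builtin_tool_call(self) -> bool:
    """Return true if the last message is a MCP tool call."""
    last_message = self.parser.response_messages[-1]
    # TODO: figure out which tools are MCP tools
    return last_message.type == "function_call" and last_message.name in (
        "code_interpreter",
        "python",
    )
```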
I was thinking to clean up the code in #29989, which will include the browser & container tools, if that's okay? This PR is just to complete the ability to call only the python tool lol
I added some unit tests in this PR in tests/entrypoints/openai/test_response_api_parsable_context.py :)
Purpose
This PR is part 2 of the ResponsesParser work: it adds the tool parser for the ResponsesParser and the ability to run an MCP python tool loop.
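For context, a sketch of what the completed loop enables from the client side, using the OpenAI Responses API shape for the code interpreter tool; the server URL and model name are placeholders:

```python
from openai import OpenAI

# Point the official client at a locally running vLLM server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# The server parses the model's tool call, runs the python tool, feeds the
# output back with a matching call_id, and loops until a final answer.
response = client.responses.create(
    model="openai/gpt-oss-20b",  # placeholder
    input="Use python to compute the 10th Fibonacci number.",
    tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
)
print(response.output_text)
```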
Not in this PR
Test Plan
Added unit tests, and tested the following manually:

- Minimax M2
- Kimi K2