fix: add test_id to normalize_tool_request() to avoid hash collision #5233
Open
msager27 wants to merge 10 commits into llamastack:main from
… fix_normalize_tool_request
3 tasks
Contributor
there is a known flake in the ollama suite btw
leseb added a commit that referenced this pull request Mar 30, 2026
# What does this PR do?

Updates a few responses integration tests based on testing with vLLM.

Some context: I initially tested with vLLM + Qwen3.5 as part of #5216. That PR was more of a staging effort and will be mostly obsoleted by #5297 and get closed. However, there are a few changes from that PR that I've pulled into separate PRs:

1. This PR, which makes one of the web search tests more flexible in its validation (plus a couple of skips when the provider is vllm)
2. #5233

## Test Plan

Rerun the responses web search tests and verify they work as expected.

Co-authored-by: Sébastien Han <seb@redhat.com>
What does this PR do?
Fix an issue in the recording system in which two different requests can hash to the same recording and collide. Here's some context:
While working on PR #5216, which adds vllm provider coverage (using a qwen35 model) to the responses test suite, I was running some tests and creating recordings using vllm+qwen35. In one case, I noticed that an existing gpt recording had been changed. It seems the model generated the same web search query, causing the collision. For reference, I'm including the diff of the changed recording below.
I originally included the fix in PR #5216, but that PR will take some time to resolve because it ties into the discussion of how to add vllm+gpu provider coverage upstream. So, I'm submitting the fix separately. If accepted, I believe the new integration test record action would need to be run to update existing recordings.
Note: the other normalize_* functions already include test_id.
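To illustrate the collision described above, here is a minimal sketch of the idea, not the actual llama-stack code. The helper name `request_hash`, its parameters, and the example test ids are all hypothetical; the sketch only assumes that a recording key is derived by hashing a normalized request. It shows that two tests issuing the same web search query produce the same key unless `test_id` is part of the hashed payload.

```python
import hashlib
import json
from typing import Optional


def request_hash(tool_name: str, kwargs: dict, test_id: Optional[str] = None) -> str:
    """Hypothetical helper: derive a stable recording key from a tool request."""
    normalized = {"tool_name": tool_name, "kwargs": kwargs}
    if test_id is not None:
        # Including the test id scopes the key to a single test.
        normalized["test_id"] = test_id
    payload = json.dumps(normalized, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two different models happen to generate the same web search query.
query = {"query": "example search query"}

# Without test_id, both runs map to the same key, so the second run
# overwrites the first recording.
assert request_hash("web_search", query) == request_hash("web_search", query)

# With test_id (hypothetical ids), each test gets its own recording.
gpt_key = request_hash("web_search", query, test_id="test_web_search[gpt]")
qwen_key = request_hash("web_search", query, test_id="test_web_search[qwen]")
assert gpt_key != qwen_key
```

The trade-off in this sketch is that identical requests from different tests are no longer shared, which is why existing recordings would need to be regenerated once the key changes.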
Test Plan
Run something like this before the fix:
[Note: vllm-qwen35 isn't checked in, but this is what I ran locally to repro the issue]
uv run ./scripts/integration-tests.sh --stack-config server:ci-tests --setup vllm-qwen35 --inference-mode record --subdirs responses --pattern "test_response_non_streaming_web_search and openai_client and llama_experts"
One of the gpt recordings will get updated.
Rerun after the fix. Rather than updating an existing recording, a new one is created.