fix: add test_id to normalize_tool_request() to avoid hash collision by msager27 · Pull Request #5233 · llamastack/llama-stack

msager27 · 2026-03-20T21:02:26Z

What does this PR do?

Fix an issue in the recording system in which two tool requests from different tests can hash to the same recording and collide. Here's some context:

While working on PR #5216, which adds vllm provider coverage (using a qwen35 model) to the responses test suite, I was running some tests and creating recordings using vllm+qwen35. In one case, I noticed that an existing gpt recording was changed. It seems both models generated the same web search query, which caused the collision. For reference, I'm including the diff of the changed recording below.

I originally included the fix in PR #5216, but that PR will take some time to resolve as it ties into the discussion of how to add vllm+gpu provider coverage upstream. So, I'm submitting the fix separately. If accepted, I believe the new integration test record action would need to be run to update existing recordings.

Note: the other normalize_* functions already include test_id.
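
To illustrate the collision, here is a minimal sketch. It is not the actual llama-stack code: the function body, the use of SHA-256 over canonical JSON, and the placeholder query are all assumptions made for illustration. It only shows why two tests that issue an identical tool request map to the same recording unless test_id is part of the hashed payload.

```python
import hashlib
import json


def normalize_tool_request(provider, tool_name, kwargs, test_id=None):
    """Hypothetical stand-in for the recorder's helper, not the real implementation."""
    payload = {"provider": provider, "tool_name": tool_name, "kwargs": kwargs}
    if test_id is not None:
        # Assumed shape of the fix: scope the hash to the test that made the request.
        payload["test_id"] = test_id
    canonical = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


GPT_TEST = (
    "tests/integration/responses/test_tool_responses.py::"
    "test_response_non_streaming_web_search[client_with_models-txt=openai/gpt-4o-llama_experts]"
)
QWEN_TEST = (
    "tests/integration/responses/test_tool_responses.py::"
    "test_response_non_streaming_web_search[openai_client-txt=vllm/Qwen/Qwen3.5-35B-A3B-llama_experts]"
)

# Placeholder query: the real query both models produced is not shown in this PR.
same_kwargs = {"query": "placeholder web search query"}

# Without test_id, both tests produce the same hash, so one overwrites the other.
hash_gpt_before = normalize_tool_request("tavily", "web_search", same_kwargs)
hash_qwen_before = normalize_tool_request("tavily", "web_search", same_kwargs)

# With test_id, the same query yields one recording per test.
hash_gpt_after = normalize_tool_request("tavily", "web_search", same_kwargs, test_id=GPT_TEST)
hash_qwen_after = normalize_tool_request("tavily", "web_search", same_kwargs, test_id=QWEN_TEST)

print(hash_gpt_before == hash_qwen_before)
print(hash_gpt_after == hash_qwen_after)
```

A side effect of hashing test_id is that existing recordings no longer match their old hashes, which is why the recordings need to be regenerated after the fix.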

diff --git a/tests/integration/responses/recordings/54aa690e31b5c33a0488a5d7403393e5712917253462292829b37b9320d6df82.json b/tests/integration/responses/recordings/54aa690e31b5c33a0488a5d7403393e5712917253462292829b37b9320d6df82.json
index a8e1e861..8f5af108 100644
--- a/tests/integration/responses/recordings/54aa690e31b5c33a0488a5d7403393e5712917253462292829b37b9320d6df82.json
+++ b/tests/integration/responses/recordings/54aa690e31b5c33a0488a5d7403393e5712917253462292829b37b9320d6df82.json
@@ -1,7 +1,7 @@
 {
-  "test_id": "tests/integration/responses/test_tool_responses.py::test_response_non_streaming_web_search[client_with_models-txt=openai/gpt-4o-llama_experts]",
+  "test_id": "tests/integration/responses/test_tool_responses.py::test_response_non_streaming_web_search[openai_client-txt=vllm/Qwen/Qwen3.5-35B-A3B-llama_experts]",
   "request": {
-    "test_id": "tests/integration/responses/test_tool_responses.py::test_response_non_streaming_web_search[client_with_models-txt=openai/gpt-4o-llama_experts]",
+    "test_id": "tests/integration/responses/test_tool_responses.py::test_response_non_streaming_web_search[openai_client-txt=vllm/Qwen/Qwen3.5-35B-A3B-llama_experts]",
     "provider": "tavily",
     "tool_name": "web_search",
     "kwargs": {

Test Plan

Run something like this before the fix:

[Note: vllm-qwen35 isn't checked in, but this is what I ran locally to repro the issue]

uv run ./scripts/integration-tests.sh --stack-config server:ci-tests --setup vllm-qwen35 --inference-mode record --subdirs responses --pattern "test_response_non_streaming_web_search and openai_client and llama_experts"

One of the gpt recordings will get updated.

Rerun after the fix. Rather than updating an existing recording, a new one is created.
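
One way to check this outcome without reading the git diff by hand is sketched below. The helper names are hypothetical and not part of the repository; the only assumption about the recording format is the top-level "test_id" field visible in the diff above. Take a snapshot of the recordings directory before and after the run, then compare.

```python
import json
from pathlib import Path


def snapshot_test_ids(recordings_dir):
    """Map each recording file name to the top-level test_id stored in it."""
    snapshot = {}
    for path in sorted(Path(recordings_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            snapshot[path.name] = json.load(handle).get("test_id")
    return snapshot


def compare_snapshots(before, after):
    """Return (overwritten, created) recording file names between two snapshots."""
    overwritten = sorted(
        name for name in before if name in after and before[name] != after[name]
    )
    created = sorted(name for name in after if name not in before)
    return overwritten, created


# Usage (paths are from this PR; run from the repository root):
#   before = snapshot_test_ids("tests/integration/responses/recordings")
#   ... run the integration tests in record mode ...
#   after = snapshot_test_ids("tests/integration/responses/recordings")
#   overwritten, created = compare_snapshots(before, after)
```

Before the fix, the gpt recording would be expected to show up in `overwritten`; after the fix, `overwritten` should be empty and the new vllm recording should appear in `created`.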


iamemilio · 2026-03-25T13:45:52Z

there is a known flake in the ollama suite btw

# What does this PR do?

Updates a few responses integration tests based on testing with vLLM.

Some context: I initially tested with vLLM + Qwen3.5 as part of #5216. That PR was more of a staging effort and will be mostly obsoleted by #5297 and get closed. However, there are a few changes from that PR that I've pulled into separate PRs:

1. This PR which makes one of the web search tests more flexible in its validation (plus a couple skips when provider is vllm)
2. #5233

## Test Plan

Rerun the responses web search tests and verify they work as expected

Co-authored-by: Sébastien Han <seb@redhat.com>

fix: add test_id to normalize_tool_request() to avoid hash collision

37e0707

msager27 requested review from ashwinb, bbrowning, cdoern, ehhuang, franciscojavierarceo, leseb, mattf and raghotham as code owners March 20, 2026 21:02

meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Mar 20, 2026

mergify bot and others added 4 commits March 23, 2026 17:49

Merge branch 'main' into fix_normalize_tool_request

8f001b7

test: add new web_search recordings for responses tests

82bdce2

Merge remote-tracking branch 'origin/fix_normalize_tool_request' into…

2864055

… fix_normalize_tool_request

test: add new web_search recordings for azure responses tests

23db95c

iamemilio mentioned this pull request Mar 24, 2026

fix: stabilize recording hash normalization to reduce flaky integration tests #5252

Closed

3 tasks

leseb and others added 3 commits March 24, 2026 17:25

Merge branch 'main' into fix_normalize_tool_request

b139a19

Merge branch 'main' into fix_normalize_tool_request

591543d

Merge branch 'main' into fix_normalize_tool_request

66a0949

leseb and others added 2 commits March 25, 2026 16:43

Merge branch 'main' into fix_normalize_tool_request

d2c2780

Merge branch 'main' into fix_normalize_tool_request

824ceeb

msager27 mentioned this pull request Mar 26, 2026

test: Update responses tests based on vllm testing #5328

Merged

msager27 wants to merge 10 commits into llamastack:main from msager27:fix_normalize_tool_request

3 participants