[Responses API] Sanitize leaked Harmony control tokens in tool names and recipients#35906

Open
will-deines wants to merge 9 commits into vllm-project:main from will-deines:harmony-token-sanitization

Conversation


@will-deines will-deines commented Mar 3, 2026

Recreated from #35881, which was closed when the fork was temporarily made private.

Summary

GPT-OSS models leak Harmony protocol control tokens (<|channel|>, <|constrain|>, <|start|>, <|end|>, <|message|>) into tool names and recipient fields during generation. This causes:

  • Tool name contamination — e.g. manage_cart<|channel|>commentary instead of manage_cart, corrupting function call routing and causing infinite tool-call loops
  • <|constrain|> as recipient — e.g. <|constrain|>json matches no routing pattern, falls through to MCP handler or raises errors
  • Missing <|start|> between channels — model omits start token between consecutive outputs, causing StreamableParser to throw HarmonyError
  • Malformed <|constrain|> in headers — produces garbage in recipient or content_type fields
  • Free text between channel messages — model outputs trailing plain text after <|end|> before starting the next channel message, causing HarmonyError in EXPECT_START state

Three layers of defense

  1. sanitize_harmony_name() / sanitize_harmony_recipient() — Pure string functions that strip leaked control tokens. sanitize_harmony_name() finds the earliest control token and returns only the text before it. sanitize_harmony_recipient() extends this to dotted recipients (e.g. browser.search) by splitting on ., sanitizing each part individually, and rejoining — preserving the dotted structure while cleaning each component (e.g. browser<|channel|>.search → browser.search). If any dotted component is entirely consumed by control tokens (e.g. functions.<|constrain|>json), the whole recipient is considered corrupt and an empty string is returned, triggering the safe no-recipient fallback rather than misrouting. Applied at all input parsing, output dispatching, tool routing, and streaming delta extraction sites.

  2. ResilientStreamableParser — Drop-in wrapper around StreamableParser that intercepts three malformed token patterns:

    • Pattern 1: Missing <|start|> recovery — when parser expects <|start|> but gets <|channel|>, inject the missing <|start|> + role tokens. Role is tracked dynamically from self._inner.current_role during processing (not hardcoded), so it works correctly for any role.
    • Pattern 2: Malformed <|constrain|> in headers — skip tokens until <|message|> or <|end|>
    • Pattern 3: Free text between messages — silently discard any token in EXPECT_START state that is not <|start|>. The triggered_tags grammar allows free tokens in the sub-dispatch loop, so the model may generate trailing text after a <|end|> that isn't part of any channel. Discarding these tokens preserves all completed messages while ignoring inter-message garbage.
    • last_consumed_token tracking — tracks which tokens were actually forwarded to the inner parser, so that callers (e.g. StreamingHarmonyContext.append_output()) don't record discarded tokens in last_tok. Without this, render_for_completion() could IndexError searching for a token that was never rendered.
  3. Routing-level fallback — After sanitization, if a recipient becomes empty string, treat it as None so it falls through to _parse_message_no_recipient() (produces a user-visible message instead of a misrouted MCP call).
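
The string-level layer (layer 1) can be sketched as follows. This is a hypothetical reconstruction from the description above — the real implementations live in `vllm/entrypoints/openai/parser/harmony_utils.py` and may differ in detail:

```python
# Sketch of the two string-level sanitizers; names match the PR, bodies are
# reconstructed from the described behavior, not copied from the source.
_CONTROL_TOKENS = ("<|channel|>", "<|constrain|>", "<|start|>", "<|end|>", "<|message|>")


def sanitize_harmony_name(name: str) -> str:
    """Return only the text before the earliest leaked control token."""
    cut = len(name)
    for tok in _CONTROL_TOKENS:
        idx = name.find(tok)
        if idx != -1:
            cut = min(cut, idx)
    return name[:cut].strip()


def sanitize_harmony_recipient(recipient: str) -> str:
    """Sanitize each dotted component; return "" if any component is fully consumed."""
    parts = [sanitize_harmony_name(p) for p in recipient.split(".")]
    if not all(parts):
        return ""  # fully-corrupt component -> safe no-recipient fallback
    return ".".join(parts)


assert sanitize_harmony_name("manage_cart<|channel|>commentary") == "manage_cart"
assert sanitize_harmony_recipient("browser<|channel|>.search") == "browser.search"
assert sanitize_harmony_recipient("functions.<|constrain|>json") == ""
```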

Related Issues & PRs

| # | Title | Status | Relation |
| --- | --- | --- | --- |
| #32587 | Special tokens leak into tool names | Open | Primary bug report for tool name contamination |
| #30372 | Distorted tool names + infinite tool-call loop | Open | Consequence of tool name contamination |
| #23567 | HarmonyError: unexpected tokens in message header | Open | Parser crash from malformed sequences — Pattern 2 + 3 address this |
| #28262 | Incorrect input/output handling in Responses API | Open | Channel metadata loss causing <|constrain|> misrouting |
| #31607 | Add SM 12.1 support + Fix GPT-OSS Harmony garbled reasoning and HarmonyError crashes | Open | Same class of HarmonyError crashes; our ResilientStreamableParser handles the parser-level recovery |
| #31677 | Sanitize malformed tool call recipients (stale PR) | Open | Strips <|channel|> from recipients; our approach is broader (all token types, structured recipients) |
| #32633 | Fix token leaks in tool names and streaming (stale PR) | Open | Defines sanitize + strip functions; our approach unifies string-level + token-level recovery |
| #28303 | Parse gpt-oss refusals w/ non-strict mode (stale PR) | Open | Different approach via openai-harmony library |
| #29236 | Fix gpt oss tool parser v2 (stale PR) | Open | Also addresses tag sanitization |
| #34857 | Responses API & Tool Calling H1 2026 roadmap | Open | Lists "guided decode and structured outputs" as focus area |
| #37433 | [Responses API] tool_choice support for GPT-OSS | Draft (ours) | Downstream dependent — depends on ResilientStreamableParser for reliable parsing of forced tool calls |

Decisions to debate

  1. Wrapper vs. monkey-patch for StreamableParser: We chose a wrapper class (ResilientStreamableParser) that delegates all properties to the inner parser, rather than monkey-patching or subclassing. This means get_streamable_parser_for_assistant() returns our wrapper instead of a raw StreamableParser. All existing consumers work unchanged, but isinstance(parser, StreamableParser) checks would fail — we haven't found any such checks in the codebase, but reviewers should flag if they know of one.

  2. String-level vs. token-level sanitization: sanitize_harmony_name() operates on strings, not token IDs. This is intentional — by the time we have a message.recipient or function_name, it's already a string. Token-level recovery is handled separately by ResilientStreamableParser.process(). The two layers are complementary, not redundant.

  3. Hardcoded token IDs (200003, 200005–200008): The ResilientStreamableParser references specific GPT-OSS encoding token IDs. These are stable across the harmony-gpt-oss encoding but would break if a different encoding were used. We could look these up dynamically from the encoding, but the IDs are well-established constants and dynamic lookup adds complexity for no current benefit.

  4. Sanitization applied broadly (defense in depth): We sanitize at input parsing, output dispatch, tool routing, AND streaming — even though the ResilientStreamableParser should catch most issues at the token level. This is intentional defense-in-depth: if a code path bypasses the resilient parser (e.g. direct Message construction in tests or from previous_input_messages), the string-level sanitization still catches leaked tokens.

  5. Empty-after-sanitization → None fallback: When sanitizing a recipient produces an empty string, we convert it to None rather than raising an error. This causes the message to be treated as a "no-recipient" message (preamble), which is the safest fallback — the user sees the text content rather than getting a routing error. This is a design choice that could mask other bugs; an alternative would be to log a warning.

  6. Structured recipient sanitization (sanitize_harmony_recipient): Rather than applying sanitize_harmony_name() to the full dotted string (which would truncate functions.get_weather<|channel|>commentary to just functions), we split on ., sanitize each part, and rejoin. If any component sanitizes to empty (e.g. functions.<|constrain|>json where the second component is entirely a control token), the whole recipient is treated as corrupt and returns empty string — this prevents partial recipients like "functions" from failing startswith("functions.") checks and falling through to incorrect routing. The downside is slightly more complexity; the upside is that dotted names like browser.search or container.exec survive contamination in any component, while fully-corrupt recipients safely fall back to no-recipient handling.

Files changed

| File | Change |
| --- | --- |
| vllm/entrypoints/openai/parser/harmony_utils.py | Add sanitize_harmony_name(), sanitize_harmony_recipient() (returns empty when any component is fully consumed), ResilientStreamableParser (3 patterns + dynamic role tracking + last_consumed_token property), wrap get_streamable_parser_for_assistant(), sanitize input parsing |
| vllm/entrypoints/openai/responses/harmony.py | Sanitize recipients in output dispatch (harmony_to_response_output) + input parsing (response_input_to_harmony, _parse_chat_format_message) + function name extraction (_parse_function_call) |
| vllm/entrypoints/openai/responses/context.py | Sanitize recipients in tool routing (need_builtin_tool_call, call_tool, call_search_tool, call_container_tool); fix append_output() to only update last_tok from consumed tokens; add bounds check in render_for_completion() to prevent IndexError |
| vllm/entrypoints/openai/chat_completion/stream_harmony.py | Sanitize tool names in streaming delta extraction (extract_harmony_streaming_delta) |
| tests/entrypoints/openai/parser/test_harmony_utils.py | TestSanitizeHarmonyName (7 cases), TestSanitizeHarmonyRecipient (10 cases), TestResilientStreamableParser (7 cases incl. message recipient sanitization + last_consumed_token tracking) |
| tests/entrypoints/openai/responses/test_harmony_utils.py | TestHarmonyOutputSanitization (2 cases: contaminated recipient → message, contaminated function name → cleaned) |

Test plan

  • TestSanitizeHarmonyName — 7 cases: clean passthrough, <|channel|> stripping, <|constrain|> stripping, pure token → empty, multiple tokens → earliest wins, empty input, trailing whitespace
  • TestSanitizeHarmonyRecipient — 10 cases: clean dotted, clean simple, contaminated first part, contaminated second part, pure control token, functions dotted contaminated, empty string, container dotted contaminated, full component contamination returns empty, container full component contamination returns empty
  • TestResilientStreamableParser — 7 cases: normal sequence unchanged, missing <|start|> recovery, <|constrain|> in header skip, message recipients sanitized, last_consumed_token tracks normal processing, Pattern 3 discarded tokens not in last_consumed_token, Pattern 2 skip mode discarded tokens not in last_consumed_token
  • TestHarmonyOutputSanitization — 2 cases: <|constrain|>json recipient → message output, contaminated function name → cleaned
  • All existing parser and responses unit tests pass (no regressions)
  • Integration test with live GPT-OSS model (needs model access)

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a robust, multi-layered defense mechanism to sanitize leaked Harmony protocol control tokens, which is a significant improvement for handling malformed outputs from GPT-OSS models. The introduction of sanitize_harmony_name and the ResilientStreamableParser are well-designed solutions. My review includes a couple of suggestions to further enhance the robustness of the new parser and ensure the completeness of the test suite.

if state == StreamState.EXPECT_START and token_id == _TOK_CHANNEL:
# Inject <|start|> + assistant role token
self._inner.process(_TOK_START)
role_tokens = self._encoding.encode("assistant", allowed_special="all")
Contributor


Severity: high

In ResilientStreamableParser.process, the role "assistant" is hardcoded when injecting a missing <|start|> token. This makes the wrapper less reusable and tightly coupled to being used for the assistant role only. To improve robustness, you should use the role attribute from the wrapped StreamableParser instance (self._inner.role).

Suggested change
role_tokens = self._encoding.encode("assistant", allowed_special="all")
role_tokens = self._encoding.encode(self._inner.role, allowed_special="all")

Author


Good point — now uses self._inner.role instead of the hardcoded string. Fixed in 6ef1b2f.

Comment on lines +959 to +960
assert len(parser.messages) == 2
assert parser.messages[0].content[0].text == "First."
Contributor


Severity: high

The test test_constrain_in_header_skipped verifies that two messages are produced after recovering from a malformed sequence, but it only asserts the content of the first message. To ensure the recovery logic is fully correct and the second message is parsed as expected, you should also add an assertion for the content of the second message.

Suggested change
assert len(parser.messages) == 2
assert parser.messages[0].content[0].text == "First."
assert len(parser.messages) == 2
assert parser.messages[0].content[0].text == "First."
assert parser.messages[1].content[0].text == "Second."

Author


Added the missing assertion for the second message content. Fixed in 6ef1b2f.


mergify bot commented Mar 18, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @will-deines.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Mar 18, 2026
@will-deines force-pushed the harmony-token-sanitization branch from 0968fea to 80c897a on March 18, 2026 at 13:15
will-deines pushed a commit to will-deines/vllm that referenced this pull request Mar 18, 2026
…rmony_name

- ruff: reformat harmony.py (auto-fixed)
- Add sanitize_harmony_name() and _HARMONY_SPECIAL_TOKEN_STRS to
  harmony_utils.py so the import in harmony.py resolves. This duplicates
  the definition that PR vllm-project#35906 (harmony-token-sanitization) adds; the
  duplicate will be removed on rebase once that PR merges.
- Add sanitize_harmony_name to harmony.py import block

Signed-off-by: Will Deines <will@garr.io>
…and recipients

GPT-OSS models generate Harmony protocol control tokens (<|channel|>,
<|constrain|>, <|start|>, <|end|>, <|message|>) in unexpected positions
during output generation, causing tool name contamination, recipient
misrouting, and parser crashes.

Three layers of defense:

1. sanitize_harmony_name() — pure string function that strips leaked
   control token strings from tool/recipient names.

2. ResilientStreamableParser — wrapper around StreamableParser that
   recovers from missing <|start|> tokens between messages and
   malformed <|constrain|> tokens in headers.

3. Routing-level fallback — sanitized-to-empty recipients fall through
   to _parse_message_no_recipient() instead of being misrouted.

Applied at all input parsing, output dispatching, tool routing, and
streaming delta extraction sites.

Signed-off-by: Will Deines <will@garr.io>
…ents, remove redundancy

- Add sanitize_harmony_recipient() that splits on '.', sanitizes each
  part, and rejoins to preserve dotted structure (e.g. browser<|channel|>.search
  becomes browser.search instead of being truncated to browser)
- Sanitize recipients on messages returned by ResilientStreamableParser.messages
  to prevent control token injection in multi-turn conversation history
- Remove redundant sanitization in parser_state_to_response_output since
  ResilientStreamableParser.current_recipient already handles it
- Use sanitize_harmony_recipient for full recipient strings in context.py
  and harmony.py routing logic

Signed-off-by: Will Deines <will@garr.io>
…line

Signed-off-by: Will Deines <will@garr.io>
… test assertion

- Use self._inner.role instead of hardcoded "assistant" in
  ResilientStreamableParser.process for correctness with non-assistant roles
- Add assertion for second message content in test_constrain_in_header_skipped

Signed-off-by: Will Deines <will@garr.io>
The triggered_tags grammar's sub-dispatch loop allows all tokens between
triggered tags. The model can generate trailing text after a <|end|> before
EOS (e.g. restating the answer as plain text after a tool call). These
free-text tokens arrive in EXPECT_START state, causing HarmonyError.

Add Pattern 3 to ResilientStreamableParser: silently discard any token
in EXPECT_START state that is not <|start|>. This preserves all completed
messages while ignoring inter-message garbage tokens.

Signed-off-by: Will Deines <will@garr.io>
@will-deines force-pushed the harmony-token-sanitization branch from 80c897a to 714ad90 on March 18, 2026 at 14:01
@mergify mergify bot removed the needs-rebase label Mar 18, 2026
will-deines and others added 2 commits March 18, 2026 10:58
…ttribute

Track inner parser's current_role during process() calls and use the
cached value for Pattern 1 recovery, fixing the broken self._inner.role
reference. Pattern 1 only fires after <|end|> (EXPECT_START state), so
at least one message has been processed and _last_known_role is
guaranteed non-None.

Signed-off-by: Will Deines <will@garr.io>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b871dfc6e6


Comment on lines +140 to +141
if state == StreamState.EXPECT_START and token_id != _TOK_START:
return


P1: Avoid dropping EXPECT_START garbage without fixing last_tok

When a streaming Harmony response emits stray plain text after a completed message, this branch discards the token so it never becomes part of parser.messages, but StreamingHarmonyContext.append_output() still records that raw token as self.last_tok. On the next built-in-tool turn, vllm/entrypoints/openai/responses/context.py:920-924 walks backward through the re-rendered prompt until it finds last_tok; because the dropped token is no longer present, that loop runs off the front of rendered_tokens and raises IndexError. So the new Pattern 3 recovery path can crash exactly the malformed outputs it is supposed to tolerate.


Author


Good catch — validated and fixed in 8c100e4.

P1 fix: ResilientStreamableParser now tracks _last_consumed_token, updated only when a token is actually forwarded to the inner parser (not on Pattern 2/3 discards). StreamingHarmonyContext.append_output() reads parser.last_consumed_token instead of the raw loop variable. Also added a bounds check in render_for_completion() as a safety net. Three new tests cover normal tracking, Pattern 3 discard, and Pattern 2 skip mode.
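
The shape of the fix can be illustrated with a minimal stand-in (all names here are simplified hypotheticals — the real wrapper delegates to openai-harmony's StreamableParser and discards tokens per Patterns 2/3, not via a fixed set):

```python
# Minimal stand-in: discarded garbage tokens must not update the marker that
# append_output() later records as last_tok.
class FakeInnerParser:
    """Records every token actually forwarded to it."""

    def __init__(self):
        self.tokens = []

    def process(self, tok):
        self.tokens.append(tok)


class ResilientWrapper:
    GARBAGE = {999}  # pretend 999 is free text arriving in EXPECT_START

    def __init__(self):
        self._inner = FakeInnerParser()
        self._last_consumed_token = None

    @property
    def last_consumed_token(self):
        return self._last_consumed_token

    def process(self, tok):
        if tok in self.GARBAGE:
            return  # discarded: marker unchanged, token never reaches inner parser
        self._inner.process(tok)
        self._last_consumed_token = tok


parser = ResilientWrapper()
for tok in (1, 2, 999):
    parser.process(tok)
assert parser.last_consumed_token == 2  # 999 was discarded, not recorded
```

Because `last_consumed_token` only advances on forwarded tokens, a later backward search through re-rendered tokens always finds it, avoiding the IndexError described above.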

Comment on lines +77 to +78
sanitized_parts = [p for p in sanitized_parts if p]
return ".".join(sanitized_parts)


P2: Return empty recipient when a dotted component sanitizes away

If a leaked control token wipes out an entire dotted component, for example functions.<|constrain|>json or container.<|channel|>commentary, filtering empty parts here turns the recipient into functions or container instead of an empty string. Downstream code then treats those as real recipients: harmony_to_response_output() falls through to _parse_mcp_call for functions, while HarmonyContext.need_builtin_tool_call() no longer recognizes container as a tool invocation. That means malformed tool calls are silently misrouted instead of hitting the intended empty-recipient fallback.


Author


Good catch — validated and fixed in 8c100e4.

P2 fix: sanitize_harmony_recipient() now returns "" when any dotted component sanitizes to empty, instead of filtering out empty parts and rejoining. This ensures functions.<|constrain|>json becomes "" (not "functions"), correctly triggering the no-recipient fallback instead of misrouting to _parse_mcp_call(). Two new tests cover functions.<|constrain|>json and container.<|channel|>commentary.

…x recipient misrouting

Bug 1: ResilientStreamableParser.process() silently discards tokens in
Pattern 2 (skip mode) and Pattern 3 (free text in EXPECT_START), but
StreamingHarmonyContext.append_output() unconditionally set last_tok to
the most recent token. If that token was discarded, render_for_completion()
would fail with IndexError searching for it. Now track last_consumed_token
in the parser and only update last_tok when a token was actually forwarded.
Also add a bounds check in render_for_completion() as a safety net.

Bug 2: sanitize_harmony_recipient() filtered out empty parts after
sanitization, collapsing e.g. "functions.<|constrain|>json" to "functions"
(bare), which failed startswith("functions.") checks and fell through to
incorrect routing. Now return empty string when any component sanitizes to
empty, triggering the safe no-recipient fallback.

Signed-off-by: Will Deines <will@garr.io>
…ization

Signed-off-by: Will Deines <will@garr.io>

Labels

frontend gpt-oss Related to GPT-OSS models

Projects

Status: To Triage
