@abinashkarki

  • Handle None model name in sub-call logging (verbose.py)
  • Accept both repl and python code block tags (parsing.py)
  • Handle special tokens around FINAL statements (parsing.py)

These changes improve compatibility with various LLM backends that use different output formats (e.g., Gemini, some OpenRouter models).

@alexzhang13
Owner

I don't enable python tags because 1) it affects the prompting and 2) I want repl blocks to stay distinct from cases where the model outputs Python code for some other reason, e.g. in a task that requires outputting Python code.
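
To illustrate the distinction described above, here is a minimal sketch of an extractor that treats only `repl`-tagged blocks as executable and leaves `python`-tagged blocks alone as ordinary model output. The names `REPL_BLOCK_RE` and `extract_repl_blocks` are hypothetical and are not taken from the repository's parsing.py; the actual implementation may differ.

```python
import re

# The fence marker is built programmatically so this snippet is easy to embed.
FENCE = "`" * 3

# Hypothetical pattern: match only blocks whose opening fence is tagged "repl".
# Blocks tagged "python" are deliberately not matched.
REPL_BLOCK_RE = re.compile(
    re.escape(FENCE) + r"repl[ \t]*\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def extract_repl_blocks(text):
    """Return the bodies of all repl-tagged fenced blocks in text."""
    return [body.strip() for body in REPL_BLOCK_RE.findall(text)]


if __name__ == "__main__":
    reply = (
        FENCE + "repl\nx = 1\n" + FENCE
        + "\nHere is the requested script:\n"
        + FENCE + "python\nprint('hi')\n" + FENCE
    )
    print(extract_repl_blocks(reply))
```

With this separation, a task whose answer is itself Python code can include a `python` block without that block being executed.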

The None call was handled in a separate PR, thanks though!

I can merge this if you revert the two changes above; "Handle special tokens around FINAL statements (parsing.py)" is fine, though I will think about whether it's necessary. Thanks!

Improves compatibility with models that wrap FINAL/FINAL_VAR in special tokens like <|begin_of_box|>FINAL(...)<|end_of_box|>.
@abinashkarki force-pushed the fix/model-compatibility branch from 6b1416a to 6a38945 on January 5, 2026 03:36
@abinashkarki
Author

Thanks for the feedback! I've reverted the python code block tag support and the None model handling (didn't realize it was already fixed in #5).

The updated PR now only includes the special token handling for FINAL statements, which helps with models that wrap output in tokens like <|begin_of_box|>FINAL(...)<|end_of_box|>.

@alexzhang13
Owner

@abinashkarki Returning to this, the only problem now is if the model is thinking / doing some kind of CoT and is saying "I will then do FINAL(...)...". We don't want to accept these cases -- I think we should just be stricter about what exactly we accept, e.g. maybe just some trivial special tokens in case this is something people are observing.

- Only accept FINAL at start of line (with optional whitespace)
- Or immediately after special tokens like <|begin_of_box|>
- Reject mid-sentence usage like 'I will then do FINAL(...)'
- Preserve compatibility with GLM-4, which wraps FINAL in special tokens
@abinashkarki
Author

Thanks for the feedback, @alexzhang13. I've updated the regex to be stricter.

Why this matters: zai-org/glm-4.6v-flash wraps FINAL in special tokens like <|begin_of_box|>FINAL(4)<|end_of_box|>. Without handling this, benchmarks fail.

What changed: FINAL is now only accepted at:

  • Start of line (with optional whitespace)
  • After special tokens <|...|>

This rejects CoT false positives like "I will do FINAL(42)" while still accepting valid cases.
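
As a rough illustration of the two acceptance rules above, here is a minimal sketch of such a pattern. It is not the regex from this PR: `FINAL_RE` and `find_final` are hypothetical names, and the sketch assumes special tokens always have the form `<|...|>`.

```python
import re

# Hypothetical pattern (not the PR's actual regex). FINAL or FINAL_VAR is
# accepted only when it
#   1) starts a line, with optional leading spaces or tabs, or
#   2) directly follows a special token of the form <|...|>.
# The lazy (.*?) stops at the first ")", so nested parentheses in the
# argument are not handled by this sketch.
FINAL_RE = re.compile(
    r"(?:^[ \t]*|<\|[^|>]*\|>[ \t]*)(FINAL_VAR|FINAL)\((.*?)\)",
    re.MULTILINE,
)


def find_final(text):
    """Return (keyword, argument) for the first accepted FINAL call, else None."""
    match = FINAL_RE.search(text)
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    samples = [
        "<|begin_of_box|>FINAL(4)<|end_of_box|>",  # accepted: after special token
        "  FINAL(answer)",                         # accepted: start of line
        "I will do FINAL(42)",                     # rejected: mid-sentence
    ]
    for sample in samples:
        print(repr(sample), "->", find_final(sample))
```

The `re.MULTILINE` flag lets `^` match at the start of every line, so a FINAL call on its own line after several lines of reasoning is still accepted.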
