@yashwantbezawada

Summary

Fixes #9032 - Updates the reasoning model regex pattern to prevent gpt-5-chat from being incorrectly classified as a reasoning model.

Problem

The Azure gpt-5-chat model was misclassified as a reasoning model, so calls failed validation rules that are meant only for reasoning models:

  • ValueError: reasoning models require passing temperature=1.0 and max_tokens >= 16000
  • Even when the caller complies, the Azure API rejects the reasoning parameter with a BadRequestError

Root Cause: The regex pattern lacked an end anchor ($), so it matched any model name that merely begins with a reasoning-model name. gpt-5-chat begins with gpt-5, so it was classified as a reasoning model.
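
The prefix-matching behaviour can be reproduced in isolation. The sketch below uses only Python's standard `re` module and the "Before" pattern quoted in this PR; the variable names are illustrative and not taken from the codebase.

```python
import re

# The pre-fix pattern, copied from the "Before" snippet in this PR.
OLD_PATTERN = re.compile(r"^(?:o[1345]|gpt-5)(?:-(?:mini|nano))?")

m = OLD_PATTERN.match("gpt-5-chat")

# The match succeeds, but it only consumes the leading "gpt-5";
# the trailing "-chat" is never examined because nothing anchors the end.
matched_prefix = m.group(0) if m else None
print(matched_prefix)
```

Because the optional variant group may match nothing, any name with a recognized base as its prefix is accepted.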

Changes

Regex Pattern Update

Before:

r"^(?:o[1345]|gpt-5)(?:-(?:mini|nano))?"

After:

r"^(?:o[1345](?:-(?:mini|nano))?(?:-\d{4}-\d{2}-\d{2})?|gpt-5(?:-(?:mini|nano|pro))?)$"

Improvements:

  • ✅ Added $ anchor for exact matching (prevents gpt-5-chat from matching)
  • ✅ Added pro variant support (gpt-5-pro now recognized as reasoning model)
  • ✅ Explicit date suffix support for o-series models (e.g. o1-2023-01-01)
  • ✅ Separate alternation branches for o-series and gpt-5 models, so the date suffix applies only to the o-series branch
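
For readers who find the one-line pattern hard to scan, here is an annotated equivalent written with `re.VERBOSE`. This is only a reading aid, not the form used in the PR; the names `NEW_PATTERN` and `NEW_PATTERN_VERBOSE` are assumptions made for this sketch.

```python
import re

# The "After" pattern, copied from this PR.
NEW_PATTERN = re.compile(
    r"^(?:o[1345](?:-(?:mini|nano))?(?:-\d{4}-\d{2}-\d{2})?|gpt-5(?:-(?:mini|nano|pro))?)$"
)

# The same pattern spread over several lines; whitespace and comments
# are ignored by the regex engine in VERBOSE mode.
NEW_PATTERN_VERBOSE = re.compile(
    r"""
    ^(?:
        o[1345]                      # o-series base: o1, o3, o4, o5
        (?:-(?:mini|nano))?          # optional size variant
        (?:-\d{4}-\d{2}-\d{2})?      # optional date suffix, e.g. -2023-01-01
      |
        gpt-5                        # gpt-5 base
        (?:-(?:mini|nano|pro))?      # optional variant; 'chat' is not listed
    )$
    """,
    re.VERBOSE,
)

# Spot-check that both forms agree on a few names from this PR.
samples = ["o1", "o3-mini-2023-01-01", "gpt-5-pro", "gpt-5-chat", "gpt-4o"]
agree = all(
    bool(NEW_PATTERN.match(s)) == bool(NEW_PATTERN_VERBOSE.match(s)) for s in samples
)
print(agree)
```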

Test Coverage

Added test cases to verify:

  • gpt-5-pro correctly identified as reasoning model
  • gpt-5-chat correctly identified as NON-reasoning model
  • azure/gpt-5-chat correctly identified as NON-reasoning model
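
The three cases above might look roughly like the pytest-style sketch below. The helper `is_reasoning_model` is hypothetical, and it assumes the classifier looks only at the last `/`-separated segment of the model name (so `azure/gpt-5-chat` is checked as `gpt-5-chat`); the actual test file and helper in the repository may differ.

```python
import re

# The "After" pattern from this PR.
REASONING_PATTERN = re.compile(
    r"^(?:o[1345](?:-(?:mini|nano))?(?:-\d{4}-\d{2}-\d{2})?|gpt-5(?:-(?:mini|nano|pro))?)$"
)


def is_reasoning_model(model: str) -> bool:
    """Hypothetical helper: classify by the final path segment of the name."""
    # Assumption: a provider prefix such as "azure/" is dropped before matching.
    family = model.split("/")[-1].lower()
    return REASONING_PATTERN.match(family) is not None


def test_gpt_5_pro_is_reasoning():
    assert is_reasoning_model("gpt-5-pro")


def test_gpt_5_chat_is_not_reasoning():
    assert not is_reasoning_model("gpt-5-chat")


def test_azure_gpt_5_chat_is_not_reasoning():
    assert not is_reasoning_model("azure/gpt-5-chat")
```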

Verification

Tested the regex pattern against the following model variants:

Reasoning Models (should match):

  • o1, o1-mini, o1-nano, o1-2023-01-01, o1-mini-2023-01-01
  • o3, o3-mini, o3-mini-2023-01-01
  • o4, o5
  • gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-pro

Non-Reasoning Models (should NOT match):

  • gpt-5-chat
  • gpt-4, gpt-4o
  • o2, o6
  • Other models

All tests pass correctly.
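
The lists above can be re-checked with a short script. This is a sketch using only the standard library and the model names listed in this PR; "Other models" is left out because no concrete names are given.

```python
import re

PATTERN = re.compile(
    r"^(?:o[1345](?:-(?:mini|nano))?(?:-\d{4}-\d{2}-\d{2})?|gpt-5(?:-(?:mini|nano|pro))?)$"
)

should_match = [
    "o1", "o1-mini", "o1-nano", "o1-2023-01-01", "o1-mini-2023-01-01",
    "o3", "o3-mini", "o3-mini-2023-01-01",
    "o4", "o5",
    "gpt-5", "gpt-5-mini", "gpt-5-nano", "gpt-5-pro",
]
should_not_match = ["gpt-5-chat", "gpt-4", "gpt-4o", "o2", "o6"]

# Collect every name whose classification disagrees with the expectation.
failures = [n for n in should_match if PATTERN.match(n) is None]
failures += [n for n in should_not_match if PATTERN.match(n) is not None]

print(failures)  # an empty list means every listed name behaves as described
```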

Impact

Users can now use gpt-5-chat with standard conversational parameters (temperature < 1.0, max_tokens < 16000) without triggering incorrect reasoning model validation.
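
To illustrate the effect on callers, the sketch below pairs the new pattern with a hypothetical validation function. `check_params` is not the library's actual code; it only mirrors the rule stated in the error message quoted in the Problem section.

```python
import re

PATTERN = re.compile(
    r"^(?:o[1345](?:-(?:mini|nano))?(?:-\d{4}-\d{2}-\d{2})?|gpt-5(?:-(?:mini|nano|pro))?)$"
)


def check_params(model: str, temperature: float, max_tokens: int) -> None:
    """Hypothetical validation: enforce reasoning-model limits only when they apply."""
    # Assumption: only the last "/"-separated segment identifies the model family.
    family = model.split("/")[-1].lower()
    is_reasoning = PATTERN.match(family) is not None
    if is_reasoning and (temperature != 1.0 or max_tokens < 16000):
        raise ValueError(
            "reasoning models require passing temperature=1.0 and max_tokens >= 16000"
        )


# gpt-5-chat is no longer treated as a reasoning model, so this does not raise.
check_params("azure/gpt-5-chat", temperature=0.7, max_tokens=4000)

# gpt-5 is still a reasoning model, so the same settings are rejected.
try:
    check_params("gpt-5", temperature=0.7, max_tokens=4000)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```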

Updated the reasoning model regex pattern to use exact matching with
end anchor ($) to prevent models like gpt-5-chat from being incorrectly
classified as reasoning models.

Changes:
- Added $ anchor to prevent prefix matching
- Added support for gpt-5-pro as a valid reasoning model variant
- Added explicit date suffix support for o-series models (o1-2023-01-01)
- Added test cases for gpt-5-pro (should match) and gpt-5-chat (should not match)

Previous regex allowed any model starting with "gpt-5-" to match,
causing gpt-5-chat to be misclassified and trigger inappropriate
temperature=1.0 and max_tokens>=16000 requirements.

Fixes stanfordnlp#9032
@chenmoneygithub
Collaborator

@yashwantbezawada Thanks for the PR!

Looks like it's already closed by #9033.

Linked issue: [Bug] Azure gpt-5-chat incorrectly classified as reasoning model