@enyst enyst commented Dec 6, 2025

Summary

  • Exclude mini variants from prompt_cache_retention support in model features
  • Piggyback on existing tests to validate that mini variants are excluded

Context
Evaluation surfaced failures from passing prompt_cache_retention to mini variants (e.g. gpt-5-mini / gpt-5.1-codex-mini), which caused litellm BadRequest errors. The intended behavior is to avoid sending prompt_cache_retention for these mini models.

Changes

  • openhands-sdk/openhands/sdk/llm/utils/model_features.py
    • supports_prompt_cache_retention now requires the model to match GPT-5/GPT-4.1 patterns AND not contain "mini" (see the sketch after this list).
  • tests/sdk/llm/test_responses_parsing_and_kwargs.py
    • Updated test_chat_and_responses_options_prompt_cache_retention_gpt_5_plus_and_non_gpt to assert no prompt_cache_retention for mini variants.
  • tests/sdk/llm/test_model_features.py
    • Updated test_prompt_cache_retention_support expectations for mini variants to False.
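
A minimal sketch of the gate described in the first item (the function name, signature, and normalization are assumed from the file paths above; the real implementation may differ):

def supports_prompt_cache_retention(model: str) -> bool:
    # Strip provider prefixes such as "openai/" before matching
    name = model.lower().split("/")[-1]
    # Only the GPT-5* / GPT-4.1* families are eligible for extended retention
    if not (name.startswith("gpt-5") or name.startswith("gpt-4.1")):
        return False
    # Mini variants (gpt-5-mini, gpt-5.1-codex-mini, ...) reject the parameter
    return "mini" not in name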

Validation

  • Ran pre-commit on changed files: all hooks passed.
  • Ran targeted tests for the modified areas: passing.
    • tests/sdk/llm/test_responses_parsing_and_kwargs.py::test_chat_and_responses_options_prompt_cache_retention_gpt_5_plus_and_non_gpt
    • tests/sdk/llm/test_model_features.py::test_prompt_cache_retention_support

Notes

  • Full test suite has an unrelated import error in tests/github_workflows/test_resolve_model_config.py due to missing helper module; unrelated to this change.

Co-authored-by: openhands [email protected]



Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant   Architectures   Base Image                                    Docs / Tags
java      amd64, arm64    eclipse-temurin:17-jdk                        Link
python    amd64, arm64    nikolaik/python-nodejs:python3.12-nodejs22    Link
golang    amd64, arm64    golang:1.21-bookworm                          Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:53e95c9-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-53e95c9-python \
  ghcr.io/openhands/agent-server:53e95c9-python

All tags pushed for this build

ghcr.io/openhands/agent-server:53e95c9-golang-amd64
ghcr.io/openhands/agent-server:53e95c9-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:53e95c9-golang-arm64
ghcr.io/openhands/agent-server:53e95c9-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:53e95c9-java-amd64
ghcr.io/openhands/agent-server:53e95c9-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:53e95c9-java-arm64
ghcr.io/openhands/agent-server:53e95c9-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:53e95c9-python-amd64
ghcr.io/openhands/agent-server:53e95c9-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:53e95c9-python-arm64
ghcr.io/openhands/agent-server:53e95c9-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:53e95c9-golang
ghcr.io/openhands/agent-server:53e95c9-java
ghcr.io/openhands/agent-server:53e95c9-python

About Multi-Architecture Support

  • Each variant tag (e.g., 53e95c9-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 53e95c9-python-amd64) are also available if needed (see the commands below)
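
To confirm what a tag contains, or to pull one architecture explicitly, standard Docker commands work against this build's tags (shown here for the python variant):

# Inspect the multi-arch manifest and its amd64/arm64 entries
docker manifest inspect ghcr.io/openhands/agent-server:53e95c9-python

# Force a specific architecture instead of relying on auto-selection
docker pull --platform linux/arm64 ghcr.io/openhands/agent-server:53e95c9-python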

…djust tests

- Update model_features.get_features to skip mini variants
- Update tests to piggyback existing coverage and validate mini excluded

Co-authored-by: openhands <[email protected]>

github-actions bot commented Dec 6, 2025

Coverage

Coverage Report

File                                                      Stmts   Miss  Cover  Missing
openhands-sdk/openhands/sdk/llm/utils/model_features.py     35      1    97%  161
TOTAL                                                     12426   5653    54%

enyst and others added 2 commits December 6, 2025 15:04
…atterns + mini exclusions

- Patterns: ['gpt-5', 'gpt-4.1'] with inline doc reference of actual listed models
- Exclude all '*mini' in feature gate (covers gpt-5-mini, gpt-5.1-mini, codex-mini)
- Extend tests to include explicit gpt-5.1-mini exclusion

Co-authored-by: openhands <[email protected]>
… docs; keep other minis excluded

- Update feature gate to carve out 'gpt-5.1-codex-mini'
- Update tests to expect retention for 5.1-codex-mini

Co-authored-by: openhands <[email protected]>
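
A hedged sketch of how this carve-out could look, continuing the gate sketch from the PR description (the exact condition and its placement are assumed):

def supports_prompt_cache_retention(model: str) -> bool:
    name = model.lower().split("/")[-1]
    if not (name.startswith("gpt-5") or name.startswith("gpt-4.1")):
        return False
    # gpt-5.1-codex-mini is the one mini variant that accepts the parameter
    if "gpt-5.1-codex-mini" in name:
        return True
    # All other mini variants remain excluded
    return "mini" not in name
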
enyst and others added 3 commits December 6, 2025 16:13
…ix E501

- Provide find_models_by_id for tests expecting resolve_model_configs
- Wrap long error message to satisfy Ruff E501

Co-authored-by: openhands <[email protected]>
- Test failure was local-only; CI doesn’t run tests/github_workflows in tests.yml
- run-eval workflow uses resolve_model_config.py (singular) directly

Co-authored-by: openhands <[email protected]>

enyst commented Dec 6, 2025

PASS (200) for all documented positives:
openai/gpt-5.1
openai/gpt-5.1-codex
openai/gpt-5.1-codex-mini
openai/gpt-5.1-chat-latest
openai/gpt-5
openai/gpt-5-codex
openai/gpt-4.1

Negative controls:
openai/gpt-5-mini: UNEXPECTED-PASS (200), but the response shows prompt_cache_retention: null and status=incomplete with incomplete_details.reason = max_output_tokens. So the parameter did not cause an error, but the model produced no output and retention was not applied.
openai/gpt-5.1-mini: 400 model_not_found (as before)

- Define llm_51_codex_mini before use

Co-authored-by: openhands <[email protected]>
@OpenHands OpenHands deleted a comment from openhands-ai bot Dec 6, 2025
@enyst enyst marked this pull request as ready for review December 6, 2025 16:30
@enyst enyst requested a review from xingyaoww December 6, 2025 16:30

enyst commented Dec 6, 2025

@xingyaoww Re: the failure in the agent behavior PR: it looks like the issue is that some mini models don't support extended cache retention, while one does (gpt-5.1-codex-mini).

I verified a list of models: all of those above that should support it, plus a few that don't; the unsupported one used in integration tests is now excluded as well.

Behavior tests

@enyst enyst changed the title Exclude '*mini' models from prompt_cache_retention support and adjust tests Exclude '*mini' models from prompt_cache_retention Dec 6, 2025