Exclude '*mini' models from prompt_cache_retention #1345
Conversation
…djust tests
- Update model_features.get_features to skip mini variants
- Update tests to piggyback on existing coverage and validate mini exclusion
Co-authored-by: openhands <[email protected]>

…atterns + mini exclusions
- Patterns: ['gpt-5', 'gpt-4.1'] with an inline doc reference to the actual listed models
- Exclude all '*mini' in the feature gate (covers gpt-5-mini, gpt-5.1-mini, codex-mini)
- Extend tests to include an explicit gpt-5.1-mini exclusion
Co-authored-by: openhands <[email protected]>

… docs; keep other minis excluded
- Update feature gate to carve out 'gpt-5.1-codex-mini'
- Update tests to expect retention for 5.1-codex-mini
Co-authored-by: openhands <[email protected]>

…ix E501
- Provide find_models_by_id for tests expecting resolve_model_configs
- Wrap long error message to satisfy Ruff E501
Co-authored-by: openhands <[email protected]>

- Test failure was local-only; CI doesn't run tests/github_workflows in tests.yml
- run-eval workflow uses resolve_model_config.py (singular) directly
Co-authored-by: openhands <[email protected]>
PASS (200) for all documented positives. Negative controls:
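The positive/negative coverage described above might be parametrized along these lines. This is a sketch only: the SDK's actual test module, and the gate function `supports_prompt_cache_retention`, are stand-ins for the real `model_features.get_features` wiring.

```python
import pytest


def supports_prompt_cache_retention(model: str) -> bool:
    """Stand-in for the feature gate under test (hypothetical name)."""
    name = model.split("/")[-1].lower()  # strip any provider prefix
    if name == "gpt-5.1-codex-mini":
        return True  # the one mini variant that supports extended cache
    if name.endswith("mini"):
        return False  # all other '*mini' variants are excluded
    return name.startswith(("gpt-5", "gpt-4.1"))


@pytest.mark.parametrize(
    "model,expected",
    [
        ("gpt-5", True),               # documented positive
        ("gpt-4.1", True),             # documented positive
        ("gpt-5.1-codex-mini", True),  # carved-out mini
        ("gpt-5-mini", False),         # negative control
        ("gpt-5.1-mini", False),       # negative control
        ("codex-mini", False),         # negative control
    ],
)
def test_prompt_cache_retention_gate(model, expected):
    assert supports_prompt_cache_retention(model) is expected
```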
- Define llm_51_codex_mini before use Co-authored-by: openhands <[email protected]>
…t_cache_retention Co-authored-by: openhands <[email protected]>
@xingyaoww Re: the failure in the agent behavior PR: the issue is that most mini models don't support extended cache, while one does (gpt-5.1-codex-mini). I verified a list of models: all those above that should support it do, and I tried a few that don't. I also excluded the unsupported one in the integration tests.
Summary
Context
Evaluation surfaced failures when passing prompt_cache_retention to mini variants (e.g. gpt-5-mini / gpt-5.1-codex-mini), causing litellm BadRequest errors. The intended behavior is to avoid sending prompt_cache_retention for these mini models.
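The gate described in this PR (exclude all '*mini' variants, then carve out gpt-5.1-codex-mini) might look roughly like this. A minimal sketch, assuming a hypothetical helper name; the SDK's real implementation lives in model_features.get_features and may differ in shape.

```python
def supports_prompt_cache_retention(model: str) -> bool:
    """Return True if prompt_cache_retention should be sent for this model.

    Hypothetical helper illustrating the gate; not the SDK's actual API.
    """
    name = model.split("/")[-1].lower()  # strip provider prefix, e.g. "openai/"
    # gpt-5.1-codex-mini is the one mini variant that supports extended cache
    if name == "gpt-5.1-codex-mini":
        return True
    # all other '*mini' variants reject prompt_cache_retention (BadRequest)
    if name.endswith("mini"):
        return False
    # retention applies to the documented model families
    return name.startswith(("gpt-5", "gpt-4.1"))
```

With this gate in place, the call site simply omits the parameter when the check fails, e.g. `kwargs["prompt_cache_retention"] = value` only when `supports_prompt_cache_retention(model)` is true.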
Changes
Validation
Notes
Co-authored-by: openhands <[email protected]>
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- java: eclipse-temurin:17-jdk
- python: nikolaik/python-nodejs:python3.12-nodejs22
- golang: golang:1.21-bookworm

Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:53e95c9-python
All tags pushed for this build
About Multi-Architecture Support
- The default tag (e.g. 53e95c9-python) is a multi-arch manifest supporting both amd64 and arm64
- Architecture-specific tags (e.g. 53e95c9-python-amd64) are also available if needed