feat(endpoints): Add OpenAI Responses API endpoint with fixes and integration tests#43

Open
acere wants to merge 2 commits into awslabs:main from acere:ResponseAPI
Conversation


@acere acere commented Mar 25, 2026

Summary

Adds OpenAI Responses API endpoint support to LLMeter, with fixes to align the implementation with the API's actual behavior.

Changes

Endpoint fixes (llmeter/endpoints/openai_response.py)

  • Rename max_tokens to max_output_tokens in create_payload (Response API parameter name)
  • Fix _parse_response to handle usage=None (Bedrock Mantle doesn't always return it) and use input_tokens/output_tokens with fallback to prompt_tokens/completion_tokens
  • Rewrite _parse_stream_response to process typed events (response.output_text.delta, response.completed) instead of the old chunk-with-output-array format
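The two parsing fixes above can be sketched roughly as follows. This is an illustrative standalone sketch, not LLMeter's actual `_parse_response`/`_parse_stream_response` code: the helper names and the plain-dict event shape are assumptions for demonstration, and real Responses API stream events are typed objects rather than dicts.

```python
from types import SimpleNamespace


def parse_usage(usage):
    """Return (input_tokens, output_tokens), tolerating usage=None
    (Bedrock Mantle doesn't always return it) and falling back to the
    legacy prompt_tokens/completion_tokens field names."""
    if usage is None:
        return None, None
    input_tokens = getattr(usage, "input_tokens", None)
    if input_tokens is None:
        input_tokens = getattr(usage, "prompt_tokens", None)
    output_tokens = getattr(usage, "output_tokens", None)
    if output_tokens is None:
        output_tokens = getattr(usage, "completion_tokens", None)
    return input_tokens, output_tokens


def parse_stream(events):
    """Accumulate text from typed Responses API stream events instead of
    the old chunk-with-output-array format. Events here are modeled as
    dicts with a "type" key for simplicity."""
    chunks = []
    final_response = None
    for event in events:
        if event["type"] == "response.output_text.delta":
            chunks.append(event["delta"])  # incremental text fragment
        elif event["type"] == "response.completed":
            final_response = event["response"]  # full terminal response
    return "".join(chunks), final_response


# Demo: modern field names take priority; legacy names are the fallback.
modern = SimpleNamespace(input_tokens=3, output_tokens=5)
legacy = SimpleNamespace(prompt_tokens=2, completion_tokens=4)
```

The key behavioral points are that a missing `usage` no longer raises, and that streamed text is assembled only from `response.output_text.delta` events, with `response.completed` supplying the final response object.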

Integration tests

  • Add tests/integ/test_response_endpoint.py — integration tests for ResponseEndpoint and ResponseStreamEndpoint wrappers against Bedrock Mantle
  • Fix tests/integ/test_response_bedrock.py to use ResponseUsage attribute names (input_tokens/output_tokens)

Unit test updates

  • Update all unit test mocks across 5 test files to use spec-based usage mocks (input_tokens/output_tokens) and event-based streaming mocks
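A spec-based usage mock of the kind described above might look like the sketch below. `ResponseUsage` here is a stand-in dataclass, not the real OpenAI type, and the attribute values are illustrative; the point is that `spec=` makes the mock reject attribute reads that don't exist on the real class, so stale `prompt_tokens` accesses fail loudly in tests.

```python
from dataclasses import dataclass
from unittest.mock import MagicMock


@dataclass
class ResponseUsage:  # stand-in for the real Responses API usage type
    input_tokens: int = 0
    output_tokens: int = 0


# spec= constrains attribute *reads* to names defined on ResponseUsage,
# so e.g. mock_usage.prompt_tokens raises AttributeError.
mock_usage = MagicMock(spec=ResponseUsage)
mock_usage.input_tokens = 12
mock_usage.output_tokens = 34
```

Using `spec` (rather than a bare `MagicMock`) is what catches tests that silently read the old `prompt_tokens`/`completion_tokens` names after the rename.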

Example notebook

  • Add examples/LLMeter with OpenAI Response API on Bedrock.ipynb demonstrating non-streaming and streaming usage with Runner and plotting

Testing

  • All 527 unit tests pass
  • Ruff lint clean

acere added 2 commits March 24, 2026 21:41
… test suite

- Add ResponseEndpoint and ResponseStreamEndpoint classes for OpenAI Responses API support
- Implement non-streaming and streaming response handling with proper error management
- Add structured output support with response format validation and serialization
- Create comprehensive unit test suite covering response parsing, error handling, format validation, model parameters, payload parsing, properties, and serialization
- Add integration tests for Bedrock response endpoint functionality
- Export new response endpoint classes from endpoints module
- Update integration test configuration with response endpoint fixtures
- Rename max_tokens to max_output_tokens in create_payload (Response API
  parameter name)
- Fix _parse_response to handle usage=None (Bedrock Mantle) and use
  input_tokens/output_tokens with fallback to prompt_tokens/completion_tokens
- Rewrite _parse_stream_response to process typed events
  (response.output_text.delta, response.completed) instead of the old
  chunk-with-output-array format
- Fix test_response_bedrock.py to use ResponseUsage attribute names
  (input_tokens/output_tokens)
- Add integration tests for ResponseEndpoint and ResponseStreamEndpoint
- Add example notebook for Response API on Bedrock
- Update all unit test mocks to match new behavior
@acere acere requested a review from athewsey March 25, 2026 01:50
@acere acere self-assigned this Mar 25, 2026