refactor: Decouple ModelFacade from LiteLLM via ModelClient adapter#373
refactor: Decouple ModelFacade from LiteLLM via ModelClient adapter#373
Conversation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…provements - Wrap all LiteLLM router calls in try/except to normalize raw exceptions into canonical ProviderError at the bridge boundary (blocking review item) - Extract reusable response-parsing helpers into clients/parsing.py for shared use across future native adapters - Add async image parsing path using httpx.AsyncClient to avoid blocking the event loop in agenerate_image - Add retry_after field to ProviderError for future retry engine support - Fix _to_int_or_none to parse numeric strings from providers - Create test conftest.py with shared mock_router/bridge_client fixtures - Parametrize duplicate image generation and error mapping tests - Add tests for exception wrapping across all bridge methods
…larity - Parse RFC 7231 HTTP-date strings in Retry-After header (used by Azure and Anthropic during rate-limiting) in addition to numeric delay-seconds - Clarify collect_non_none_optional_fields docstring explaining why f.default is None is the correct check for optional field forwarding - Add tests for HTTP-date and garbage Retry-After values
- Fix misleading comment about prompt field defaults in _IMAGE_EXCLUDE - Handle list-format detail arrays in _extract_structured_message for FastAPI/Pydantic validation errors - Document scope boundary for vision content in collect_raw_image_candidates
…el-facade-guts-pr2
…el-facade-guts-pr2
- Replace @DataClass + __post_init__ with explicit __init__ that calls super().__init__ properly, avoiding brittle field-ordering dependency - Store cause via __cause__ only, removing the redundant .cause attr - Update match pattern in handle_llm_exceptions for non-dataclass type - Rename shadowed local `fields` to `optional_fields` in TransportKwargs
Greptile SummaryThis PR cleanly decouples Key changes:
One area to watch:
|
| Filename | Overview |
|---|---|
| packages/data-designer-engine/src/data_designer/engine/models/facade.py | Core refactor: replaces direct router usage with ModelClient injection; introduces _build_chat_completion_request, _build_embedding_request, _build_image_generation_request, unified _track_usage, and new close/aclose lifecycle methods. All canonical type usages are correct. Unknown kwargs to completion are logged at debug level and routed to metadata; embedding and image kwargs are covered by explicitly extracted fields or extra_body. |
| packages/data-designer-engine/src/data_designer/engine/mcp/facade.py | Simplified significantly by adopting canonical types; removes _extract_tool_calls/_normalize_tool_call, adds _execute_tool_calls_from_canonical with correct non-dict guard for json.loads output, and _convert_canonical_tool_calls_to_dicts bridge. The dict contract between this helper and _build_assistant_tool_message is consistent. |
| packages/data-designer-engine/src/data_designer/engine/models/clients/types.py | New TransportKwargs dataclass cleanly centralises extra_body flattening and extra_headers separation. _collect_optional_fields correctly targets f.default is None fields; all request types exclusively use = None defaults for optional fields, so the pattern is sound. Six new fields (stop, seed, response_format, etc.) added to ChatCompletionRequest. |
| packages/data-designer-engine/src/data_designer/engine/models/clients/factory.py | New factory file correctly extracts _get_litellm_deployment + router construction logic from ModelFacade, preserving the "not-used-but-required" API key fallback. Straightforward and well-typed. |
| packages/data-designer-engine/src/data_designer/engine/models/errors.py | Adds ProviderError case to handle_llm_exceptions and new _raise_from_provider_error function typed -> NoReturn. parse_api_error type annotation corrected from InternalServerError to APIError. Clean and complete error mapping. |
| packages/data-designer-engine/src/data_designer/engine/models/clients/errors.py | ProviderError refactored from @dataclass to a proper Exception subclass — correct semantics. The cause field is now self.__cause__ (Python exception chaining). New extract_message_from_exception_string helper gracefully parses human-readable messages from LiteLLM exception strings. |
| packages/data-designer-engine/src/data_designer/engine/models/registry.py | Adds close()/aclose() lifecycle methods that iterate all managed facades. Exception safety concern: if one facade's close raises, subsequent facades are not closed. LiteLLMBridgeClient.close() is currently a no-op so this is low-risk today, but warrants attention for future client implementations. |
| packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/litellm_bridge.py | Switches from collect_non_none_optional_fields to TransportKwargs.from_request across all six operation methods; correctly uses transport.headers or None to pass None instead of {} to LiteLLM. Exception message parsing improved via extract_message_from_exception_string. |
| packages/data-designer-engine/tests/engine/models/clients/test_parsing.py | New test file comprehensively covers TransportKwargs.from_request and extract_tool_calls. Parametrized edge cases (None/empty extra_body, missing tool IDs, None arguments) are all well-handled. |
| packages/data-designer-engine/tests/engine/models/test_facade.py | Tests updated to mock ModelClient instead of CustomRouter, use canonical types throughout, and remove LiteLLM-specific response fixtures. Coverage is maintained; signal-to-noise ratio improved. |
Sequence Diagram
sequenceDiagram
participant Caller
participant ModelFacade
participant ModelClient
participant TransportKwargs
participant LiteLLMBridgeClient
participant CustomRouter
Caller->>ModelFacade: generate(prompt, **kwargs)
ModelFacade->>ModelFacade: consolidate_kwargs(**kwargs)
ModelFacade->>ModelFacade: _build_chat_completion_request(messages, kwargs)
note right of ModelFacade: known fields → ChatCompletionRequest<br/>unknown fields → metadata (debug log)
ModelFacade->>ModelClient: completion(ChatCompletionRequest)
ModelClient->>TransportKwargs: from_request(request)
note right of TransportKwargs: extra_body flattened into body<br/>extra_headers separated
ModelClient->>LiteLLMBridgeClient: router.completion(model, messages, extra_headers, **body)
LiteLLMBridgeClient->>CustomRouter: completion(...)
CustomRouter-->>LiteLLMBridgeClient: litellm.ModelResponse
LiteLLMBridgeClient->>LiteLLMBridgeClient: parse_chat_completion_response(response)
LiteLLMBridgeClient-->>ModelClient: ChatCompletionResponse
ModelClient-->>ModelFacade: ChatCompletionResponse
ModelFacade->>ModelFacade: _track_usage(response.usage)
ModelFacade-->>Caller: (parsed_output, messages)
Last reviewed commit: a084038
packages/data-designer-engine/src/data_designer/engine/mcp/facade.py
Outdated
Show resolved
Hide resolved
packages/data-designer-engine/src/data_designer/engine/models/errors.py
Outdated
Show resolved
Hide resolved
packages/data-designer-engine/src/data_designer/engine/models/facade.py
Outdated
Show resolved
Hide resolved
packages/data-designer-engine/src/data_designer/engine/models/errors.py
Outdated
Show resolved
Hide resolved
| def _set_model_configs(self, model_configs: list[ModelConfig] | None) -> None: | ||
| self._model_configs = {mc.alias: mc for mc in (model_configs or [])} | ||
|
|
||
| def close(self) -> None: | ||
| """Release resources held by all model facades.""" | ||
| for facade in self._models.values(): | ||
| facade.close() | ||
|
|
||
| async def aclose(self) -> None: |
There was a problem hiding this comment.
close()/aclose() silently skip remaining facades on error
If any facade's close() (or aclose()) raises an exception, the loop terminates early and all subsequent facades are left unclosed — a resource leak at cleanup time. While LiteLLMBridgeClient.close() is currently a no-op, future client implementations may hold real connections (HTTP pools, sockets, etc.).
Consider using a collect-and-reraise pattern so every facade is given a chance to close:
def close(self) -> None:
"""Release resources held by all model facades."""
errors = []
for facade in self._models.values():
try:
facade.close()
except Exception as e:
errors.append(e)
if errors:
raise ExceptionGroup("Errors during ModelRegistry.close()", errors)The same pattern should be applied to aclose().
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/data-designer-engine/src/data_designer/engine/models/registry.py
Line: 203-211
Comment:
**`close()`/`aclose()` silently skip remaining facades on error**
If any facade's `close()` (or `aclose()`) raises an exception, the loop terminates early and all subsequent facades are left unclosed — a resource leak at cleanup time. While `LiteLLMBridgeClient.close()` is currently a no-op, future client implementations may hold real connections (HTTP pools, sockets, etc.).
Consider using a collect-and-reraise pattern so every facade is given a chance to close:
```python
def close(self) -> None:
"""Release resources held by all model facades."""
errors = []
for facade in self._models.values():
try:
facade.close()
except Exception as e:
errors.append(e)
if errors:
raise ExceptionGroup("Errors during ModelRegistry.close()", errors)
```
The same pattern should be applied to `aclose()`.
How can I resolve this? If you propose a fix, please make it concise.
📋 Summary
Decouples
ModelFacadefrom direct LiteLLM router usage by introducing aModelClientadapter layer. The facade now operates entirely on canonical request/response types (ChatCompletionRequest,ChatCompletionResponse, etc.) instead of raw LiteLLM objects, making it testable without LiteLLM and preparing for future client backends.This is PR 2 of the model facade overhaul series, building on the canonical types and
LiteLLMBridgeClientintroduced in PR 1.🔄 Changes
✨ Added
clients/factory.py—create_model_client()factory that handles provider resolution, API key setup, and LiteLLM router constructionTransportKwargs— Unified transport preparation that flattensextra_bodyinto top-level kwargs and separatesextra_headers_raise_from_provider_error()inerrors.py— Maps canonicalProviderErrortoDataDesignerErrorsubclassesextract_message_from_exception_string()for parsing human-readable messages from stringified LiteLLM exceptionsmake_stub_completion_response()test helper for creating canonical test fixturesclose()/aclose()lifecycle methods onModelFacadeandModelRegistrytest_parsing.pyforTransportKwargsbehavior🔧 Changed
ModelFacade— Now accepts aModelClientvia constructor injection instead of creating its ownCustomRouter. All methods use canonical types (ChatCompletionRequest/Response,EmbeddingRequest/Response,ImageGenerationRequest/Response)MCPFacade— Operates on canonicalChatCompletionResponseandToolCalltypes instead of raw LiteLLM response objects; removed internal tool call normalization (_extract_tool_calls,_normalize_tool_call) since parsing now happens in the client layerLiteLLMBridgeClient— UsesTransportKwargs.from_request()instead ofcollect_non_none_optional_fields()for cleaner request forwardingProviderError— Refactored from@dataclassto regularExceptionsubclass for proper exception semantics_track_usage()operating on canonicalUsagetypeStubResponse/StubMessage/FakeResponse/FakeMessage; tests mockModelClientinstead ofCustomRoutermodel_facade_factory— Now creates aModelClientfirst, then injects it intoModelFacade🗑️ Removed
_try_extract_base64()and direct image parsing fromModelFacade(moved to client layer in PR1)_get_litellm_deployment()fromModelFacade(moved tocreate_model_client())collect_non_none_optional_fields()fromparsing.py(replaced byTransportKwargs)_track_token_usage_from_completion,_track_token_usage_from_embedding,_track_token_usage_from_image_diffusion) replaced by unified_track_usage()StubResponse/FakeResponseusage in tests (replaced by canonical types)🔍 Attention Areas
facade.py— Core refactor: constructor signature change (clientreplacessecret_resolver+ internal router), all methods now use canonical typeserrors.py— New_raise_from_provider_error()mapping function andProviderError→Exceptionrefactortypes.py—TransportKwargsdesign: flatteningextra_bodyvs keepingextra_headersseparatemcp/facade.py— Significant simplification from canonical types; verify the_convert_canonical_tool_calls_to_dictsbridge is correct🤖 Generated with Claude Code