refactor: Decouple ModelFacade from LiteLLM via ModelClient adapter by nabinchha · Pull Request #373 · NVIDIA-NeMo/DataDesigner

nabinchha · 2026-03-06T00:01:58Z

📋 Summary

Decouples ModelFacade from direct LiteLLM router usage by introducing a ModelClient adapter layer. The facade now operates entirely on canonical request/response types (ChatCompletionRequest, ChatCompletionResponse, etc.) instead of raw LiteLLM objects, making it testable without LiteLLM and preparing for future client backends.

This is PR 2 of the model facade overhaul series, building on the canonical types and LiteLLMBridgeClient introduced in PR 1.

🔄 Changes

✨ Added

clients/factory.py — create_model_client() factory that handles provider resolution, API key setup, and LiteLLM router construction
TransportKwargs — Unified transport preparation that flattens extra_body into top-level kwargs and separates extra_headers
_raise_from_provider_error() in errors.py — Maps canonical ProviderError to DataDesignerError subclasses
extract_message_from_exception_string() for parsing human-readable messages from stringified LiteLLM exceptions
make_stub_completion_response() test helper for creating canonical test fixtures
close()/aclose() lifecycle methods on ModelFacade and ModelRegistry
New test file test_parsing.py for TransportKwargs behavior

🔧 Changed

ModelFacade — Now accepts a ModelClient via constructor injection instead of creating its own CustomRouter. All methods use canonical types (ChatCompletionRequest/Response, EmbeddingRequest/Response, ImageGenerationRequest/Response)
MCPFacade — Operates on canonical ChatCompletionResponse and ToolCall types instead of raw LiteLLM response objects; removed internal tool call normalization (_extract_tool_calls, _normalize_tool_call) since parsing now happens in the client layer
LiteLLMBridgeClient — Uses TransportKwargs.from_request() instead of collect_non_none_optional_fields() for cleaner request forwarding
ProviderError — Refactored from @dataclass to regular Exception subclass for proper exception semantics
Usage tracking — Consolidated three separate tracking methods into single _track_usage() operating on canonical Usage type
Test suite — All facade and MCP tests now use canonical types instead of StubResponse/StubMessage/FakeResponse/FakeMessage; tests mock ModelClient instead of CustomRouter
model_facade_factory — Now creates a ModelClient first, then injects it into ModelFacade

🗑️ Removed

_try_extract_base64() and direct image parsing from ModelFacade (moved to client layer in PR1)
_get_litellm_deployment() from ModelFacade (moved to create_model_client())
collect_non_none_optional_fields() from parsing.py (replaced by TransportKwargs)
Three separate usage tracking methods (_track_token_usage_from_completion, _track_token_usage_from_embedding, _track_token_usage_from_image_diffusion) replaced by unified _track_usage()
StubResponse/FakeResponse usage in tests (replaced by canonical types)
Several removed tests for internal normalization logic that is now handled by the client layer

🔍 Attention Areas

⚠️ Reviewers: Please pay special attention to the following:

facade.py — Core refactor: constructor signature change (client replaces secret_resolver + internal router), all methods now use canonical types
errors.py — New _raise_from_provider_error() mapping function and ProviderError → Exception refactor
types.py — TransportKwargs design: flattening extra_body vs keeping extra_headers separate
mcp/facade.py — Significant simplification from canonical types; verify the _convert_canonical_tool_calls_to_dicts bridge is correct

🤖 Generated with Claude Code

…acade-guts-pr1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…provements - Wrap all LiteLLM router calls in try/except to normalize raw exceptions into canonical ProviderError at the bridge boundary (blocking review item) - Extract reusable response-parsing helpers into clients/parsing.py for shared use across future native adapters - Add async image parsing path using httpx.AsyncClient to avoid blocking the event loop in agenerate_image - Add retry_after field to ProviderError for future retry engine support - Fix _to_int_or_none to parse numeric strings from providers - Create test conftest.py with shared mock_router/bridge_client fixtures - Parametrize duplicate image generation and error mapping tests - Add tests for exception wrapping across all bridge methods

…larity - Parse RFC 7231 HTTP-date strings in Retry-After header (used by Azure and Anthropic during rate-limiting) in addition to numeric delay-seconds - Clarify collect_non_none_optional_fields docstring explaining why f.default is None is the correct check for optional field forwarding - Add tests for HTTP-date and garbage Retry-After values

- Fix misleading comment about prompt field defaults in _IMAGE_EXCLUDE - Handle list-format detail arrays in _extract_structured_message for FastAPI/Pydantic validation errors - Document scope boundary for vision content in collect_raw_image_candidates

…el-facade-guts-pr2

- Replace @DataClass + __post_init__ with explicit __init__ that calls super().__init__ properly, avoiding brittle field-ordering dependency - Store cause via __cause__ only, removing the redundant .cause attr - Update match pattern in handle_llm_exceptions for non-dataclass type - Rename shadowed local `fields` to `optional_fields` in TransportKwargs

greptile-apps · 2026-03-06T00:12:25Z

Greptile Summary

This PR cleanly decouples ModelFacade from LiteLLM by introducing a ModelClient Protocol and a LiteLLMBridgeClient adapter. The facade now works entirely with canonical request/response types (ChatCompletionRequest, EmbeddingRequest, ImageGenerationRequest and their response counterparts), making it independently testable. The refactoring is well-structured and addresses all issues raised in the previous review round.

Key changes:

ModelFacade: Constructor now accepts a ModelClient via dependency injection; all three operation families (completion, embeddings, image_generation) use canonical types; _track_usage() consolidates three separate tracking methods; close()/aclose() lifecycle hooks added.
TransportKwargs: New dataclass in clients/types.py that centralises extra_body flattening and extra_headers separation, used uniformly across all six operation methods in LiteLLMBridgeClient.
MCPFacade: Significantly simplified by removing _extract_tool_calls/_normalize_tool_call; the _execute_tool_calls_from_canonical replacement correctly guards against non-dict json.loads output.
_raise_from_provider_error: New function typed -> NoReturn maps canonical ProviderError kinds to DataDesignerError subclasses; ProviderError refactored from @dataclass to a proper Exception subclass.
Factory: create_model_client() extracts the deployment-construction logic from ModelFacade, keeping the facade provider-agnostic.
Tests: All facade and MCP tests now mock ModelClient instead of CustomRouter and use canonical types; a new test_parsing.py provides comprehensive TransportKwargs and extract_tool_calls coverage.

One area to watch: ModelRegistry.close()/aclose() iterates facades without exception isolation — if one facade's close() raises, subsequent facades will not be closed (see inline comment). This is low risk today since LiteLLMBridgeClient.close() is a no-op, but should be hardened before any future client implementation holds real resources.

Confidence Score: 4/5

This PR is safe to merge — it is a well-executed refactor with no new runtime risks introduced.
The canonical type layer is coherent, the TransportKwargs abstraction is sound, error mapping is complete and correctly typed (-> NoReturn), and all previously raised review issues have been addressed or deferred to PR3 with acknowledgment. The one mild concern — ModelRegistry.close() not isolating per-facade exceptions — is low risk today since LiteLLMBridgeClient.close() is a no-op, and the test suite has been updated comprehensively throughout.
Pay close attention to registry.py when adding any future ModelClient implementation that holds real resources (connections, thread pools), as the close()/aclose() loop does not isolate per-facade exceptions.

Important Files Changed

Filename	Overview
packages/data-designer-engine/src/data_designer/engine/models/facade.py	Core refactor: replaces direct router usage with ModelClient injection; introduces `_build_chat_completion_request`, `_build_embedding_request`, `_build_image_generation_request`, unified `_track_usage`, and new `close`/`aclose` lifecycle methods. All canonical type usages are correct. Unknown kwargs to completion are logged at debug level and routed to `metadata`; embedding and image kwargs are covered by explicitly extracted fields or `extra_body`.
packages/data-designer-engine/src/data_designer/engine/mcp/facade.py	Simplified significantly by adopting canonical types; removes `_extract_tool_calls`/`_normalize_tool_call`, adds `_execute_tool_calls_from_canonical` with correct non-dict guard for `json.loads` output, and `_convert_canonical_tool_calls_to_dicts` bridge. The dict contract between this helper and `_build_assistant_tool_message` is consistent.
packages/data-designer-engine/src/data_designer/engine/models/clients/types.py	New `TransportKwargs` dataclass cleanly centralises `extra_body` flattening and `extra_headers` separation. `_collect_optional_fields` correctly targets `f.default is None` fields; all request types exclusively use `= None` defaults for optional fields, so the pattern is sound. Six new fields (`stop`, `seed`, `response_format`, etc.) added to `ChatCompletionRequest`.
packages/data-designer-engine/src/data_designer/engine/models/clients/factory.py	New factory file correctly extracts `_get_litellm_deployment` + router construction logic from `ModelFacade`, preserving the `"not-used-but-required"` API key fallback. Straightforward and well-typed.
packages/data-designer-engine/src/data_designer/engine/models/errors.py	Adds `ProviderError` case to `handle_llm_exceptions` and new `_raise_from_provider_error` function typed `-> NoReturn`. `parse_api_error` type annotation corrected from `InternalServerError` to `APIError`. Clean and complete error mapping.
packages/data-designer-engine/src/data_designer/engine/models/clients/errors.py	`ProviderError` refactored from `@dataclass` to a proper `Exception` subclass — correct semantics. The `cause` field is now `self.__cause__` (Python exception chaining). New `extract_message_from_exception_string` helper gracefully parses human-readable messages from LiteLLM exception strings.
packages/data-designer-engine/src/data_designer/engine/models/registry.py	Adds `close()`/`aclose()` lifecycle methods that iterate all managed facades. Exception safety concern: if one facade's close raises, subsequent facades are not closed. `LiteLLMBridgeClient.close()` is currently a no-op so this is low-risk today, but warrants attention for future client implementations.
packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/litellm_bridge.py	Switches from `collect_non_none_optional_fields` to `TransportKwargs.from_request` across all six operation methods; correctly uses `transport.headers or None` to pass `None` instead of `{}` to LiteLLM. Exception message parsing improved via `extract_message_from_exception_string`.
packages/data-designer-engine/tests/engine/models/clients/test_parsing.py	New test file comprehensively covers `TransportKwargs.from_request` and `extract_tool_calls`. Parametrized edge cases (None/empty `extra_body`, missing tool IDs, `None` arguments) are all well-handled.
packages/data-designer-engine/tests/engine/models/test_facade.py	Tests updated to mock `ModelClient` instead of `CustomRouter`, use canonical types throughout, and remove LiteLLM-specific response fixtures. Coverage is maintained; signal-to-noise ratio improved.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant ModelFacade
    participant ModelClient
    participant TransportKwargs
    participant LiteLLMBridgeClient
    participant CustomRouter

    Caller->>ModelFacade: generate(prompt, **kwargs)
    ModelFacade->>ModelFacade: consolidate_kwargs(**kwargs)
    ModelFacade->>ModelFacade: _build_chat_completion_request(messages, kwargs)
    note right of ModelFacade: known fields → ChatCompletionRequest<br/>unknown fields → metadata (debug log)
    ModelFacade->>ModelClient: completion(ChatCompletionRequest)
    ModelClient->>TransportKwargs: from_request(request)
    note right of TransportKwargs: extra_body flattened into body<br/>extra_headers separated
    ModelClient->>LiteLLMBridgeClient: router.completion(model, messages, extra_headers, **body)
    LiteLLMBridgeClient->>CustomRouter: completion(...)
    CustomRouter-->>LiteLLMBridgeClient: litellm.ModelResponse
    LiteLLMBridgeClient->>LiteLLMBridgeClient: parse_chat_completion_response(response)
    LiteLLMBridgeClient-->>ModelClient: ChatCompletionResponse
    ModelClient-->>ModelFacade: ChatCompletionResponse
    ModelFacade->>ModelFacade: _track_usage(response.usage)
    ModelFacade-->>Caller: (parsed_output, messages)

_{Last reviewed commit: a084038}

packages/data-designer-engine/src/data_designer/engine/mcp/facade.py

packages/data-designer-engine/src/data_designer/engine/models/facade.py

packages/data-designer-engine/src/data_designer/engine/models/errors.py

packages/data-designer-engine/src/data_designer/engine/models/facade.py

packages/data-designer-engine/src/data_designer/engine/models/errors.py

packages/data-designer-engine/src/data_designer/engine/mcp/facade.py

packages/data-designer-engine/src/data_designer/engine/models/facade.py

greptile-apps · 2026-03-09T21:52:17Z

packages/data-designer-engine/src/data_designer/engine/models/registry.py

+    def _set_model_configs(self, model_configs: list[ModelConfig] | None) -> None:
+        self._model_configs = {mc.alias: mc for mc in (model_configs or [])}
+
+    def close(self) -> None:
+        """Release resources held by all model facades."""
+        for facade in self._models.values():
+            facade.close()
+
+    async def aclose(self) -> None:


close()/aclose() silently skip remaining facades on error

If any facade's close() (or aclose()) raises an exception, the loop terminates early and all subsequent facades are left unclosed — a resource leak at cleanup time. While LiteLLMBridgeClient.close() is currently a no-op, future client implementations may hold real connections (HTTP pools, sockets, etc.).

Consider using a collect-and-reraise pattern so every facade is given a chance to close:

def close(self) -> None: """Release resources held by all model facades.""" errors = [] for facade in self._models.values(): try: facade.close() except Exception as e: errors.append(e) if errors: raise ExceptionGroup("Errors during ModelRegistry.close()", errors)

The same pattern should be applied to aclose().

Prompt To Fix With AI

This is a comment left during a code review. Path: packages/data-designer-engine/src/data_designer/engine/models/registry.py Line: 203-211 Comment: **`close()`/`aclose()` silently skip remaining facades on error** If any facade's `close()` (or `aclose()`) raises an exception, the loop terminates early and all subsequent facades are left unclosed — a resource leak at cleanup time. While `LiteLLMBridgeClient.close()` is currently a no-op, future client implementations may hold real connections (HTTP pools, sockets, etc.). Consider using a collect-and-reraise pattern so every facade is given a chance to close: ```python def close(self) -> None: """Release resources held by all model facades.""" errors = [] for facade in self._models.values(): try: facade.close() except Exception as e: errors.append(e) if errors: raise ExceptionGroup("Errors during ModelRegistry.close()", errors) ``` The same pattern should be applied to `aclose()`. How can I resolve this? If you propose a fix, please make it concise.

nabinchha and others added 30 commits February 19, 2026 15:50

plans for model facade overhaul

ab30a2d

update plan

43824ea

add review

2a5f1e4

address feedback + add more details after several self reviews

f945d5b

update plan doc

dfa3817

Merge branch 'main' into nm/overhaul-model-facade-guts

5b18f74

address nits

0f449a7

Merge branch 'nm/overhaul-model-facade-guts' into nm/overhaul-model-f…

37f092a

…acade-guts-pr1

Add cannonical objects

08e57f8

Merge branch 'main' into nm/overhaul-model-facade-guts-pr1

3ab18ee

self-review feedback + address

34349c7

add LiteLLMRouter protocol to strongly type bridge router param

6aae4b6

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

simplify some things

2a53d37

add a protol for http response like object

4e2f3af

move HttpResponse

b1c85f2

update PR-1 architecture notes for lifecycle and router protocol

f6dc769

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge branch 'main' into nm/overhaul-model-facade-guts-pr1

b6b4028

Use contextlib to dry out some code

ba22397

Merge branch 'main' into nm/overhaul-model-facade-guts-pr1

c390912

add PR-2 architecture notes for model facade overhaul

828cc49

save progress on pr2

89a6d4e

Merge branch 'main' into nm/overhaul-model-facade-guts-pr1

e527503

Merge branch 'nm/overhaul-model-facade-guts-pr1' into nm/overhaul-mod…

f6fa447

…el-facade-guts-pr2

small refactor

b8579c2

address feedback

61024c0

Merge branch 'nm/overhaul-model-facade-guts-pr1' into nm/overhaul-mod…

d47d508

…el-facade-guts-pr2

Address greptile comment in pr1

49a45ba

nabinchha requested a review from a team as a code owner March 6, 2026 00:01

nabinchha changed the base branch from nm/overhaul-model-facade-guts-pr1 to main March 6, 2026 00:02

Merge branch 'main' into nm/overhaul-model-facade-guts-pr2

8a385ff

greptile-apps bot reviewed Mar 6, 2026

View reviewed changes

nabinchha added 2 commits March 6, 2026 09:39

Address greptile feedback

521c1e4

Merge branch 'main' into nm/overhaul-model-facade-guts-pr2

a831d24

greptile-apps bot reviewed Mar 6, 2026

View reviewed changes

packages/data-designer-engine/src/data_designer/engine/models/facade.py Show resolved Hide resolved

packages/data-designer-engine/src/data_designer/engine/models/errors.py Outdated Show resolved Hide resolved

PR feedback

4836e03

greptile-apps bot reviewed Mar 6, 2026

View reviewed changes

packages/data-designer-engine/src/data_designer/engine/models/facade.py Outdated Show resolved Hide resolved

track usage tracking in finally block for images

ae1bf98

greptile-apps bot reviewed Mar 6, 2026

View reviewed changes

packages/data-designer-engine/src/data_designer/engine/models/errors.py Outdated Show resolved Hide resolved

packages/data-designer-engine/src/data_designer/engine/models/errors.py Show resolved Hide resolved

pr feedback

18b9966

greptile-apps bot reviewed Mar 6, 2026

View reviewed changes

packages/data-designer-engine/src/data_designer/engine/mcp/facade.py Show resolved Hide resolved

nabinchha added the refactor label Mar 6, 2026

nabinchha self-assigned this Mar 6, 2026

nabinchha added 3 commits March 6, 2026 13:16

Merge branch 'main' into nm/overhaul-model-facade-guts-pr2

25650b0

Merge branch 'main' into nm/overhaul-model-facade-guts-pr2

651813b

Merge branch 'main' into nm/overhaul-model-facade-guts-pr2

bfed5af

greptile-apps bot reviewed Mar 6, 2026

View reviewed changes

packages/data-designer-engine/src/data_designer/engine/models/facade.py Show resolved Hide resolved

Merge branch 'main' into nm/overhaul-model-facade-guts-pr2

c9d6f4c

greptile-apps bot reviewed Mar 9, 2026

View reviewed changes

packages/data-designer-engine/src/data_designer/engine/models/facade.py Show resolved Hide resolved

Merge branch 'main' into nm/overhaul-model-facade-guts-pr2

a084038

greptile-apps bot reviewed Mar 9, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: Decouple ModelFacade from LiteLLM via ModelClient adapter#373

refactor: Decouple ModelFacade from LiteLLM via ModelClient adapter#373
nabinchha wants to merge 42 commits intomainfrom
nm/overhaul-model-facade-guts-pr2

nabinchha commented Mar 6, 2026

Uh oh!

greptile-apps bot commented Mar 6, 2026 •

edited

Loading

Confidence Score: 4/5

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

greptile-apps bot Mar 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

nabinchha commented Mar 6, 2026

📋 Summary

🔄 Changes

✨ Added

🔧 Changed

🗑️ Removed

🔍 Attention Areas

Uh oh!

greptile-apps bot commented Mar 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

greptile-apps bot Mar 9, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

greptile-apps bot commented Mar 6, 2026 •

edited

Loading