feat: remove litellm dependency and bridge path by nabinchha · Pull Request #455 · NVIDIA-NeMo/DataDesigner

nabinchha · 2026-03-24T16:02:15Z

📋 Summary

Seventh PR in the model facade overhaul series (plan, architecture notes). Removes the litellm runtime dependency, bridge adapter, and all associated dead code paths. With PR-6 merged, all providers route to native HTTP adapters by default — the LiteLLM bridge was a migration safety net that is no longer needed.

Previous PRs:

PR-1: feat: canonical model client types, protocols, and LiteLLM bridge adapter #359
PR-2: refactor: Decouple ModelFacade from LiteLLM via ModelClient adapter #373
PR-3: feat: Native OpenAI adapter with retry and AIMD throttle infrastructure #402
PR-4: feat: Native Anthropic adapter with shared HTTP client infrastructure #426
PR-5: feat: Constrain HttpModelClient to single concurrency mode #439
PR-6: feat: wire ThrottledModelClient and dual-semaphore scheduler #449

🔄 Changes

🗑️ Removed

litellm_bridge.py — bridge adapter wrapping LiteLLM's router behind the ModelClient protocol
litellm_overrides.py — ThreadSafeCache, CustomRouter, apply_litellm_patches, and image URL schema patch
litellm runtime dependency from pyproject.toml and lazy_heavy_imports.py (net ~1,470 lines removed)
DATA_DESIGNER_MODEL_BACKEND env-var escape hatch and _create_bridge_client helper from clients/factory.py
LiteLLM exception match arms and DownstreamLLMExceptionMessageParser from models/errors.py
apply_litellm_patches() call from models/factory.py
flatten_extra_body parameter from TransportKwargs.from_request (only needed by bridge)
Bridge test files: test_litellm_bridge.py, test_litellm_overrides.py
Bridge fixtures: mock_router, bridge_client from clients/conftest.py

🔧 Changed

clients/factory.py — unknown provider_type now raises ValueError (previously fell back to LiteLLM bridge)
models/errors.py — ported context window detail extraction from DownstreamLLMExceptionMessageParser to new _extract_context_window_detail helper in the native ProviderError path; 403 now correctly maps to ModelPermissionDeniedError (was ModelAuthenticationError in LiteLLM era)
benchmark_engine_v2.py — patches OpenAICompatibleClient.completion/acompletion instead of CustomRouter
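
The 403 remapping noted above can be illustrated with a minimal sketch. The class names `ModelAuthenticationError` and `ModelPermissionDeniedError` come from this PR; the local stand-in definitions, the `ModelAPIError` fallback, the `classify_http_status` helper, and the 401 mapping are assumptions made for illustration and are not the code in models/errors.py.

```python
from __future__ import annotations


class ModelAuthenticationError(Exception):
    """Stand-in: credentials are missing or invalid (assumed to be HTTP 401)."""


class ModelPermissionDeniedError(Exception):
    """Stand-in: credentials are valid but lack access (HTTP 403)."""


class ModelAPIError(Exception):
    """Hypothetical fallback for statuses not mapped below."""


# Hypothetical lookup table; the real code dispatches on ProviderErrorKind.
_STATUS_TO_ERROR: dict[int, type[Exception]] = {
    401: ModelAuthenticationError,
    403: ModelPermissionDeniedError,
}


def classify_http_status(status_code: int) -> type[Exception]:
    """Return the exception class a given HTTP status would map to."""
    return _STATUS_TO_ERROR.get(status_code, ModelAPIError)
```

Keeping 401 and 403 on separate classes lets callers distinguish "fix your API key" from "your key lacks access to this model", which is the distinction the description calls out.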

✨ Added

Full ProviderErrorKind test coverage in test_model_errors.py — parametrized cases for AUTHENTICATION, API_CONNECTION, TIMEOUT, NOT_FOUND, INTERNAL_SERVER, UNPROCESSABLE_ENTITY, API_ERROR, and multimodal BAD_REQUEST
Context window detail tests — with and without OpenAI-style token-count detail
Unknown provider ValueError test in test_factory.py
PR-7 architecture notes

📚 Docs

Cleaned all stale "litellm" / "via LiteLLM" references from docstrings, comments, AGENTS.md, and README.md

🔍 Attention Areas

⚠️ Reviewers: Please pay special attention to the following:

models/errors.py — _extract_context_window_detail ports parsing logic from the removed DownstreamLLMExceptionMessageParser; verify the native ProviderError.message carries the same detail the old LiteLLM exception did
clients/factory.py — the only behavioral change: unknown provider_type raises ValueError instead of silently falling back to LiteLLM bridge
test_model_errors.py — new parametrized test cases ensure every ProviderErrorKind is covered after the LiteLLM match arms were removed

Test plan

uv run ruff check on all changed source files
uv run pytest tests/engine/models/ — all model error, factory, and parsing tests pass
Verify litellm no longer appears in uv.lock transitive deps
Verify import litellm fails in the venv (not installed)
Smoke test: make test-run-all-examples passes without LiteLLM
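
The "import litellm fails" check in the test plan could be scripted roughly as follows. This is a sketch using only the standard library; the `is_installed` helper name is an assumption and is not part of the repository.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located in this environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # After this PR, the expectation is that this reports False in the project venv.
    print("litellm installed:", is_installed("litellm"))
```

Unlike a bare `import litellm`, `find_spec` does not execute the package, so the check stays cheap even if the dependency is unexpectedly present.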

🤖 Generated with Cursor

Made with Cursor

- Delete litellm_bridge.py adapter, litellm_overrides.py, and their tests
- Remove LiteLLM fallback branch and DATA_DESIGNER_MODEL_BACKEND env var from clients/factory.py; unknown provider_type now raises ValueError
- Remove apply_litellm_patches() call from models/factory.py
- Remove LiteLLM exception match arms and DownstreamLLMExceptionMessageParser from models/errors.py; port context window detail extraction to _extract_context_window_detail for native ProviderError path
- Remove litellm from lazy_heavy_imports.py and pyproject.toml runtime deps
- Remove flatten_extra_body parameter from TransportKwargs.from_request
- Clean up LiteLLM references in docstrings, comments, and AGENTS.md
- Add full ProviderErrorKind test coverage to test_model_errors.py
- Update benchmark script to patch OpenAICompatibleClient instead of CustomRouter

Made-with: Cursor

greptile-apps · 2026-03-24T16:07:18Z

Greptile Summary

This PR is the seventh in the model-facade overhaul series and completes the LiteLLM removal: it deletes the bridge adapter (litellm_bridge.py), the overrides module (litellm_overrides.py), all LiteLLM exception match arms, the DATA_DESIGNER_MODEL_BACKEND env-var escape hatch, and the litellm runtime dependency from pyproject.toml and uv.lock. Unknown provider_type values now raise DataDesignerError instead of silently falling back to the bridge.

Key changes:

clients/factory.py: routing simplified to "openai" → OpenAICompatibleClient, "anthropic" → AnthropicClient, anything else → DataDesignerError with a clear fix hint.
models/errors.py: LiteLLM match arms removed; context-window token detail extraction ported to _extract_context_window_detail helper; 403 correctly maps to ModelPermissionDeniedError (was ModelAuthenticationError in the LiteLLM era — the new behavior is more semantically correct).
benchmark_engine_v2.py: patches OpenAICompatibleClient.completion/acompletion instead of CustomRouter; the previously flagged tools forwarding regression (PR comment thread) is fixed in this version.
Test coverage: full ProviderErrorKind parametrized coverage added to test_model_errors.py; bridge/env-var tests replaced with test_unknown_provider_type_raises_data_designer_error.
Net effect: ~1,470 lines removed, litellm (and transitive deps distro, fastuuid, etc.) dropped from the lock file, startup cost reduced.
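
The simplified routing described in the key changes can be sketched as below. The names `OpenAICompatibleClient`, `AnthropicClient`, `DataDesignerError`, and `create_model_client` appear in this PR, but the classes here are empty stand-ins, and the single-argument signature and the error message wording are assumptions; the real factory takes more configuration and may wrap the result in `ThrottledModelClient`.

```python
class DataDesignerError(Exception):
    """Stand-in for the project's base error type."""


class OpenAICompatibleClient:
    """Stand-in for the native OpenAI-compatible adapter."""


class AnthropicClient:
    """Stand-in for the native Anthropic adapter."""


# Hypothetical registry; the real factory may use explicit branches instead.
_CLIENTS = {"openai": OpenAICompatibleClient, "anthropic": AnthropicClient}


def create_model_client(provider_type: str):
    """Build a native client, or fail loudly for an unknown provider_type."""
    try:
        client_cls = _CLIENTS[provider_type]
    except KeyError:
        supported = ", ".join(sorted(_CLIENTS))
        raise DataDesignerError(
            f"Unsupported provider_type {provider_type!r}; supported types: {supported}"
        ) from None
    return client_cls()
```

Raising immediately, rather than falling back to a bridge, means a typo in `provider_type` surfaces at client construction instead of producing a differently behaving client.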

Confidence Score: 5/5

Safe to merge — this PR removes exclusively dead code paths that were already bypassed in production.
All native adapters have been the default since PR-3/PR-4 and are exercised in production. The only behavioral delta (unknown provider_type now raises DataDesignerError instead of silently bridging) is intentional, documented, and covered by a new test. The prior review concern (tools forwarding regression in the benchmark) was fixed in this version. The _extract_context_window_detail port is correct and verified by dedicated tests. No production-impacting issues found.
The only file with a note is plans/343/model-facade-overhaul-pr-7-architecture-notes.md, which has a minor ValueError vs DataDesignerError wording inconsistency — not a runtime concern.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/data-designer-engine/src/data_designer/engine/models/clients/factory.py | Removes `_create_bridge_client`, `DATA_DESIGNER_MODEL_BACKEND` env-var, and LiteLLM bridge fallback; unknown `provider_type` now raises `DataDesignerError` with a clear message. Docstring and `Raises` section are consistent with the code. |
| packages/data-designer-engine/src/data_designer/engine/models/errors.py | Removes all LiteLLM exception match arms and `DownstreamLLMExceptionMessageParser`; ports context-window detail extraction to new private `_extract_context_window_detail` helper. Logic is correct: uses case-insensitive search, extracts on original-case text, and handles the " Please reduce " split boundary. |
| packages/data-designer-engine/tests/engine/models/test_model_errors.py | Comprehensive replacement of LiteLLM exception test cases with `ProviderError`-based parametrized cases covering all `ProviderErrorKind` values; adds dedicated context-window detail tests (with and without OpenAI-style token-count text). Test IDs are descriptive and coverage is complete. |
| scripts/benchmarks/benchmark_engine_v2.py | Patches `OpenAICompatibleClient.completion/acompletion` instead of `CustomRouter`; introduces `FakeCompletionResponse` wrapping the canonical `ChatCompletionResponse` type; the previously flagged `tools` forwarding regression is now fixed. Class-level patching is correctly restored in the `finally` block. |
| packages/data-designer-engine/src/data_designer/engine/models/clients/types.py | Removes `flatten_extra_body` parameter and its `False` branch from `TransportKwargs.from_request`; body construction now always merges `extra_body` keys into the top level. Clean simplification with corresponding test removal. |
| packages/data-designer-engine/tests/engine/models/clients/test_factory.py | Removes all bridge-related tests and env-var override tests; adds `test_unknown_provider_type_raises_data_designer_error` which correctly validates `DataDesignerError` with a regex match. |
| packages/data-designer-engine/pyproject.toml | Removes `litellm>=1.77.0,<1.80.12` from runtime dependencies; `uv.lock` is updated to drop `litellm` and its transitive deps (`distro`, `fastuuid`, etc.). |
| plans/343/model-facade-overhaul-pr-7-architecture-notes.md | New architecture notes document for PR-7. Contains a minor discrepancy: multiple places state that unknown `provider_type` raises a `ValueError`, but the actual code (and its docstring and test) raise `DataDesignerError`. Not a runtime issue, but the notes are slightly stale on this point. |
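
The `extra_body` behavior noted for clients/types.py ("always merges `extra_body` keys into the top level") might look roughly like this. The `build_request_body` function is a hypothetical stand-in for `TransportKwargs.from_request`, and letting `extra_body` keys override same-named request fields is an assumption, not something the PR states.

```python
from __future__ import annotations

from typing import Any


def build_request_body(
    fields: dict[str, Any], extra_body: dict[str, Any] | None = None
) -> dict[str, Any]:
    """Merge extra_body keys into the top level of the request body."""
    body = dict(fields)  # copy so the caller's dict is not mutated
    body.update(extra_body or {})  # assumed precedence: extra_body wins on conflict
    return body
```

For example, `build_request_body({"model": "m"}, {"top_k": 5})` yields a flat body with both `model` and `top_k` and no nested `extra_body` key.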

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[create_model_client called] --> B{provider_type?}
    B -->|"openai"| C[OpenAICompatibleClient]
    B -->|"anthropic"| D[AnthropicClient]
    B -->|"other / unknown"| E[DataDesignerError raised ✦ NEW]

    C --> F{throttle_manager?}
    D --> F
    F -->|yes| G[ThrottledModelClient wrapper]
    F -->|no| H[Return raw client]

    subgraph REMOVED ["🗑️ Removed paths"]
        R1["DATA_DESIGNER_MODEL_BACKEND=litellm_bridge → LiteLLMBridgeClient"]
        R2["Unknown provider_type → _create_bridge_client fallback"]
    end

    style REMOVED fill:#ffeeee,stroke:#ff6666
    style E fill:#fff3cd,stroke:#ffc107

Prompt To Fix All With AI

This is a comment left during a code review.
Path: plans/343/model-facade-overhaul-pr-7-architecture-notes.md
Line: 2382-2384

Comment:
**Architecture notes say `ValueError`, code raises `DataDesignerError`**

The notes state "unknown `provider_type` values raise a `ValueError`" in multiple places, but the actual implementation in `clients/factory.py` raises `DataDesignerError`, and the test (`test_unknown_provider_type_raises_data_designer_error`) correctly asserts `DataDesignerError`. The function's own `Raises:` docstring also documents `DataDesignerError`. The architecture notes are slightly stale on this point — worth a one-line fix so the planning docs stay accurate for future readers.

```suggestion
After this PR, unknown `provider_type` values raise a
`DataDesignerError` with a clear message listing supported types.
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (4): Last reviewed commit: "fix: address PR-7 review feedback"

scripts/benchmarks/benchmark_engine_v2.py

The old CustomRouter patch forwarded **kwargs (including tools) to _fake_response, but the new OpenAICompatibleClient patch only passed model and messages — silently disabling tool-call simulation in benchmark scenarios that exercise allow_tools. Made-with: Cursor
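
The regression and its fix can be sketched with a stand-in class. Only the name `OpenAICompatibleClient` and the idea of forwarding `**kwargs` (including `tools`) come from this PR; the method signature, `fake_response`, and `patched_completion` are hypothetical and do not reproduce the benchmark script.

```python
from contextlib import contextmanager


class OpenAICompatibleClient:
    """Stand-in for the real client; only the patched method is modeled."""

    def completion(self, model, messages, **kwargs):
        raise RuntimeError("would perform a network call")


def fake_response(model, messages, **kwargs):
    # Report whether tools reached the fake, which is what the fix restores.
    return {"model": model, "used_tools": bool(kwargs.get("tools"))}


@contextmanager
def patched_completion():
    """Swap in a fake at class level and always restore the original."""
    original = OpenAICompatibleClient.completion

    def _fake(self, model, messages, **kwargs):
        # Forward **kwargs so tool-call simulation still sees `tools`.
        return fake_response(model, messages, **kwargs)

    OpenAICompatibleClient.completion = _fake
    try:
        yield
    finally:
        OpenAICompatibleClient.completion = original
```

Passing only `model` and `messages` to the fake would still run, which is why the dropped `tools` argument was easy to miss.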


andreatgretel · 2026-03-24T16:46:57Z

packages/data-designer-engine/src/data_designer/engine/models/errors.py:410-416

def _extract_context_window_detail(error_text: str) -> str | None:
    """Extract the specific token-count detail from an OpenAI-style context window error."""
    try:
        marker = "This model's maximum context length is "
        if marker in error_text:
            detail = error_text.split(marker, 1)[1].split("\n")[0].split(" Please reduce ")[0]
            return f"{marker}{detail}"
    except Exception:
        pass
    return None

suggestion: the `except Exception: pass` is a faithful port of the old parser, but you could drop the `try`/`except` entirely since the `if marker in error_text` guard already ensures the split will work. just slightly cleaner

nabinchha · 2026-03-24T17:01:18Z

Resolved in 1f0a36d — removed the try/except since the if marker in guard is sufficient, and also made the marker match case-insensitive.
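
A sketch of what the revised helper might look like after this change, based only on the description in this thread and the bot summary (case-insensitive search, extraction on the original-case text, split at " Please reduce "). It is a reconstruction under those assumptions, not the code from commit 1f0a36d; the use of `re` and the public function name are illustrative choices.

```python
from __future__ import annotations

import re

MARKER = "This model's maximum context length is "
_MARKER_RE = re.compile(re.escape(MARKER), re.IGNORECASE)


def extract_context_window_detail(error_text: str) -> str | None:
    """Extract the token-count detail from an OpenAI-style context window error."""
    match = _MARKER_RE.search(error_text)  # case-insensitive marker lookup
    if match is None:
        return None
    # Slice the original text so the detail keeps its original casing.
    tail = error_text[match.end():]
    detail = tail.split("\n")[0].split(" Please reduce ")[0]
    return f"{MARKER}{detail}"
```

With no `try`/`except`, the early return on a missing marker is the only guard, which matches the reviewer's point that the split cannot fail once the marker is known to be present.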

johnnygreco

Review comment on factory.py — see inline.

packages/data-designer-engine/src/data_designer/engine/models/clients/factory.py

andreatgretel

This looks great, approving!

One small follow-up I’d still like to clean up: provider_type is still free text in config and the CLI, so a typo or an old custom value can get saved and only fail later when preview/generation tries to build the client. Not a blocker here, but it’d be nice to validate that earlier in a follow-up PR!

- Return ChatCompletionResponse from benchmark fakes instead of FakeResponse to match the native client contract (facade expects .message, not .choices[0].message)
- Add ids= to parametrize block in test_model_errors.py for readability
- Remove unnecessary try/except from _extract_context_window_detail; the `if marker in` guard is sufficient
- Make context window marker match case-insensitive
- Replace stale httpx.AsyncClient callout in async_concurrency.py docstring with generic "async-stateful resources"

Made-with: Cursor

nabinchha · 2026-03-24T17:23:04Z

> This looks great, approving!
>
> One small follow-up I’d still like to clean up: provider_type is still free text in config and the CLI, so a typo or an old custom value can get saved and only fail later when preview/generation tries to build the client. Not a blocker here, but it’d be nice to validate that earlier in a follow-up PR!

Let's explore this in PR-8.

andreatgretel

🚀

johnnygreco

thanks for jumping on this @nabinchha!!!

krrish-berri-2 · 2026-03-24T19:59:18Z

@nabinchha curious - what was litellm missing to solve your problem well?

nabinchha · 2026-03-25T14:48:02Z

> @nabinchha curious - what was litellm missing to solve your problem well?

Hey @krrish-berri-2 — thanks for reaching out, and for building LiteLLM. It was really helpful early in the project. This move started several weeks ago as we worked through what the project needed as it matured:

Concurrency control. We needed adaptive per-provider throttling, which was simpler for us to build directly than to layer onto LiteLLM's router.
Scope mismatch. We only use a slice of what LiteLLM provides, so we were carrying abstraction and maintenance cost for functionality we don't need.
Upgrade friction. The fast release cadence and narrow version pinning meant we were frequently chasing compatibility. We also spent real time debugging regressions, including some memory-related issues during upgrades.
Import cost. LiteLLM added noticeable startup overhead.

For us, the tradeoff shifted once we needed tight control over transport lifecycle and throttling at the adapter level.

krrish-berri-2 · 2026-03-25T18:01:02Z

Hey @nabinchha, thank you for the feedback.

What were the version pinning and memory issues you saw?

> Import cost. LiteLLM added noticeable startup overhead.

Do you still see this? I thought we'd made several improvements on this in the past few months.

nabinchha added 2 commits March 24, 2026 09:10

adjust plan

fb7f2fd

nabinchha requested a review from a team as a code owner March 24, 2026 16:02

greptile-apps bot reviewed Mar 24, 2026

scripts/benchmarks/benchmark_engine_v2.py (outdated, resolved)

nabinchha commented Mar 24, 2026

...es/data-designer-engine/src/data_designer/engine/dataset_builders/utils/async_concurrency.py (resolved)

andreatgretel reviewed Mar 24, 2026

scripts/benchmarks/benchmark_engine_v2.py (outdated, resolved)

andreatgretel reviewed Mar 24, 2026

packages/data-designer-engine/tests/engine/models/test_model_errors.py (resolved)

andreatgretel reviewed Mar 24, 2026

...es/data-designer-engine/src/data_designer/engine/dataset_builders/utils/async_concurrency.py (outdated, resolved)

nabinchha requested a review from andreatgretel March 24, 2026 17:02

johnnygreco reviewed Mar 24, 2026

packages/data-designer-engine/src/data_designer/engine/models/clients/factory.py (resolved)

andreatgretel previously approved these changes Mar 24, 2026

nabinchha dismissed andreatgretel’s stale review via 2c82515 March 24, 2026 17:20

nabinchha force-pushed the nmulepati/overhaul-model-facade-guts-pr7 branch from 1f0a36d to 2c82515 March 24, 2026 17:20

nabinchha requested review from andreatgretel and johnnygreco March 24, 2026 17:22

andreatgretel approved these changes Mar 24, 2026

johnnygreco approved these changes Mar 24, 2026

nabinchha merged commit 1356408 into main Mar 24, 2026
47 checks passed

nabinchha deleted the nmulepati/overhaul-model-facade-guts-pr7 branch March 24, 2026 17:41
