docs: add native model client dev note by nabinchha · Pull Request #465 · NVIDIA-NeMo/DataDesigner

nabinchha · 2026-03-25T17:48:45Z

📋 Summary

Adds a new dev note covering the native model client layer and its adaptive throttling system (AIMD-based concurrency control).

🔄 Changes

✨ Added

New dev note: owning-the-model-stack.md — covers the native HTTP client architecture, AIMD adaptive throttling, ceiling stabilization, cascade dampening, two-level throttle keying, and the retry boundary design
Architecture diagrams in docs/devnotes/posts/assets/owning-the-model-stack/ (hero image, layer diagram, AIMD concurrency chart, throttle keying diagram, retry boundary diagram)
Author entry for nmulepati in .authors.yml
Nav entry in mkdocs.yml

🔍 Attention Areas

⚠️ Reviewers: Please pay special attention to the following:

docs/devnotes/posts/owning-the-model-stack.md — new long-form technical content; review for accuracy on AIMD behavior, retry boundary semantics, and configuration parameter descriptions

🤖 Generated with AI

Made with Cursor

greptile-apps · 2026-03-25T17:51:54Z

Greptile Summary

This PR adds a new dev note, owning-the-model-stack.md, documenting the native model client layer that replaced LiteLLM and the AIMD-based adaptive throttling system introduced in Data Designer v0.5.4. It also adds the author entry for nmulepati and the corresponding mkdocs.yml nav entry.

All technical claims were verified against the live source (ThrottleConfig in run_config.py and ThrottleManager in throttle_manager.py): parameter names, defaults (reduce_factor=0.75, additive_increase=1, success_window=25, cooldown_seconds=2.0, ceiling_overshoot=0.10), and all behavioral descriptions match the implementation.
The AIMD mechanics (additive increase, multiplicative decrease, ceiling stabilization, cascade dampening, two-level keying) are accurately described and consistent with the code.
The cascade dampening math ("collapsed from 20 to 4") is numerically correct: five successive reductions at reduce_factor=0.75, with each intermediate value floored to an integer, give 20 → 15 → 11 → 8 → 6 → 4.
The retry boundary section accurately reflects the asymmetry between async mode (full AIMD loop) and sync mode (transport-layer retries), with a clear note that the sync codepath is temporary.
Previously flagged review concerns (inconsistent model name prefix in log examples, duplicate closing phrase) were resolved in prior commits.
Binary image assets for all five diagrams are included.
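As a rough illustration of the AIMD behavior summarized above, the following sketch applies the quoted defaults (reduce_factor=0.75, additive_increase=1, success_window=25) to a single concurrency limit. The class `AimdLimit` and its method names are hypothetical and are not taken from `ThrottleManager`. It assumes flooring after each reduction, and it leaves out cooldown, ceiling stabilization, and cascade dampening, so it shows only the undampened collapse from 20 to 4.

```python
import math
from dataclasses import dataclass


@dataclass
class AimdLimit:
    """Hypothetical AIMD concurrency limit (not the project's ThrottleManager)."""

    limit: int
    max_limit: int
    reduce_factor: float = 0.75   # multiplicative decrease on a 429
    additive_increase: int = 1    # slots added after a full success window
    success_window: int = 25      # consecutive successes needed to grow
    successes: int = 0

    def on_success(self) -> None:
        # Additive increase: grow by a fixed step once the window fills.
        self.successes += 1
        if self.successes >= self.success_window:
            self.successes = 0
            self.limit = min(self.max_limit, self.limit + self.additive_increase)

    def on_rate_limited(self) -> None:
        # Multiplicative decrease: shrink and floor, never below one slot.
        self.successes = 0
        self.limit = max(1, math.floor(self.limit * self.reduce_factor))


limit = AimdLimit(limit=20, max_limit=20)
history = [limit.limit]
for _ in range(5):  # five rate-limit signals in a row, with no dampening
    limit.on_rate_limited()
    history.append(limit.limit)
print(history)  # [20, 15, 11, 8, 6, 4]
```

Recovery is much slower than collapse in this model: each additional slot costs a full window of 25 successes, which is why a burst of 429s is expensive if every one of them triggers a reduction.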

Confidence Score: 5/5

Documentation-only PR; no runtime code changes. Safe to merge.
All technical claims in the dev note are accurate — verified against the actual ThrottleConfig and ThrottleManager source. The prior review concerns have been resolved. The author entry, nav placement, and image assets are all correctly structured. No blocking issues remain.
No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/devnotes/posts/owning-the-model-stack.md | New long-form dev note covering the native model client architecture, AIMD throttling, ceiling stabilization, cascade dampening, two-level keying, and retry boundary. All technical claims verified against source (`ThrottleConfig`, `ThrottleManager`): parameter names, defaults, and behavioral descriptions are accurate. |
| docs/devnotes/.authors.yml | Adds `nmulepati` author entry consistent with the existing format in the file. |
| mkdocs.yml | Adds nav entry for the new dev note in the correct most-recent-first position, consistent with the existing ordering comment. |

Sequence Diagram

sequenceDiagram
    participant CG as ColumnGenerator
    participant MF as ModelFacade
    participant TMC as ThrottledModelClient
    participant TM as ThrottleManager
    participant HMC as HttpModelClient
    participant API as Provider API

    CG->>MF: generate(request)
    MF->>TMC: complete(ChatCompletionRequest)
    TMC->>TM: try_acquire(provider, model, domain)
    TM-->>TMC: slot acquired or wait_seconds
    TMC->>HMC: complete(request)
    HMC->>API: HTTP POST via RetryTransport

    alt 200 OK
        API-->>HMC: 200 OK
        HMC-->>TMC: ChatCompletionResponse
        TMC->>TM: release_success()
        TMC-->>MF: ChatCompletionResponse
        MF-->>CG: result
    else 502/503/504 transient error
        API-->>HMC: server error
        HMC->>API: retry with exponential backoff
        API-->>HMC: 200 OK
        HMC-->>TMC: ChatCompletionResponse
        TMC->>TM: release_success()
    else 429 rate limited
        API-->>HMC: 429
        HMC-->>TMC: ProviderError 429
        TMC->>TM: release_rate_limited(retry_after)
        TMC->>TMC: wait cooldown then re-acquire
    end
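The retry boundary in the diagram above can be sketched as follows: transient 502/503/504 responses are retried inside the transport with exponential backoff, while a 429 is not retried there and is instead raised to the caller so the throttle layer can react. This is a minimal sketch under stated assumptions. `send_with_retries`, `RateLimited`, and the `max_retries` and `base_delay` values are hypothetical names and numbers chosen for illustration, not the project's `RetryTransport` API.

```python
import time
from typing import Callable

TRANSIENT_STATUSES = {502, 503, 504}


class RateLimited(Exception):
    """Hypothetical stand-in for a provider error carrying a 429."""


def send_with_retries(
    send: Callable[[], int],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry transient server errors; surface 429 to the caller untouched."""
    attempt = 0
    while True:
        status = send()
        if status == 429:
            # Not retried here: the throttle layer owns rate-limit handling.
            raise RateLimited("provider returned 429")
        if status in TRANSIENT_STATUSES and attempt < max_retries:
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1.0s, 2.0s, ...
            attempt += 1
            continue
        return status


# Simulated provider: two transient failures, then success.
responses = iter([503, 502, 200])
delays: list[float] = []
status = send_with_retries(lambda: next(responses), sleep=delays.append)
print(status, delays)  # 200 [0.5, 1.0]
```

Keeping 429 out of the transport retry loop means the layer that tracks concurrency sees every rate-limit signal, rather than having some of them absorbed by blind retries.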

Reviews (3): Last reviewed commit: "update example model name"

nabinchha added 4 commits March 24, 2026 17:37

add images

8002c6f

Merge branch 'main' into nmulepati/docs/native-model-client-dev-notes

ac8a1c4

re-ran slopguard

8536e2b

update dev notes

8fc7a0a

nabinchha requested a review from a team as a code owner March 25, 2026 17:48

Merge branch 'main' into nmulepati/docs/native-model-client-dev-notes

efc8e5e

greptile-apps bot reviewed Mar 25, 2026

docs/devnotes/posts/owning-the-model-stack.md (outdated, resolved)

docs/devnotes/posts/owning-the-model-stack.md (resolved)

nabinchha added 3 commits March 25, 2026 11:59

address greptile comments

4d841cb

Merge branch 'main' into nmulepati/docs/native-model-client-dev-notes

75bf989

update example model name

12549fb

docs: add native model client dev note #465
nabinchha wants to merge 8 commits into main from nmulepati/docs/native-model-client-dev-notes
