Skip to content

docs: update architecture-and-performance.md to reflect AIMD changes#467

Open
nabinchha wants to merge 2 commits intomainfrom
nmulepati/docs/466-architecture-and-performance-aimd
Open

docs: update architecture-and-performance.md to reflect AIMD changes#467
nabinchha wants to merge 2 commits intomainfrom
nmulepati/docs/466-architecture-and-performance-aimd

Conversation

@nabinchha
Copy link
Contributor

📋 Summary

Update the Architecture & Performance doc to account for AIMD adaptive concurrency control and the ColumnWiseDatasetBuilderDatasetBuilder rename.

Closes #466

🔄 Changes

🔧 Changed

  • "What Data Designer Does" — added bullet for automatic rate-limit adaptation
  • "What Data Designer Does NOT Do" — reworded "Rate limit" to "Impose rate limits" with clarification
  • Concurrency Formula — replaced static max_parallel_requests with AIMD-managed current_throttle_limit; explained decrease/increase behavior
  • max_parallel_requests section — reframed as a ceiling; updated guidance; added AIMD-aware tuning tip
  • Common Problems table — updated "Low throughput" row; added "Frequent 429 → recovery cycles" row
  • Tuning Workflow — added AIMD context and new step for checking throttle logs
  • Execution Model admonition — updated from "column-wise dataset generator" to DatasetBuilder (renamed in chore: rename ColumnWiseDatasetBuilder to DatasetBuilder #437)

✨ Added

  • New "Adaptive Throttling (RunConfig)" section documenting ThrottleConfig parameters
  • Sync engine caveats noting AIMD is fully active on the async engine path only

🤖 Generated with AI

Made with Cursor

…ncy control (#466)

Account for the AIMD throttle manager throughout the architecture doc:
concurrency formula, max_parallel_requests guidance, new ThrottleConfig
section, common problems table, and tuning workflow. Add sync-engine
caveats noting AIMD is fully active on the async path.

Made-with: Cursor
@nabinchha nabinchha requested a review from a team as a code owner March 25, 2026 19:36
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 25, 2026

Greptile Summary

This PR updates docs/concepts/architecture-and-performance.md to document the new AIMD adaptive concurrency controller and the ColumnWiseDatasetBuilderDatasetBuilder rename introduced in #437. The additions are technically accurate and well-structured.

Key changes reviewed:

  • AIMD concurrency control — new ThrottleConfig section with all five parameters (reduce_factor, additive_increase, success_window, cooldown_seconds, ceiling_overshoot). Verified against packages/data-designer-config/src/data_designer/config/run_config.py; all default values match exactly.
  • Sync vs async engine caveat — verified against http_model_client.py (strip_rate_limit_codes=False on sync path keeps 429 in the transport retry list; True on async path lets 429 propagate to the AIMD controller). The docs' claim that "AIMD is inactive on the sync path" is correct.
  • Concurrency Formula — replaced the static max_parallel_requests variable with current_throttle_limit (representing DomainThrottleState.current_limit), accurately reflecting runtime AIMD management.
  • DatasetBuilder rename — execution model admonition updated consistently.
  • One minor log-pattern suggestion: the tip section recommends watching for concurrency reduced from X → Y, but the ThrottleManager also emits concurrency reduced to X and concurrency recovered to N variants that users should be aware of.

Confidence Score: 5/5

  • Documentation-only PR; all technical claims verified against the implementation. Safe to merge.
  • All ThrottleConfig defaults and AIMD behavior descriptions are accurate based on code review of throttle_manager.py, run_config.py, http_model_client.py, and retry.py. The sync/async engine caveat is correctly justified by the strip_rate_limit_codes flag. The single P2 comment is a minor wording improvement that doesn't block merge.
  • No files require special attention.

Important Files Changed

Filename Overview
docs/concepts/architecture-and-performance.md Documentation updated to reflect AIMD concurrency control and DatasetBuilder rename. All ThrottleConfig defaults match the actual implementation; sync/async engine caveat is correct based on the strip_rate_limit_codes flag difference between the two paths. One minor log-pattern inaccuracy in the tip section.

Sequence Diagram

sequenceDiagram
    participant DD as DataDesigner
    participant TM as ThrottleManager (AIMD)
    participant HTTP as HTTP Transport
    participant LLM as Inference Server

    Note over DD,LLM: Async engine path — AIMD active

    DD->>TM: acquire_async() [slot request]
    TM-->>DD: slot granted (in_flight < current_limit)
    DD->>HTTP: POST /chat (via _apost)
    HTTP->>LLM: HTTP request
    LLM-->>HTTP: HTTP 429 (rate limited)
    Note over HTTP: strip_rate_limit_codes=True → 429 not retried by transport
    HTTP-->>DD: RateLimitError raised
    DD->>TM: release_rate_limited(retry_after)
    Note over TM: current_limit *= reduce_factor (0.75)<br/>blocked_until = now + cooldown_seconds
    TM-->>DD: limit reduced + cooldown applied

    loop After success_window consecutive successes
        DD->>TM: release_success()
        TM-->>DD: current_limit += additive_increase (up to ceiling)
    end

    Note over DD,LLM: Sync engine path — AIMD inactive

    DD->>TM: acquire_sync() [slot request]
    TM-->>DD: slot granted (fixed at max_parallel_requests)
    DD->>HTTP: POST /chat (via _post_sync)
    HTTP->>LLM: HTTP request
    LLM-->>HTTP: HTTP 429
    Note over HTTP: strip_rate_limit_codes=False → transport retries 429 internally
    HTTP-->>DD: success response (429 hidden)
    DD->>TM: release_success()
    Note over TM: limit unchanged — AIMD never triggered
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: docs/concepts/architecture-and-performance.md
Line: 236

Comment:
**Log pattern misses "reduced to" variant and recovery messages**

The suggested log pattern `concurrency reduced from X → Y` will not match the second reduction log emitted in `throttle_manager.py` when the rate-limit ceiling is first recorded:

```
🪫📉 '<model>' [<domain>] server rate-limited at <N> (server limit ~<M>) — concurrency reduced to <K> (retrying in Zs)
```

That message uses "reduced **to**", not "reduced **from**". Similarly, the recovery path emits `concurrency recovered to N` and `concurrency fully recovered (N)` — both of which look like increases but won't match `concurrency increased from X → Y`.

Broadening the recommended search patterns slightly will help users catch all throttle activity:

```suggestion
    You can observe this in the logs — look for messages containing `concurrency reduced` (decrease events) and `concurrency increased` or `concurrency recovered` (ramp-up events).
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "updates" | Re-trigger Greptile

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Docs: update architecture-and-performance.md to reflect AIMD changes

1 participant