Skip to content

fix: respect max_parallel_requests in HTTP connection pool size#460

Open
przemekboruta wants to merge 3 commits intoNVIDIA-NeMo:mainfrom
przemekboruta:fix/connection-pool-max-parallel-requests
Open

fix: respect max_parallel_requests in HTTP connection pool size#460
przemekboruta wants to merge 3 commits intoNVIDIA-NeMo:mainfrom
przemekboruta:fix/connection-pool-max-parallel-requests

Conversation

@przemekboruta
Copy link

@przemekboruta przemekboruta commented Mar 25, 2026

Summary

Fixes #459

  • Pass a pre-configured httpx.HTTPTransport / httpx.AsyncHTTPTransport with the correct limits into RetryTransport instead of letting it create its own pool with httpx defaults (100 connections)
  • Add transport parameter to create_retry_transport and forward it to RetryTransport
  • Remove the now-ineffective limits= argument from httpx.Client / httpx.AsyncClient constructors (httpx silently ignores limits when a custom transport is provided)

Root cause

self._limits was calculated correctly from max_parallel_requests but passed to httpx.Client(limits=...), which silently ignores it when a custom transport= is provided. RetryTransport was then creating its own internal HTTPTransport with httpx defaults (max_connections=100), regardless of the configured value.

Test plan

  • test_create_retry_transport_forwards_sync_transport — provided HTTPTransport is stored as _sync_transport
  • test_create_retry_transport_forwards_async_transport — provided AsyncHTTPTransport is stored as _async_transport
  • test_create_retry_transport_no_transport_creates_defaults — default behaviour unchanged when no transport is passed
  • test_sync_client_pool_size_respects_max_parallel_requests — regression: pool has 600 connections for max_parallel_requests=300
  • test_async_client_pool_size_respects_max_parallel_requests — same for async client

🤖 Generated with Claude Code

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 25, 2026

Greptile Summary

This PR fixes a connection pool sizing bug (#459) where max_parallel_requests was silently ignored. The root cause: httpx.Client(limits=...) discards the limits argument when a custom transport= is provided, so RetryTransport was always creating its own pool with httpx defaults (100 connections) instead of the configured value.

The fix is correct and minimal — create httpx.HTTPTransport / httpx.AsyncHTTPTransport with the right Limits first, then pass that pre-configured transport into RetryTransport, bypassing the ignored limits= path entirely. A new limits property on HttpModelClient is a nice addition for observability and testing.

Key points:

  • The production-code change in http_model_client.py and retry.py is sound and addresses the root cause cleanly.
  • The single regression test added (test_client_limits_respect_max_parallel_requests) only asserts client.limits.max_connections == 600, which checks the _limits attribute — a value that was computed correctly before the fix too. It does not verify that the limits actually flow to the transport's httpcore connection pool, so it would pass on the unpatched code as well.
  • The PR description lists five test-plan items (all unchecked), none of which are present in the diff. The missing pool-depth assertions (_sync_transport._pool._max_connections) are the ones that would give real regression protection.

Confidence Score: 4/5

  • Safe to merge — the production fix is correct; one targeted follow-up remains to add a genuine pool-depth regression test.
  • The core change correctly solves the documented root cause and is well-scoped. The only gap is that the regression test verifies an intermediate value (_limits) rather than the actual pool depth, meaning the bug could silently regress. Adding a test that walks _transport._sync_transport._pool._max_connections (and the async equivalent) would bring this to a 5.
  • packages/data-designer-engine/tests/engine/models/clients/test_native_http_clients.py — the regression test needs to assert on the transport pool's actual _max_connections, not just the limits property.

Important Files Changed

Filename Overview
packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/http_model_client.py Core fix: creates HTTPTransport/AsyncHTTPTransport with explicit limits before wrapping in RetryTransport; exposes a limits property for testing. Logic is sound.
packages/data-designer-engine/src/data_designer/engine/models/clients/retry.py Adds transport kwarg to create_retry_transport and forwards it to RetryTransport; uses TYPE_CHECKING guard to avoid a heavy import at runtime. Clean, minimal change.
packages/data-designer-engine/tests/engine/models/clients/test_native_http_clients.py Adds one regression test that only checks the _limits property (correct before and after the fix), not that limits flow to the transport pool. The 4 additional tests described in the PR test plan are absent.

Sequence Diagram

sequenceDiagram
    participant App
    participant HttpModelClient
    participant HTTPTransport as httpx.HTTPTransport(limits)
    participant RetryTransport
    participant Pool as httpcore.ConnectionPool

    Note over App,Pool: Before fix — limits silently ignored
    App->>HttpModelClient: _get_sync_client()
    HttpModelClient->>RetryTransport: RetryTransport(retry=...)
    RetryTransport->>Pool: ConnectionPool(max_connections=100) ← default!
    HttpModelClient->>HttpModelClient: httpx.Client(transport=RT, limits=Limits(600)) ← ignored

    Note over App,Pool: After fix — limits flow to pool
    App->>HttpModelClient: _get_sync_client()
    HttpModelClient->>HTTPTransport: HTTPTransport(limits=Limits(600))
    HTTPTransport->>Pool: ConnectionPool(max_connections=600) ✓
    HttpModelClient->>RetryTransport: RetryTransport(transport=HTTPTransport, retry=...)
    HttpModelClient->>HttpModelClient: httpx.Client(transport=RetryTransport)
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: packages/data-designer-engine/tests/engine/models/clients/test_native_http_clients.py
Line: 299-311

Comment:
**Regression test doesn't verify the actual fix**

`client.limits.max_connections` reads `self._limits` — a value that was computed correctly even *before* this PR. The bug was that `_limits` was never forwarded to the underlying `httpcore` connection pool. This test passes on the old (broken) code just as easily as on the new code, so it provides no regression protection.

To actually verify the fix, the test needs to trigger lazy transport initialization and then inspect the pool's `_max_connections`. The PR description lists exactly these missing tests (all four checkboxes are unchecked):
- `test_sync_client_pool_size_respects_max_parallel_requests`
- `test_async_client_pool_size_respects_max_parallel_requests`
- `test_create_retry_transport_forwards_sync_transport`
- `test_create_retry_transport_forwards_async_transport`

The existing test is fine as a unit-level sanity check for the `limits` property, but it should not be the sole regression guard for this bug.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (3): Last reviewed commit: "refactor: address review feedback — publ..." | Re-trigger Greptile

przemekboruta added a commit to przemekboruta/DataDesigner that referenced this pull request Mar 25, 2026
Address Greptile P2 review comments on PR NVIDIA-NeMo#460:
- Add docstring entry for the new `transport` parameter in
  `create_retry_transport` explaining accepted types and None default
- Add inline comments in pool-size regression tests explaining the
  private attribute chain (_sync/_async_transport → _pool → _max_connections)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@przemekboruta przemekboruta requested a review from a team as a code owner March 25, 2026 15:23
Copy link
Contributor

@nabinchha nabinchha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @przemekboruta — great catch, and thanks for opening both the issue and the PR! Please see findings below that need to be addressed before we can merge.

Critical — Must fix before merge

test_retry.py:87-109 — Tests reach into third-party private attributes; import httpx inside function bodies

  • What: The three new tests (test_create_retry_transport_forwards_sync_transport, test_create_retry_transport_forwards_async_transport, test_create_retry_transport_no_transport_creates_defaults) assert on _sync_transport and _async_transport — private attributes of RetryTransport, a third-party class from httpx_retries. They also import httpx inside function bodies, violating the project's module-level import standard.
  • Why: Tests should exercise public interfaces, not private internals of third-party libraries (AGENTS.md: "Tests should exercise public interfaces, not _-prefixed functions or classes"). These tests verify a one-line pass-through (RetryTransport(transport=transport, retry=retry)) — pure plumbing. The forwarding contract is already proven end-to-end by the pool-size regression tests in test_native_http_clients.py.
  • Suggestion: Drop all three tests. They only verify plumbing via private attributes of a third-party library. The pool-size regression tests (once reworked per the next finding) already cover the end-to-end contract.

test_native_http_clients.py:299-337 — Tests rely on private methods and 4-layer private attribute chains

  • What: The regression tests call _get_sync_client() / _get_async_client() (private), then walk _transport._sync_transport._pool._max_connections — four private attributes spanning three libraries (data_designer, httpx_retries, httpcore).
  • Why: Any internal rename in httpx_retries or httpcore silently breaks these tests without indicating a real regression. Tests should exercise public interfaces, not reach into private internals.
  • Suggestion: Three changes:
    1. Expose limits as a public read-only property on HttpModelClient — it's already computed and stored, just behind a _ prefix. This is useful beyond tests (logging, diagnostics).
    2. Test the contract, not the internals. The regression test should verify that HttpModelClient computes the right limits and passes them to the inner transport:
      • client.limits.max_connections == 600 (computation is correct)
      • Transport forwarding is already covered by the RetryTransport(transport=transport, ...) call — no separate assertion needed.
    3. Don't call private init methods from tests. If the test needs to trigger lazy initialization, test through the public completion() / acompletion() methods with a mocked httpx client (the existing lazy-init tests already do this).

What Looks Good

  • Root cause analysis is precise. The three-layer problem (limits computed correctly, passed to wrong place, RetryTransport creates default pool) is correctly identified and the fix targets the right layer.
  • The fix is minimal and surgical. Works with httpx_retries's existing transport parameter rather than fighting the library. Removing the dead limits= kwarg is the right cleanup.
  • The production code changes are clean. Both retry.py and http_model_client.py changes are correct and well-structured.

Verdict

Needs changes — The core fix is correct and well-understood. The test changes need rework: (1) drop the three transport-forwarding tests in test_retry.py that reach into third-party private attributes, and (2) rework the pool-size regression tests in test_native_http_clients.py to assert on a public limits property rather than walking private internals across three libraries.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 25, 2026

All contributors have signed the DCO ✍️ ✅
Posted by the DCO Assistant Lite bot.

@przemekboruta
Copy link
Author

I have read the DCO document and I hereby sign the DCO.

przemekboruta and others added 3 commits March 25, 2026 21:18
Pass a pre-configured HTTPTransport/AsyncHTTPTransport with the correct
limits into RetryTransport instead of letting it create its own pool
with httpx defaults (100 connections). Previously, the limits calculated
from max_parallel_requests were passed to httpx.Client(limits=...) which
silently ignores them when a custom transport is provided.

Fixes NVIDIA-NeMo#459

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Przemysław <przemekboruta@interia.pl>
Address Greptile P2 review comments on PR NVIDIA-NeMo#460:
- Add docstring entry for the new `transport` parameter in
  `create_retry_transport` explaining accepted types and None default
- Add inline comments in pool-size regression tests explaining the
  private attribute chain (_sync/_async_transport → _pool → _max_connections)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Przemysław <przemekboruta@interia.pl>
- Drop three tests from test_retry.py that reached into private
  attributes of third-party RetryTransport (_sync_transport,
  _async_transport); the end-to-end contract is covered by the
  pool-size regression test
- Expose a public `limits` property on HttpModelClient so tests and
  diagnostic code can assert the pool configuration without walking
  private attribute chains across three libraries
- Replace two private-chain pool assertions with a single
  `client.limits.max_connections == 600` check against the new property
- Trim "inner" from the transport docstring entry (nabinchha suggestion)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Przemysław <przemekboruta@interia.pl>
@przemekboruta przemekboruta force-pushed the fix/connection-pool-max-parallel-requests branch from de90ceb to 088e037 Compare March 25, 2026 20:18
@przemekboruta przemekboruta requested a review from nabinchha March 25, 2026 20:19
Comment on lines +299 to +311
def test_client_limits_respect_max_parallel_requests() -> None:
"""Connection pool limits must reflect max_parallel_requests (regression for issue #459).

pool_max = max(32, 2 * max_parallel_requests) = max(32, 600) = 600
"""
client = OpenAICompatibleClient(
provider_name=_OPENAI_PROVIDER,
endpoint=_OPENAI_ENDPOINT,
api_key="sk-test",
max_parallel_requests=300,
concurrency_mode=ClientConcurrencyMode.SYNC,
)
assert client.limits.max_connections == 600
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Regression test doesn't verify the actual fix

client.limits.max_connections reads self._limits — a value that was computed correctly even before this PR. The bug was that _limits was never forwarded to the underlying httpcore connection pool. This test passes on the old (broken) code just as easily as on the new code, so it provides no regression protection.

To actually verify the fix, the test needs to trigger lazy transport initialization and then inspect the pool's _max_connections. The PR description lists exactly these missing tests (all four checkboxes are unchecked):

  • test_sync_client_pool_size_respects_max_parallel_requests
  • test_async_client_pool_size_respects_max_parallel_requests
  • test_create_retry_transport_forwards_sync_transport
  • test_create_retry_transport_forwards_async_transport

The existing test is fine as a unit-level sanity check for the limits property, but it should not be the sole regression guard for this bug.

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/data-designer-engine/tests/engine/models/clients/test_native_http_clients.py
Line: 299-311

Comment:
**Regression test doesn't verify the actual fix**

`client.limits.max_connections` reads `self._limits` — a value that was computed correctly even *before* this PR. The bug was that `_limits` was never forwarded to the underlying `httpcore` connection pool. This test passes on the old (broken) code just as easily as on the new code, so it provides no regression protection.

To actually verify the fix, the test needs to trigger lazy transport initialization and then inspect the pool's `_max_connections`. The PR description lists exactly these missing tests (all four checkboxes are unchecked):
- `test_sync_client_pool_size_respects_max_parallel_requests`
- `test_async_client_pool_size_respects_max_parallel_requests`
- `test_create_retry_transport_forwards_sync_transport`
- `test_create_retry_transport_forwards_async_transport`

The existing test is fine as a unit-level sanity check for the `limits` property, but it should not be the sole regression guard for this bug.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

max_parallel_requests has no effect on actual HTTP connection pool size

2 participants