fix(wren-ai-service): expose batch_size in LitellmEmbedderProvider to cap embedding API call size#2201

Open
octo-patch wants to merge 3 commits into Canner:main from octo-patch:fix/issue-2031-litellm-embedder-batch-size

Conversation


octo-patch commented Apr 25, 2026

Fixes #2031

Problem

Users whose embedding providers enforce a maximum batch size (e.g. max 10 documents per API call) had no way to configure the embedding API batch size from config.yaml.

The internal AsyncDocumentEmbedder.batch_size (which controls how many texts are sent in each call to aembedding()) defaulted to 32 and was not exposed by LitellmEmbedderProvider. Setting column_indexing_batch_size in the settings section has a different effect — it controls how many columns are grouped into a single DDL schema chunk, not how many documents are batched for the embedding API call.
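
For contrast, the setting users were reaching for lives in the settings section and only affects chunk granularity; the value below is illustrative, not a recommendation:

settings:
  column_indexing_batch_size: 50   # groups columns into a single DDL schema chunk; does not cap embedding API calls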

As a result, users received 400 errors like:

litellm.llms.openai.common_utils.OpenAIError: Error code: 400 - batch size is invalid, it should not be larger than 10

Solution

Add batch_size as an explicit named parameter to LitellmEmbedderProvider.__init__() (defaulting to 32) and forward it explicitly to AsyncDocumentEmbedder in get_document_embedder().
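
In sketch form, the change looks roughly like this (a minimal sketch: model and timeout stand in for the provider's longer real parameter list, and AsyncDocumentEmbedder is the project's existing embedder component; only the batch_size plumbing is the point):

class LitellmEmbedderProvider:
    def __init__(
        self,
        model: str,
        timeout: float = 120.0,
        batch_size: int = 32,  # documents sent per embedding API call
        **kwargs,
    ):
        self._model = model
        self._timeout = timeout
        self._batch_size = batch_size

    def get_document_embedder(self):
        # forward the configured cap to the component that actually calls aembedding()
        return AsyncDocumentEmbedder(
            model=self._model,
            timeout=self._timeout,
            batch_size=self._batch_size,
        )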

Users can now set it per-model in the embedder section of config.yaml:

type: embedder
provider: litellm_embedder
models:
  - model: openai/text-embedding-v4
    alias: default
    batch_size: 10   # limit API call size for providers with batch restrictions

Also added a commented example in config.qwen3.yaml to document the option.

Testing

  • Existing tests pass unchanged (the default behavior is preserved).
  • The new parameter propagates through the provider → document embedder chain, capping each aembedding() call to at most batch_size documents.
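
Conceptually the cap amounts to the loop below. This is a simplified sketch, not the actual AsyncDocumentEmbedder code, and the response handling assumes litellm's OpenAI-compatible embedding output:

from litellm import aembedding

async def embed_in_batches(texts: list[str], model: str, batch_size: int) -> list[list[float]]:
    embeddings: list[list[float]] = []
    # never send more than batch_size documents in a single aembedding() call
    for start in range(0, len(texts), batch_size):
        batch = texts[start : start + batch_size]
        response = await aembedding(model=model, input=batch)
        embeddings.extend(item["embedding"] for item in response.data)
    return embeddings

With batch_size: 10 from the config above, a schema producing 37 document chunks would be embedded in four calls of 10, 10, 10, and 7 documents.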

Summary by CodeRabbit

Release Notes

  • New Features

    • Embedder providers now support a configurable batch size parameter for controlling per-call document batching, useful when embedding providers enforce batch size limits.
  • Documentation

    • Updated embedder configuration examples with documentation for the optional batch size setting and guidance on adjusting it per your provider's requirements.

octo-patch and others added 3 commits April 19, 2026 12:44
…to embedding node

When Hamilton's AsyncDriver executes the indexing DAG, it wraps async
nodes in asyncio Tasks. Under complex MDL schemas with many relationships,
the async chunk node's Task was being passed unawaited to the downstream
embedding node instead of the actual dict result, causing the embedder to
receive an asyncio Task repr string rather than the document chunks.

This makes DDLChunker.run() and its helpers synchronous, matching the
pattern used by all other indexing pipelines (historical_question,
table_description, project_meta). The async machinery in
_model_preprocessor was unnecessary since MODEL_PREPROCESSORS is empty
by default and all helper operations are CPU-bound string manipulations.

Update tests to call chunker.run() synchronously accordingly.

Fixes Canner#2138
DDLChunker.run() is now synchronous, so the chunker test cases no longer
need pytest.mark.asyncio or async def. Only test_pipeline_run keeps
async because it still awaits DBSchema.run.
… cap embedding API calls (fixes Canner#2031)

The document embedder's batch_size (how many texts are sent per embedding
API call) was hardcoded to 32 inside AsyncDocumentEmbedder and not
reachable from config.yaml.  Users with embedding providers that enforce
a lower batch limit (e.g. max 10) had no way to reduce it — setting
column_indexing_batch_size in the settings section only controls DDL
chunking, not the embedding API batch size.

Add batch_size as an explicit, named parameter to
LitellmEmbedderProvider and forward it to get_document_embedder().
Users can now set it per-model in the embedder section of config.yaml:

  type: embedder
  provider: litellm_embedder
  models:
    - model: openai/text-embedding-v4
      alias: default
      batch_size: 10   # cap API call size for providers with batch limits

Also document the option with a commented example in config.qwen3.yaml.

Co-Authored-By: Octopus <liyuan851277048@icloud.com>

coderabbitai Bot commented Apr 25, 2026

Walkthrough

The PR addresses a report that column_indexing_batch_size was not taking effect: it converts the DDL chunking pipeline from async to synchronous execution and adds configurable batch_size support to LitellmEmbedderProvider.

Changes

  • Configuration Documentation: wren-ai-service/docs/config_examples/config.qwen3.yaml
    Added commented documentation for the optional batch_size setting under the default embedding model, clarifying that it controls per-call document batching.
  • Async-to-Sync Pipeline Conversion: wren-ai-service/src/pipelines/indexing/db_schema.py
    Converted DDL chunking from async to synchronous: chunk, DDLChunker.run, DDLChunker._model_preprocessor, and DDLChunker._get_ddl_commands now execute synchronously; removed asyncio imports and concurrency patterns.
  • Embedder Provider Enhancement: wren-ai-service/src/providers/embedder/litellm.py
    Added a configurable batch_size parameter (default 32) to LitellmEmbedderProvider.__init__ and forwarded it to AsyncDocumentEmbedder for batch control.
  • Test Updates: wren-ai-service/tests/pytest/pipelines/indexing/test_db_schema.py
    Removed async decorators and await keywords from all DDLChunker unit tests to match the synchronous function signatures; kept assertions and the pipeline test unchanged.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • WrenAI#1814: Both PRs add and propagate a batch_size parameter for document embedding—the referenced PR updates AsyncDocumentEmbedder to accept and use batch_size, while this PR wires the configuration option into LitellmEmbedderProvider that is passed downstream.

Suggested labels

wren-ai-service, module/ai-service, ci/ai-service

Suggested reviewers

  • yichieh-lu
  • paopa

Poem

🐰 From async chains we break free,
Sync DDL flows, now light and easy!
Batch sizes configured, at last controlled,
The embedding story finally told.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (4 passed)

  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately describes the main change: exposing batch_size in LitellmEmbedderProvider to control embedding API call sizes.
  • Linked Issues Check: ✅ Passed. The PR addresses issue #2031 by adding a batch_size parameter to LitellmEmbedderProvider and forwarding it to AsyncDocumentEmbedder, allowing embedding API call batch sizes to respect provider limits.
  • Out of Scope Changes Check: ✅ Passed. The PR also converts DDLChunker from async to synchronous (fixing #2138) and updates tests; these changes are related to the indexing pipeline where batch_size is applied.




coderabbitai Bot left a comment


🧹 Nitpick comments (1)
wren-ai-service/src/providers/embedder/litellm.py (1)

175-175: Optional: consider guarding against non-positive batch_size values.

If a user mis-configures batch_size: 0 (or a negative value) in config.yaml, _embed_batch will raise ValueError: range() arg 3 must not be zero or produce an empty batches list at call time, rather than failing fast at provider initialization. A small validation here would surface the misconfiguration early with a clearer message.

🛡️ Proposed validation
         timeout: float = 120.0,
         batch_size: int = 32,  # number of documents sent per embedding API call
         **kwargs,
     ):
+        if batch_size <= 0:
+            raise ValueError(
+                f"batch_size must be a positive integer, got {batch_size}"
+            )
         self._api_key = os.getenv(api_key_name) if api_key_name else None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wren-ai-service/src/providers/embedder/litellm.py` at line 175, Validate the
batch_size parameter at provider initialization (e.g., in the Litellm embedder's
__init__ or factory) to ensure batch_size is a positive integer; if batch_size
<= 0 raise a clear ValueError explaining the misconfiguration (mention
config.yaml) so the issue surfaces at startup instead of inside _embed_batch
where range() or empty batches would fail silently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2d933de8-c44b-4354-a147-41759106deb5

📥 Commits

Reviewing files that changed from the base of the PR and between 5dd32e7 and 6f57f00.

📒 Files selected for processing (4)
  • wren-ai-service/docs/config_examples/config.qwen3.yaml
  • wren-ai-service/src/pipelines/indexing/db_schema.py
  • wren-ai-service/src/providers/embedder/litellm.py
  • wren-ai-service/tests/pytest/pipelines/indexing/test_db_schema.py
