fix(wren-ai-service): expose batch_size in LitellmEmbedderProvider to cap embedding API call size#2201

Open
octo-patch wants to merge 3 commits into Canner:main from octo-patch:fix/issue-2031-litellm-embedder-batch-size

Conversation


octo-patch commented Apr 25, 2026

Fixes #2031

Problem

Users whose embedding providers enforce a maximum batch size (e.g. max 10 documents per API call) had no way to configure the embedding API batch size from config.yaml.

The internal AsyncDocumentEmbedder.batch_size (which controls how many texts are sent in each call to aembedding()) defaulted to 32 and was not exposed by LitellmEmbedderProvider. Setting column_indexing_batch_size in the settings section has a different effect — it controls how many columns are grouped into a single DDL schema chunk, not how many documents are batched for the embedding API call.
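
For contrast, the setting users were reaching for lives in the settings section and only affects chunk granularity; the value below is illustrative, not a recommendation:

settings:
  column_indexing_batch_size: 50   # groups columns into a single DDL schema chunk; does not cap embedding API calls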

As a result, users received 400 errors like:

litellm.llms.openai.common_utils.OpenAIError: Error code: 400 - batch size is invalid, it should not be larger than 10

Solution

Add batch_size as an explicit named parameter to LitellmEmbedderProvider.__init__() (defaulting to 32) and forward it explicitly to AsyncDocumentEmbedder in get_document_embedder().
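
In sketch form, the change looks roughly like this (a minimal sketch: model and timeout stand in for the provider's longer real parameter list, and AsyncDocumentEmbedder is the project's existing embedder component; only the batch_size plumbing is the point):

class LitellmEmbedderProvider:
    def __init__(
        self,
        model: str,
        timeout: float = 120.0,
        batch_size: int = 32,  # documents sent per embedding API call
        **kwargs,
    ):
        self._model = model
        self._timeout = timeout
        self._batch_size = batch_size

    def get_document_embedder(self):
        # forward the configured cap to the component that actually calls aembedding()
        return AsyncDocumentEmbedder(
            model=self._model,
            timeout=self._timeout,
            batch_size=self._batch_size,
        )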

Users can now set it per-model in the embedder section of config.yaml:

type: embedder
provider: litellm_embedder
models:
  - model: openai/text-embedding-v4
    alias: default
    batch_size: 10   # limit API call size for providers with batch restrictions

Also added a commented example in config.qwen3.yaml to document the option.

Testing

  • Existing tests pass unchanged (the default behavior is preserved).
  • The new parameter propagates through the provider → document embedder chain, capping each aembedding() call to at most batch_size documents.
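
Conceptually the cap amounts to the loop below. This is a simplified sketch, not the actual AsyncDocumentEmbedder code, and the response handling assumes litellm's OpenAI-compatible embedding output:

from litellm import aembedding

async def embed_in_batches(texts: list[str], model: str, batch_size: int) -> list[list[float]]:
    embeddings: list[list[float]] = []
    # never send more than batch_size documents in a single aembedding() call
    for start in range(0, len(texts), batch_size):
        batch = texts[start : start + batch_size]
        response = await aembedding(model=model, input=batch)
        embeddings.extend(item["embedding"] for item in response.data)
    return embeddings

With batch_size: 10 from the config above, a schema producing 37 document chunks would be embedded in four calls of 10, 10, 10, and 7 documents.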

Summary by CodeRabbit

Release Notes

  • New Features

    • Embedder providers now support a configurable batch size parameter for controlling per-call document batching, useful when embedding providers enforce batch size limits.
  • Documentation

    • Updated embedder configuration examples with documentation for the optional batch size setting and guidance on adjusting it per your provider's requirements.

octo-patch and others added 3 commits April 19, 2026 12:44
…to embedding node

When Hamilton's AsyncDriver executes the indexing DAG, it wraps async
nodes in asyncio Tasks. Under complex MDL schemas with many relationships,
the async chunk node's Task was being passed unawaited to the downstream
embedding node instead of the actual dict result, causing the embedder to
receive an asyncio Task repr string rather than the document chunks.

This makes DDLChunker.run() and its helpers synchronous, matching the
pattern used by all other indexing pipelines (historical_question,
table_description, project_meta). The async machinery in
_model_preprocessor was unnecessary since MODEL_PREPROCESSORS is empty
by default and all helper operations are CPU-bound string manipulations.

Update tests to call chunker.run() synchronously accordingly.

Fixes Canner#2138
DDLChunker.run() is now synchronous, so the chunker test cases no longer
need pytest.mark.asyncio or async def. Only test_pipeline_run keeps
async because it still awaits DBSchema.run.
… cap embedding API calls (fixes Canner#2031)

The document embedder's batch_size (how many texts are sent per embedding
API call) was hardcoded to 32 inside AsyncDocumentEmbedder and not
reachable from config.yaml.  Users with embedding providers that enforce
a lower batch limit (e.g. max 10) had no way to reduce it — setting
column_indexing_batch_size in the settings section only controls DDL
chunking, not the embedding API batch size.

Add batch_size as an explicit, named parameter to
LitellmEmbedderProvider and forward it to get_document_embedder().
Users can now set it per-model in the embedder section of config.yaml:

  type: embedder
  provider: litellm_embedder
  models:
    - model: openai/text-embedding-v4
      alias: default
      batch_size: 10   # cap API call size for providers with batch limits

Also document the option with a commented example in config.qwen3.yaml.

Co-Authored-By: Octopus <liyuan851277048@icloud.com>

coderabbitai Bot commented Apr 25, 2026

Walkthrough

The PR addresses a report that column_indexing_batch_size was not taking effect: it converts the DDL chunking pipeline from async to synchronous execution and adds configurable batch_size support to LitellmEmbedderProvider.

Changes

  • Configuration Documentation: wren-ai-service/docs/config_examples/config.qwen3.yaml
    Added commented documentation for the optional batch_size setting under the default embedding model, clarifying that it controls per-call document batching.
  • Async-to-Sync Pipeline Conversion: wren-ai-service/src/pipelines/indexing/db_schema.py
    Converted DDL chunking from async to synchronous: chunk, DDLChunker.run, DDLChunker._model_preprocessor, and DDLChunker._get_ddl_commands now execute synchronously; removed asyncio imports and concurrency patterns.
  • Embedder Provider Enhancement: wren-ai-service/src/providers/embedder/litellm.py
    Added a configurable batch_size parameter (default 32) to LitellmEmbedderProvider.__init__ and forwarded it to AsyncDocumentEmbedder for batch control.
  • Test Updates: wren-ai-service/tests/pytest/pipelines/indexing/test_db_schema.py
    Removed async decorators and await keywords from all DDLChunker unit tests to match the synchronous function signatures; kept assertions and the pipeline test unchanged.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • WrenAI#1814: Both PRs add and propagate a batch_size parameter for document embedding—the referenced PR updates AsyncDocumentEmbedder to accept and use batch_size, while this PR wires the configuration option into LitellmEmbedderProvider that is passed downstream.

Suggested labels

wren-ai-service, module/ai-service, ci/ai-service

Suggested reviewers

  • yichieh-lu
  • paopa

Poem

🐰 From async chains we break free,
Sync DDL flows, now light and easy!
Batch sizes configured, at last controlled,
The embedding story finally told.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (4 passed)

  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately describes the main change: exposing batch_size in LitellmEmbedderProvider to control embedding API call sizes.
  • Linked Issues Check: ✅ Passed. The PR addresses issue #2031 by adding a batch_size parameter to LitellmEmbedderProvider and forwarding it to AsyncDocumentEmbedder, allowing embedding API call batch sizes to respect provider limits.
  • Out of Scope Changes Check: ✅ Passed. The PR also converts DDLChunker from async to synchronous (fixing #2138) and updates tests; these changes are related to the indexing pipeline where batch_size is applied.




coderabbitai Bot left a comment


🧹 Nitpick comments (1)
wren-ai-service/src/providers/embedder/litellm.py (1)

175-175: Optional: consider guarding against non-positive batch_size values.

If a user mis-configures batch_size: 0 (or a negative value) in config.yaml, _embed_batch will raise ValueError: range() arg 3 must not be zero or produce an empty batches list at call time, rather than failing fast at provider initialization. A small validation here would surface the misconfiguration early with a clearer message.

🛡️ Proposed validation
         timeout: float = 120.0,
         batch_size: int = 32,  # number of documents sent per embedding API call
         **kwargs,
     ):
+        if batch_size <= 0:
+            raise ValueError(
+                f"batch_size must be a positive integer, got {batch_size}"
+            )
         self._api_key = os.getenv(api_key_name) if api_key_name else None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wren-ai-service/src/providers/embedder/litellm.py` at line 175, Validate the
batch_size parameter at provider initialization (e.g., in the Litellm embedder's
__init__ or factory) to ensure batch_size is a positive integer; if batch_size
<= 0 raise a clear ValueError explaining the misconfiguration (mention
config.yaml) so the issue surfaces at startup instead of inside _embed_batch
where range() or empty batches would fail silently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2d933de8-c44b-4354-a147-41759106deb5

📥 Commits

Reviewing files that changed from the base of the PR and between 5dd32e7 and 6f57f00.

📒 Files selected for processing (4)
  • wren-ai-service/docs/config_examples/config.qwen3.yaml
  • wren-ai-service/src/pipelines/indexing/db_schema.py
  • wren-ai-service/src/providers/embedder/litellm.py
  • wren-ai-service/tests/pytest/pipelines/indexing/test_db_schema.py
