feat: support for OpenAI Batch API #487
Description
Priority Level
Medium (Nice to have)
Is your feature request related to a problem? Please describe.
Data Designer currently makes all LLM calls via the synchronous Chat Completions API (/v1/chat/completions). When generating large-scale synthetic datasets through OpenAI, this means paying full price for every request — even though the workload is inherently offline and latency-insensitive.
OpenAI's Batch API (/v1/batches) offers 50% cost reduction for asynchronous workloads with a 24-hour turnaround, which is a natural fit for synthetic data generation.
Describe the solution you'd like
Add an optional use_batch_api: bool flag (or a new provider type / execution mode) that routes OpenAI requests through the Batch API instead of the real-time endpoint. A possible implementation could:
- Collect all pending requests for a column into a JSONL file
- Submit the batch via POST /v1/batches
- Poll for completion (or use a callback mechanism)
- Parse the results back into the column pipeline
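The collection step could be sketched roughly as below. This is a minimal illustration, not a proposed implementation: build_batch_jsonl and the row-{i} custom_id scheme are hypothetical names, and the submit/poll/parse steps are only indicated in comments since they need a live client.

```python
import json


def build_batch_jsonl(prompts, model="gpt-4o-mini"):
    """Serialize one Batch API input line per pending cell.

    Each line follows OpenAI's documented batch input format:
    a unique custom_id plus the method/url/body of an ordinary
    Chat Completions request.
    """
    lines = []
    for i, prompt in enumerate(prompts):
        lines.append(json.dumps({
            "custom_id": f"row-{i}",  # hypothetical id scheme for mapping results back
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }))
    return "\n".join(lines)


# Submission and polling would then use the OpenAI Python SDK, e.g.:
#   file = client.files.create(file=..., purpose="batch")
#   batch = client.batches.create(input_file_id=file.id,
#                                 endpoint="/v1/chat/completions",
#                                 completion_window="24h")
#   ...poll client.batches.retrieve(batch.id) until status == "completed",
#   then download the output file and match results by custom_id.
```

The custom_id is the key piece: batch results may come back in any order, so the pipeline would need a stable mapping from each result line back to its column cell.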
This could live alongside the existing max_parallel_requests concurrency model — users would choose between low-latency real-time generation and cost-optimized batch generation.
Describe alternatives you've considered
Why This Matters
- Cost: 50% savings on OpenAI API costs at scale
- Rate limits: Batch API has a separate, more generous rate limit quota
- Use case fit: SDG workloads are offline by nature — there's no need for real-time responses
Additional context
- OpenAI Batch API docs: https://platform.openai.com/docs/guides/batch
- Current architecture processes datasets in batches (buffer_size) with parallel cells (max_parallel_requests), so integrating a batch submission step per column per buffer could align well with the existing execution model
- Anthropic also offers a similar Message Batches API, so this pattern could generalize to other providers
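To make the per-buffer alignment concrete, grouping a column's pending prompts by buffer_size could look like the sketch below; plan_batches is a hypothetical helper, and each resulting group would correspond to one batch submission.

```python
def plan_batches(cell_prompts, buffer_size):
    """Split a column's pending prompts into buffer-sized groups.

    Under the assumption above, each group maps to a single
    Batch API submission for that column and buffer.
    """
    return [
        cell_prompts[i:i + buffer_size]
        for i in range(0, len(cell_prompts), buffer_size)
    ]
```

For example, 5 pending cells with buffer_size=2 would yield three submissions of sizes 2, 2, and 1.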