Parallelism not working with vLLM — Processing multiple files takes linearly longer #43

Even after integrating vLLM, parallelism does not seem to work as expected: files are still processed sequentially rather than in parallel.

For example, if processing one file takes X time, then processing 10 files takes roughly 10× that time, which defeats the expected parallel performance gains from vLLM.

I'm looking for faster execution across multiple files. Is there a recommended configuration or optimization that I might be missing? If vLLM does not support this use case directly, please suggest an alternative approach.
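One thing worth checking is whether the calling code sends one file at a time and waits for each result before sending the next. vLLM batches requests that are in flight at the same time, so a client loop that awaits each file in turn would scale roughly linearly regardless of the server. Below is a minimal sketch of submitting files concurrently with a bounded number of in-flight requests. How this project actually calls vLLM is not stated above, so `process_file` is a hypothetical stand-in that only sleeps to simulate a request; in a real setup it would be replaced by an async call to the inference server. The names `process_all`, `run_sequential`, `run_concurrent`, and `max_concurrency` are also assumptions made for illustration.

```python
import asyncio
import time


async def process_file(path: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for one file's request to an inference server."""
    await asyncio.sleep(delay)  # simulates waiting on the server response
    return f"processed:{path}"


async def process_all(paths, max_concurrency: int = 8):
    """Process all files concurrently, with at most max_concurrency in flight."""
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(path):
        async with sem:
            return await process_file(path)

    # gather returns results in the same order as the input paths
    return await asyncio.gather(*(bounded(p) for p in paths))


def run_sequential(paths):
    """Baseline: await each file before starting the next one."""
    async def _run():
        return [await process_file(p) for p in paths]
    return asyncio.run(_run())


def run_concurrent(paths, max_concurrency: int = 8):
    """Submit files concurrently so the server can batch them."""
    return asyncio.run(process_all(paths, max_concurrency))


if __name__ == "__main__":
    files = [f"file_{i}.txt" for i in range(10)]

    start = time.perf_counter()
    run_sequential(files)
    print(f"sequential: {time.perf_counter() - start:.2f}s")

    start = time.perf_counter()
    run_concurrent(files, max_concurrency=10)
    print(f"concurrent: {time.perf_counter() - start:.2f}s")
```

With the simulated delay, the sequential run grows with the number of files while the concurrent run stays close to the time of a single file. Against a real server the gain depends on GPU memory, model size, and sequence lengths, so the concurrency limit is something to tune rather than a fixed recommendation.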
