Parallelism not working with vLLM — Processing multiple files takes linearly longer #43

@08mohitrai-ctrl

Description

Even after integrating vLLM, parallel execution does not seem to be happening: files are still processed sequentially.

For example, if processing one file takes time X, processing 10 files takes roughly 10X, so I see none of the parallel speedup I expected from vLLM.

I’m looking for faster execution across multiple files. Is there a recommended configuration or optimization that I might be missing?

Please advise on how to achieve faster multi-file processing with vLLM, or suggest an alternative approach if vLLM does not support this use case directly.
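For anyone trying to reproduce this, the sketch below shows the difference between sending files one at a time and sending them concurrently. It does not call vLLM or this project's code: `process_file` is a hypothetical stand-in that sleeps to simulate one request, and `LATENCY_S` and `max_in_flight` are made-up values. The point is only that an inference server can batch requests that are in flight together, so a client loop that waits for each file before sending the next would likely produce the linear scaling described above.

```python
import asyncio
import time

LATENCY_S = 0.05  # simulated per-request latency (assumption, not a measurement)


async def process_file(name: str) -> str:
    """Hypothetical stand-in for one request to an inference server."""
    await asyncio.sleep(LATENCY_S)
    return f"processed {name}"


async def run_sequential(files):
    # Each request is awaited before the next is sent: total time ~ N * latency.
    return [await process_file(f) for f in files]


async def run_concurrent(files, max_in_flight=8):
    # Up to max_in_flight requests are outstanding at once, so the server
    # side has a chance to batch them.
    sem = asyncio.Semaphore(max_in_flight)

    async def bounded(f):
        async with sem:
            return await process_file(f)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(f) for f in files))


def timed(coro_fn, files):
    # Run one strategy to completion and return (results, elapsed seconds).
    start = time.perf_counter()
    results = asyncio.run(coro_fn(files))
    return results, time.perf_counter() - start


files = [f"file_{i}.pdf" for i in range(10)]
seq_results, seq_time = timed(run_sequential, files)
con_results, con_time = timed(run_concurrent, files)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

If the real pipeline resembles `run_sequential`, the 10X timing would be expected regardless of backend. Whether that is what happens here depends on how this project submits files, which the issue does not show.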
