Regarding llama-bench and llama-parallel commands #12106

Closed
@vineelabhinav

Description

Hello @ggerganov @ngxson,
I have questions about two commands:

  1. llama-bench: How do I run a test that combines prompt processing (pp) and text generation (tg)? As far as I can tell, the command supports only the -p and -n options, which report separate results for pp and tg, not a combined one. I see there is a -pg option, but it is not working for me (it says the option is not found, and I didn't understand the correct format for its argument).
  2. llama-parallel: How is parallelism done over the batch dimension? Assume I have an input of shape [batch_size, M, N] and a for loop running over each dimension. Does llama-parallel parallelize the batch_size loop with an OpenMP "parallel for" pragma? If not, how is the parallelism done? Could you point me to the file where this parallelism code is written?
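
To make question 2 concrete, the pattern I have in mind looks roughly like the sketch below. This is only an illustration of the question, not code taken from llama.cpp: the function name sum_per_batch and the flat row-major layout are assumptions made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of llama.cpp): reduces each [M, N] slice of a
// flat, row-major [batch_size, M, N] tensor to a single sum.
std::vector<float> sum_per_batch(const std::vector<float>& x,
                                 int batch_size, int M, int N) {
    assert(x.size() == static_cast<std::size_t>(batch_size) * M * N);
    std::vector<float> out(batch_size, 0.0f);

    // Each iteration b writes only out[b], so iterations are independent and
    // the batch loop can be split across threads. If the compiler is invoked
    // without OpenMP support, the pragma is ignored and the loop runs serially.
    #pragma omp parallel for
    for (int b = 0; b < batch_size; ++b) {
        float acc = 0.0f;
        for (int m = 0; m < M; ++m) {
            for (int n = 0; n < N; ++n) {
                acc += x[(static_cast<std::size_t>(b) * M + m) * N + n];
            }
        }
        out[b] = acc;
    }
    return out;
}
```

In other words: is the batch loop in llama-parallel split across threads like the outer loop above, or is the batching handled in some other way?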
