
Documentation regarding the DS format to be fed to GuideLLM #133


Closed
SharonGil opened this issue Apr 24, 2025 · 11 comments · Fixed by #137

SharonGil commented Apr 24, 2025

There is a need to write down examples for all dataset (DS) formats that can be fed as input to the GuideLLM benchmark.

Tried using the example in the current README -
guidellm benchmark --target "http://${IP}:${PORT}" --rate-type sweep --max-seconds 30 --data "prompt_tokens=256,output_tokens=128"
Got an error that AutoTokenizer is not known. After some debugging, I thought it might be that tokenizers for Mistral and Llama3 aren't supported yet (I used vLLM instances with those models loaded), and that the error occurs because a processor is required for synthetic data benchmarking.

Then tried using an HF model ID with -
guidellm benchmark --target "http://${IP}:${PORT}" --rate-type synchronous --max-seconds 30 --data ${HF_DS_ID} --data-args '{"prompt_column": "prompt"}'
The benchmark ran, but I got an error in OpenAIHTTPBackend for every request, which made me think that the data format I'm sending isn't correct.

Then tried to use a local JSON file as a DS and ran with -
guidellm benchmark --target "http://${IP}:${PORT}" --rate-type synchronous --max-seconds 30 --data "prompts.json"
and got a TypeError: 'PosixPath' object is not iterable.

Would appreciate help in running GuideLLM benchmarks with different DS configurations.

@sjmonson
Collaborator

sjmonson commented Apr 24, 2025

> Got an error that AutoTokenizer is not known.

Did you specify --model or --processor on the command line for guidellm? If you are using the synthetic dataset, then GuideLLM needs to know the HF name of the model so that it can load its tokenizer locally.
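
For example, a minimal sketch of the synthetic-data command with the model given explicitly (the model ID below is illustrative; substitute the HF ID of whatever your vLLM server is actually serving):

```bash
guidellm benchmark \
  --target "http://${IP}:${PORT}" \
  --rate-type sweep \
  --max-seconds 30 \
  --data "prompt_tokens=256,output_tokens=128" \
  --model "mistralai/Mistral-7B-Instruct-v0.3"
```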

@sjmonson
Collaborator

For the other two issues, please add more complete stack traces.

@markurtz
Member

Hey @SharonGil, for the second-to-last issue, it looks like --data was set to HF_MODEL_ID; I'm not sure if this was a typo in the issue or not, but --data only supports HF dataset IDs. If it is a typo in the issue, a stack trace / example command would be great.

For the last one, we pushed a fix on the latest main, and we'll be cutting a v0.2.1 release, ideally tomorrow, to push it up to PyPI. If you install from source with the main branch, that last example should work correctly now.
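
For example, an install-from-source command might look like the following (assuming the neuralmagic/guidellm repository location; adjust to your remote if it differs):

```bash
pip install git+https://github.com/neuralmagic/guidellm.git@main
```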

@SharonGil
Author

@sjmonson I tried both with and without --model, since it was auto-detected from my vLLM instance (I had only Mistral loaded there).

@SharonGil
Author

SharonGil commented Apr 24, 2025

@markurtz Thanks for noticing; yes, it was a typo in the issue.
Actually, I am about to work on creating a DS generation engine that will be able to construct GuideLLM-ready DSs according to user-defined use cases and parameters (see issue #134); I find this is a gap that still exists. For that purpose, I mainly need the local JSON DS format for now, which was the last one described above. I'll try installing from source with the main branch, try again, and update here.
@sjmonson @markurtz thanks a lot for the prompt help.
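
For concreteness, the kind of local JSON DS I mean is something like the following, assuming a list of objects with a prompt field (matching the prompt_column data-arg above; the exact expected schema is what I'm hoping the documentation will pin down):

```json
[
  {"prompt": "Summarize the following paragraph: ..."},
  {"prompt": "Write a haiku about benchmarking."}
]
```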

@SharonGil
Author

Also, is it correct that Llama3 and Mistral tokenizers are not yet supported?

@markurtz
Member

> Also, is it correct that Llama3 and Mistral tokenizers are not yet supported?

Those should be fully supported -- anything that works through AutoTokenizer / AutoProcessor with Hugging Face will work here. If the name of the model on the server doesn't match the Hugging Face ID, though, then it's not possible for us to automatically look it up from HF. In that case, you'll need to use the --processor flag and pass in either the HF model ID that contains the desired processor or a path to a local directory that contains the tokenizer / processor files.
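
For example, a sketch of both forms (the model ID and path below are illustrative):

```bash
# pass an HF model ID whose tokenizer/processor should be used:
guidellm benchmark \
  --target "http://${IP}:${PORT}" \
  --rate-type synchronous \
  --max-seconds 30 \
  --data ${HF_DS_ID} \
  --data-args '{"prompt_column": "prompt"}' \
  --processor "meta-llama/Meta-Llama-3-8B-Instruct"

# or point --processor at a local directory containing the tokenizer files:
#   --processor "/path/to/local/tokenizer-dir"
```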

@SharonGil
Author

I see. Is there an option to upload a few more run examples, like you did for synthetic data, covering other DS configurations?

@markurtz
Member

@SharonGil yes, let me see what I can put together quickly this afternoon, run through some tests, and then push up in a PR. I'll reference it in here and tag you. If you need something more immediate than that, let me know the specific use case you're looking at and I can get something over to you.

@SharonGil
Author

@markurtz Thanks a lot, I appreciate it. No need to hurry; I can wait for your PR. Thanks again.

@markurtz
Member

@SharonGil take a look through PR #137 and see if there's anything more needed there.

markurtz added a commit that referenced this issue Apr 28, 2025