Conversation
Pull request overview
Adds a helper script and documentation to run the LLMxCPG-Q model locally via a vLLM OpenAI-compatible server, enabling query generation against a local endpoint.
Changes:
- Added queries/run_vllm_server.py to launch a vLLM OpenAI-compatible API server (supports full/merged models and LoRA adapters).
- Updated queries/README.md with instructions to start the vLLM server and run generate_and_run_queries.py against it.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| queries/run_vllm_server.py | New CLI helper that builds and runs the vLLM API server command and prints usage hints. |
| queries/README.md | Documents the new local vLLM workflow and example commands. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Thanks @SmallBookworm for the PR! Please take a look at the comments.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
```python
print("Model name for --llm-endpoint:", args.served_model_name)
print()
print("Run generate_and_run_queries.py with:")
print(
    "python generate_and_run_queries.py "
    "-d /path/to/dataset.json "
    "-o /path/to/output_dir "
    "--llm-model-type vLLM "
    f"--llm-endpoint {args.served_model_name} "
    f"--llm-port {args.port}"
)
```
This printed helper command uses --llm-endpoint and labels it as a “Model name”, while the README uses --llm-model-name. Please align the flag name across run_vllm_server.py output and the README (and with generate_and_run_queries.py’s actual argparse flags). Otherwise users will copy/paste a command that fails or pass the wrong value type for the option.
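One way to keep the printed hint and the consumer script from drifting apart is to define the flag name once and reference it in both places. A minimal sketch, assuming a hypothetical shared constant (the actual argparse flags in generate_and_run_queries.py may differ):

```python
import argparse

# Hypothetical shared constant: both the server helper's printed hint and the
# consumer script's argparse definition reference this single name.
LLM_MODEL_NAME_FLAG = "--llm-model-name"


def build_parser() -> argparse.ArgumentParser:
    """Sketch of generate_and_run_queries.py's parser using the shared flag."""
    parser = argparse.ArgumentParser(description="generate_and_run_queries (sketch)")
    parser.add_argument(
        LLM_MODEL_NAME_FLAG,
        dest="llm_model_name",
        required=True,
        help="served model name exposed by the local vLLM server",
    )
    parser.add_argument("--llm-port", type=int, default=8000)
    return parser


def usage_hint(served_model_name: str, port: int) -> str:
    """Sketch of the hint run_vllm_server.py would print, built from the same constant."""
    return (
        "python generate_and_run_queries.py "
        f"{LLM_MODEL_NAME_FLAG} {served_model_name} "
        f"--llm-port {port}"
    )
```

Because the hint string and the parser share one constant, a copy/pasted command can no longer name a flag the parser does not accept.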
This pull request adds support for running the LLMxCPG-Q model locally using a vLLM server, making it easier to generate and run queries with a local OpenAI-compatible endpoint. The main changes include documentation updates and the addition of a helper script to launch the vLLM server.
New vLLM server integration:
Added a new script, run_vllm_server.py, to launch a local OpenAI-compatible vLLM server for LLMxCPG-Q, supporting both merged/full models and LoRA adapters, with configuration options such as host, port, tensor parallelism, and more.

Documentation updates:
Updated queries/README.md to provide clear instructions for launching the vLLM server and running the query generation script against the local endpoint, including installation requirements and example commands.