Qinghao-Hu/servebench

VLM

pip install qwen-vl-utils

If you hit an error with flash_attn, try installing the prebuilt wheel (built for CUDA 12, PyTorch 2.5, CPython 3.10, Linux x86_64):

pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.3/flash_attn-2.7.3+cu12torch2.5cxx11abiFALSE-cp310-cp310-linux_x86_64.whl

Installation

pip install "sglang[all]" --find-links https://flashinfer.ai/whl/cu121/torch2.4/flashinfer/

Run the servers

# Llama 3.1 8B Instruct on a single GPU
# SGLang server
python -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct --enable-torch-compile --disable-radix-cache
# vLLM server
python -m vllm.entrypoints.openai.api_server --model meta-llama/Llama-3.1-8B-Instruct --disable-log-requests --num-scheduler-steps 10 --max-model-len 4096
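The benchmark commands only make sense once the server is accepting requests, and loading the model can take a while. Below is a minimal sketch of a readiness wait. It assumes the servers listen on their default ports (30000 for SGLang, 8000 for vLLM) and answer a GET on `/health` with HTTP 200 when ready; `http_probe` and `wait_until_ready` are hypothetical helper names, not part of either project.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, refused connections and timeouts are all OSError subclasses.
        return False


def wait_until_ready(
    url: str,
    timeout_s: float = 600.0,
    interval_s: float = 2.0,
    probe: Callable[[str], bool] = http_probe,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` until the probe succeeds or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Assumed default endpoints; adjust if the servers were started with --port.
SGLANG_HEALTH_URL = "http://127.0.0.1:30000/health"
VLLM_HEALTH_URL = "http://127.0.0.1:8000/health"
```

Calling `wait_until_ready(SGLANG_HEALTH_URL)` before the SGLang runs, or `wait_until_ready(VLLM_HEALTH_URL)` before the vLLM runs, keeps startup time out of the measurements. The `probe` and `sleep` parameters exist only so the loop can be exercised without a live server.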

1. Online benchmarks

# bench serving: 1200 prompts at 4 requests/s, then 2400 prompts at 8 requests/s, for each backend
python3 -m sglang.bench_serving --backend sglang --dataset-name sharegpt --num-prompts 1200 --request-rate 4
python3 -m sglang.bench_serving --backend sglang --dataset-name sharegpt --num-prompts 2400 --request-rate 8
python3 -m sglang.bench_serving --backend vllm --dataset-name sharegpt --num-prompts 1200 --request-rate 4
python3 -m sglang.bench_serving --backend vllm --dataset-name sharegpt --num-prompts 2400 --request-rate 8
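The two rate settings are sized so that each run issues requests for about the same length of time: the prompt count doubles along with the request rate. A small sketch of that arithmetic follows; `send_window_s` is a hypothetical helper, and it estimates only the expected time to issue all requests, not the time for the last responses to finish.

```python
def send_window_s(num_prompts: int, request_rate: float) -> float:
    """Expected seconds needed to issue `num_prompts` at `request_rate` requests/s."""
    if request_rate <= 0:
        raise ValueError("request_rate must be positive")
    return num_prompts / request_rate


# (num_prompts, request_rate) pairs taken from the commands above.
online_runs = [(1200, 4.0), (2400, 8.0)]

for num_prompts, rate in online_runs:
    window = send_window_s(num_prompts, rate)
    print(f"{num_prompts} prompts at {rate:g} req/s -> {window:.0f} s")
# 1200 prompts at 4 req/s -> 300 s
# 2400 prompts at 8 req/s -> 300 s
```

Keeping the send window fixed at roughly five minutes means a difference between the two runs reflects the load level rather than the run length.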

2. Offline benchmarks

# bench serving: 5000 prompts per backend, no --request-rate (requests are not rate-limited)
python3 -m sglang.bench_serving --backend sglang --dataset-name sharegpt --num-prompts 5000
python3 -m sglang.bench_serving --backend vllm --dataset-name sharegpt --num-prompts 5000
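The offline commands differ from the online ones only in omitting `--request-rate`. The sketch below illustrates the difference in arrival pattern, assuming the benchmark treats a missing rate as unbounded (all requests issued at time 0) and otherwise spaces requests with exponentially distributed gaps, i.e. a Poisson process. `arrival_times` is a hypothetical helper written for illustration, not the tool's own code.

```python
import math
import random
from typing import List


def arrival_times(num_prompts: int, request_rate: float, seed: int = 0) -> List[float]:
    """Return the send time (seconds from start) of each request."""
    if math.isinf(request_rate):
        # Offline case: nothing is throttled, every request is issued immediately.
        return [0.0] * num_prompts
    rng = random.Random(seed)
    times = []
    t = 0.0
    for _ in range(num_prompts):
        times.append(t)
        # Exponential gap with mean 1 / request_rate seconds.
        t += rng.expovariate(request_rate)
    return times


offline = arrival_times(5000, float("inf"))
online = arrival_times(1200, 4.0)
print(f"offline: last request sent at {offline[-1]:.1f} s")
print(f"online:  last request sent at {online[-1]:.1f} s")
```

Under these assumptions the offline runs measure peak throughput with the queue always full, while the online runs measure latency under a steady, randomized load.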

Latest commit: 9b493b3 "update" by Qinghao-Hu, Jan 18, 2025 (4 commits)

Name	Last commit message	Last commit date
.github	first commit	Oct 28, 2024
assets	first commit	Oct 28, 2024
benchmark	first commit	Oct 28, 2024
docker	first commit	Oct 28, 2024
docs	first commit	Oct 28, 2024
examples	first commit	Oct 28, 2024
python	first commit	Oct 28, 2024
scripts	first commit	Oct 28, 2024
test	first commit	Oct 28, 2024
vlm	update	Jan 18, 2025
.gitignore	first commit	Oct 28, 2024
.gitmodules	first commit	Oct 28, 2024
.isort.cfg	first commit	Oct 28, 2024
.pre-commit-config.yaml	first commit	Oct 28, 2024
LICENSE	first commit	Oct 28, 2024
README.md	add vlm benchmark	Jan 18, 2025
analyse.sh	first commit	Oct 28, 2024
cal.py	first commit	Oct 28, 2024
run_servers.sh	first commit	Oct 28, 2024
run_sglang.sh	first commit	Oct 28, 2024
run_vllm.sh	first commit	Oct 28, 2024
