A personal benchmarking tool for ASR models across multilingual datasets.
- Supports Whisper & Qwen2.5-Omni
- Languages: Korean, English, Chinese, Russian, French
- Computes WER & CER
- Caching for faster re-runs
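As a rough illustration of the two metrics listed above: WER and CER are both an edit distance divided by the reference length, counted over words and characters respectively. The sketch below is a minimal stdlib-only version; the function names `edit_distance`, `wer`, and `cer` are hypothetical and are not taken from this repository, which may well rely on a library instead.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra token in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate. Assumption: spaces count as characters here;
    conventions differ between toolkits."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    print(wer("the cat sat", "the cat sit"))
    print(cer("abcd", "abed"))
```

CER tends to be the more informative number for languages such as Korean and Chinese, where word boundaries are ambiguous or absent and word-level scoring is unstable.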
Models:
- openai/whisper-large-v3
- openai/whisper-large-v3-turbo
- ghost613/whisper-large-v3-turbo-korean
- seongsubae/openai-whisper-large-v3-turbo-ko-TEST
- (experimental) Qwen/Qwen2.5-Omni-7B (requires custom Transformers build)
Datasets:
- Bingsu/zeroth-korean (add more via `load_dataset()`)
python benchmarks/run_benchmark.py --normalize --verbose
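What `--normalize` does internally is not described here. As an assumption, the sketch below shows the kind of text normalization commonly applied to references and hypotheses before scoring, so that casing and punctuation differences are not counted as errors. The function name `normalize_text` is hypothetical and not taken from this repository.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Lowercase, replace punctuation and symbols with spaces, collapse whitespace.

    Assumed behavior for illustration only; the actual --normalize flag may differ.
    """
    # NFKC folds compatibility forms (e.g. full-width Latin letters) into canonical ones.
    text = unicodedata.normalize("NFKC", text).lower()
    # Unicode categories starting with "P" are punctuation, "S" are symbols.
    text = "".join(
        " " if unicodedata.category(ch)[0] in ("P", "S") else ch
        for ch in text
    )
    # Collapse runs of whitespace and trim both ends.
    return " ".join(text.split())


if __name__ == "__main__":
    print(normalize_text("Hello,  World!"))
    print(normalize_text("안녕하세요."))
```

Note that this simple rule splits contractions ("don't" becomes "don t"), so an English-specific normalizer may be preferable when comparing against published Whisper numbers.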