Summary
- Minimal Execution Overheads
- Refactored to an async multi-process/threaded design with just 0.16% overhead in synchronous mode and 99.9% accuracy for constant-rate requests
- Robust Accuracy + Monitoring
- Built-in timings and diagnostics added to validate performance and catch regressions
- Flexible Benchmarking Profiles
- Prebuilt support for synchronous, concurrent (new), throughput, constant rate, Poisson rate, and sweep modes
- Unified Input/Output Formats
- JSON, YAML, CSV, and console output now standardized
- Multi-Use Data Loaders
- Native support for HuggingFace datasets, file-based data, and synthetic samples with fixes for previous flows and expanded support
- Pluggable Backends via OpenAI-Compatible APIs
- Redesigned to work out of the box with OpenAI-style HTTP servers and to be easily extended to other interfaces and servers. Fixed issues related to improper token lengths and more
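To illustrate the difference between the constant rate and Poisson rate profiles listed above, here is a minimal sketch of how request start times could be generated for each. It is not taken from this project's code: the function names `constant_arrivals` and `poisson_arrivals` and their parameters are hypothetical, and the real scheduler may work differently.

```python
import random


def constant_arrivals(rate, count):
    """Return start offsets (seconds) for `count` requests spaced evenly at `rate` requests/second."""
    return [i / rate for i in range(count)]


def poisson_arrivals(rate, count, seed=0):
    """Return start offsets (seconds) for `count` requests whose gaps are
    exponentially distributed with mean 1/rate (a Poisson arrival process).

    `seed` is an illustrative parameter that makes the schedule reproducible.
    """
    rng = random.Random(seed)
    offsets = []
    t = 0.0
    for _ in range(count):
        t += rng.expovariate(rate)  # random gap with mean 1/rate seconds
        offsets.append(t)
    return offsets


if __name__ == "__main__":
    print("constant:", constant_arrivals(rate=4, count=5))
    print("poisson: ", poisson_arrivals(rate=4, count=5))
```

Both schedules target the same average rate; the constant schedule spaces requests evenly, while the Poisson schedule produces bursts and gaps that more closely resemble independent user traffic.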
What's Changed
- Add summary metrics to saved json file by @anmarques in #46
- ADD TGI docs by @philschmid in #43
- Add missing vllm docs link by @eldarkurtic in #50
- Change default "role" from "system" to "user" by @philschmid in #53
- FIX TGI example by @philschmid in #51
- Revert Summary Metrics and Expand Test Coverage to Stabilize Nightly/Main CI by @markurtz in #58
- [Dataset]: Iterate through benchmark dataset once by @parfeniukink in #48
- Replace busy wait in async loop with a Semaphore by @sjmonson in #80
- Add backend_kwargs to generate_benchmark_report by @jackcook in #78
- Drop request count check from throughput sweep profile by @sjmonson in #89
- Rework Backend to Native HTTP Requests and Enhance API Compatibility & Performance by @markurtz in #91
- Multi Process Scheduler Implementation, Benchmarker, and Report Generation Refactor by @markurtz in #96
- Update the README by @sjmonson in #112
- Fix units for Req Latency in output to seconds by @smalleni in #113
- Fix/non integer rates by @thameem-abbas in #116
- Output support expansion, code hygiene, and tests by @markurtz in #117
- Bump min python to 3.9 by @sjmonson in #121
- v0.2.0 Version Update and Docs Expansions by @markurtz in #118
- Fix issue if async task count does not evenly divide across process pool by @sjmonson in #120
- Readme grammar updates and cleanup by @markurtz in #124
- Update CICD flows to enable automated releases and match the feature set laid out in #56 by @markurtz in #125
- CI/CD Build Fixes for Release by @markurtz in #126
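As background for the change in #80 listed above, the following is a generic sketch of limiting in-flight async tasks with `asyncio.Semaphore` rather than polling in a busy-wait loop. It is an assumption-laden illustration, not the code from that pull request: `run_bounded`, `worker`, and the `asyncio.sleep` stand-in for a request are all hypothetical.

```python
import asyncio


async def run_bounded(num_tasks, max_concurrency):
    """Run `num_tasks` dummy tasks with at most `max_concurrency` in flight.

    Returns the task results and the peak number of simultaneously active tasks.
    """
    sem = asyncio.Semaphore(max_concurrency)
    active = 0
    peak = 0

    async def worker(i):
        nonlocal active, peak
        # Waiting on the semaphore suspends the coroutine without spinning the
        # event loop, unlike a `while active >= limit: await sleep(0)` poll.
        async with sem:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for an HTTP request
            active -= 1
        return i

    results = await asyncio.gather(*(worker(i) for i in range(num_tasks)))
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(run_bounded(num_tasks=10, max_concurrency=3))
    print(f"completed {len(results)} tasks, peak concurrency {peak}")
```

The semaphore wakes a waiting task only when a slot frees up, so idle waiters consume no CPU time that could otherwise skew benchmark timings.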
New Contributors
- @anmarques made their first contribution in #46
- @philschmid made their first contribution in #43
- @eldarkurtic made their first contribution in #50
- @sjmonson made their first contribution in #80
- @jackcook made their first contribution in #78
- @smalleni made their first contribution in #113
- @thameem-abbas made their first contribution in #116
Full Changelog: v0.1.0...v0.2.0