feat: low-memory mode, RunningStats, and live progress-bar stats #56
Open
acere wants to merge 1 commit into awslabs:main from
- Add `low_memory` parameter to Runner/run() that writes responses to disk without keeping them in memory, for large-scale test runs.
- Introduce `RunningStats` class that accumulates metrics incrementally (counts, sums, sorted values for percentile computation).
- Replace `_builtin_stats` cached_property on Result with `_preloaded_stats` populated by RunningStats during the run or from stats.json on load.
- Add `snapshot()` method on RunningStats for live progress-bar display of p50/p90 TTFT, p50/p90 TTLT, median tokens/s, total tokens, and failure count; configurable via `progress_bar_stats` parameter.
- Add `_compute_stats()` classmethod on Result as fallback for manually constructed Result objects and post-load_responses() recomputation.
- Update tests for the new stats flow.
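The incremental accumulation described in the commit message (counts, sums, sorted values for percentiles) can be sketched roughly as follows. This is a minimal illustration, not the PR's `RunningStats`: the class name `RunningStatsSketch`, its single-metric design, the rounded-index percentile rule, and the field names in the returned dicts are all assumptions.

```python
import bisect


class RunningStatsSketch:
    """Hypothetical single-metric accumulator; not LLMeter's RunningStats."""

    def __init__(self):
        self.count = 0        # every response seen, including failures
        self.failures = 0     # responses without a usable value
        self.total = 0.0      # running sum of successful values
        self._sorted = []     # successful values, kept in ascending order

    def update(self, value, failed=False):
        """Record one response's metric (assumed to be a float or None)."""
        self.count += 1
        if failed or value is None:
            self.failures += 1
            return
        self.total += value
        bisect.insort(self._sorted, value)  # insertion keeps the list sorted

    def percentile(self, q):
        """Value at the rounded index q% of the way through the sorted list."""
        if not self._sorted:
            return None
        idx = int(q / 100 * (len(self._sorted) - 1) + 0.5)
        return self._sorted[min(idx, len(self._sorted) - 1)]

    def to_stats(self):
        """Raw numeric stats; the single place where values are computed."""
        n = len(self._sorted)
        return {
            "count": self.count,
            "failures": self.failures,
            "mean": self.total / n if n else None,
            "p50": self.percentile(50),
            "p90": self.percentile(90),
        }

    def snapshot(self, fields=("p50", "p90", "failures")):
        """Format a chosen subset of to_stats() as short strings for display."""
        stats = self.to_stats()
        return {
            key: f"{stats[key]:.3f}" if isinstance(stats[key], float) else str(stats[key])
            for key in fields
        }


rs = RunningStatsSketch()
for latency in (0.5, 0.1, 0.3, 0.2, 0.4):
    rs.update(latency)
rs.update(None, failed=True)
print(rs.snapshot())
```

A dict of short strings like the one `snapshot()` returns here is the kind of value that could be handed to tqdm's `set_postfix`. Keeping the values sorted on insert makes each percentile lookup an index access, at the cost of retaining one number per successful response.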
Closes #55
What
Adds low-memory mode for large-scale test runs, replaces `_builtin_stats` with incremental `RunningStats`, and shows live stats on the progress bar during runs.

Changes
`llmeter/utils.py`
- `RunningStats` class that accumulates metrics incrementally via sorted value lists. Provides:
  - `update()`: records one response's metrics
  - `to_stats()`: computes all stats as raw numeric values (single source of truth)
  - `snapshot()`: formats a configurable subset for tqdm progress-bar display
  - `DEFAULT_SNAPSHOT_STATS`: the default fields, namely p50/p90 TTFT, p50/p90 TTLT, median tokens/s, total input/output tokens, and failure count

`llmeter/runner.py`
- `low_memory` parameter on `_RunConfig` / `Runner` / `run()`: writes responses to disk without keeping them in memory. Requires `output_path`.
- `progress_bar_stats` parameter: configurable live stats on the progress bar.
- `RunningStats` is always created and fed during `_process_results_from_q`. In low-memory mode, responses are not appended to `_responses`.
- `_run()` always populates `result._preloaded_stats` from `RunningStats.to_stats()`.

`llmeter/results.py`
- Removed the `_builtin_stats` `@cached_property` (and the `functools.cached_property` import).
- Added `_compute_stats()` classmethod as fallback for manually constructed Results.
- `stats` property now reads from `_preloaded_stats` first and falls back to `_compute_stats()`.
- `load_responses()` recomputes `_preloaded_stats` from loaded data.
- `Result.load(load_responses=True)` computes `_preloaded_stats` from responses.

Tests
- `test_lazy_load.py`: replaced `_builtin_stats` cache invalidation test with `_preloaded_stats` recomputation test.
- `test_results.py`: replaced `_builtin_stats` caching assertion.

Usage
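As a rough illustration of the low-memory flow described under Changes, the sketch below streams each response to disk and updates running totals, retaining responses in memory only when `low_memory` is off. It is self-contained and does not call LLMeter: the helper `process_responses`, the `output_tokens` field, and the JSON Lines file layout are assumptions made for illustration, not the PR's API.

```python
import json
import tempfile
from pathlib import Path


def process_responses(responses, output_path, low_memory=False):
    """Hypothetical helper: persist every response, optionally without keeping it."""
    kept = []  # stays empty in low-memory mode
    stats = {"count": 0, "total_output_tokens": 0}
    with Path(output_path).open("w", encoding="utf-8") as f:
        for resp in responses:
            # Each response goes to disk immediately, one JSON object per line.
            f.write(json.dumps(resp) + "\n")
            # Stats are updated incrementally, so they never need the full list.
            stats["count"] += 1
            stats["total_output_tokens"] += resp.get("output_tokens") or 0
            if not low_memory:
                kept.append(resp)
    return kept, stats


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "responses.jsonl"
    # A generator stands in for a large run: nothing is materialized up front.
    fake_run = ({"id": i, "output_tokens": 10 * i} for i in range(1000))
    kept, stats = process_responses(fake_run, path, low_memory=True)
    print(len(kept), stats["count"], stats["total_output_tokens"])
```

Because the stats are accumulated as responses arrive, the final summary is available at the end of the run even though no response objects were retained, which is the property the `low_memory` option relies on.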