feat: low-memory mode, RunningStats, and live progress-bar stats by acere · Pull Request #56 · awslabs/llmeter

acere · 2026-04-01T18:41:27Z

Closes #55

What

Adds low-memory mode for large-scale test runs, replaces _builtin_stats with incremental RunningStats, and shows live stats on the progress bar during runs.

Changes

`llmeter/utils.py`

New RunningStats class that accumulates metrics incrementally via sorted value lists. Provides:
- update() — records one response's metrics
- to_stats() — computes all stats as raw numeric values (single source of truth)
- snapshot() — formats a configurable subset for tqdm progress-bar display
- DEFAULT_SNAPSHOT_STATS — default fields: p50/p90 TTFT, p50/p90 TTLT, median tokens/s, total input/output tokens, failure count
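
For reviewers who want a mental model of the incremental accumulation, here is a minimal sketch. It is not the code in this PR: the class name `SketchRunningStats`, the dict-shaped responses, the `"<metric>-p<q>"` key layout, and the nearest-rank percentile rule are all assumptions chosen for illustration.

```python
import bisect
import math
from collections import defaultdict


class SketchRunningStats:
    """Hypothetical stand-in for RunningStats; not llmeter's actual code."""

    def __init__(self, metrics):
        self.metrics = tuple(metrics)
        self.count = 0
        self.failed = 0
        self._sorted = defaultdict(list)  # metric name -> ascending values

    def update(self, response):
        """Record one response (assumed here to be a plain dict)."""
        self.count += 1
        if response.get("error") is not None:
            self.failed += 1
            return
        for name in self.metrics:
            value = response.get(name)
            if value is not None:
                # insort keeps the list sorted, so percentiles need no re-sort
                bisect.insort(self._sorted[name], value)

    def percentile(self, name, q):
        """Nearest-rank percentile (an assumed rule), with q in (0, 100]."""
        values = self._sorted[name]
        if not values:
            return None
        rank = max(1, math.ceil(q / 100 * len(values)))
        return values[rank - 1]

    def to_stats(self):
        """Return raw numeric stats; the key names are illustrative only."""
        stats = {"total_requests": self.count, "failed_requests": self.failed}
        for name in self.metrics:
            for q in (50, 90, 99):
                stats[f"{name}-p{q}"] = self.percentile(name, q)
        return stats
```

Keeping values sorted on insert makes each percentile lookup a single index operation, at the cost of still holding one number per response per metric, which is far smaller than holding full response objects.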

`llmeter/runner.py`

New low_memory parameter on _RunConfig/Runner/run(): writes responses to disk without keeping them in memory. Requires output_path.
New progress_bar_stats parameter: configurable live stats on the progress bar.
RunningStats is always created and fed during _process_results_from_q. In low-memory mode, responses are not appended to _responses.
_run() always populates result._preloaded_stats from RunningStats.to_stats().
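
The runner behavior described above can be pictured with the following sketch. It is only an approximation under stated assumptions: the function name `drain_queue`, the `None` sentinel, the JSON Lines file, and the `on_response` callback are hypothetical and do not mirror the actual `_process_results_from_q` implementation.

```python
import asyncio
import json
import tempfile
from pathlib import Path


async def drain_queue(queue, out_file, on_response, low_memory):
    """Consume responses until a None sentinel arrives (sentinel is an assumption)."""
    kept = []
    with open(out_file, "a") as f:
        while True:
            response = await queue.get()
            if response is None:
                break
            f.write(json.dumps(response) + "\n")  # always persisted to disk
            on_response(response)  # always fed to the running stats
            if not low_memory:
                kept.append(response)  # retained only in normal mode
    return kept


async def demo():
    """Push three fake responses through the loop in low-memory mode."""
    queue = asyncio.Queue()
    for i in range(3):
        queue.put_nowait({"id": i, "time_to_last_token": 0.1 * (i + 1)})
    queue.put_nowait(None)
    seen = []
    out_file = Path(tempfile.mkdtemp()) / "responses.jsonl"  # hypothetical name
    kept = await drain_queue(queue, out_file, seen.append, low_memory=True)
    return kept, seen, out_file.read_text().splitlines()
```

In this sketch the only difference between the two modes is whether the response is appended to the in-memory list, which is why low-memory mode needs an output path: the disk copy is the only full record.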

`llmeter/results.py`

Removed _builtin_stats @cached_property (and functools.cached_property import).
New _compute_stats() classmethod as fallback for manually constructed Results.
stats property now reads from _preloaded_stats first, falls back to _compute_stats().
load_responses() recomputes _preloaded_stats from loaded data.
Result.load(load_responses=True) computes _preloaded_stats from responses.
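
The lookup order for `stats` can be illustrated with a small sketch. The class `SketchResult`, its fields, and the single statistic it computes are assumptions for illustration; the real `Result` class has more fields and computes many more statistics.

```python
from dataclasses import dataclass, field
from statistics import median
from typing import Optional


@dataclass
class SketchResult:
    """Hypothetical stand-in for Result; not llmeter's actual code."""

    responses: list = field(default_factory=list)
    _preloaded_stats: Optional[dict] = None

    @classmethod
    def _compute_stats(cls, result):
        """Fallback path: derive stats directly from in-memory responses."""
        values = [r["time_to_last_token"] for r in result.responses]
        return {
            "total_requests": len(values),
            "time_to_last_token-p50": median(values) if values else None,
        }

    @property
    def stats(self):
        # Prefer stats gathered during the run (or loaded from disk).
        if self._preloaded_stats is not None:
            return dict(self._preloaded_stats)
        # Otherwise compute them on demand, e.g. for hand-built results.
        return self._compute_stats(self)
```

Because nothing is cached on the fallback path in this sketch, there is no cache to invalidate when responses change, which is the kind of problem a cached property has to guard against.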

Tests

Updated test_lazy_load.py: replaced _builtin_stats cache invalidation test with _preloaded_stats recomputation test.
Updated test_results.py: replaced _builtin_stats caching assertion.

Usage

# Low-memory mode
result = await runner.run(output_path="/tmp/run", low_memory=True)
result.stats              # works (computed incrementally)
result.responses          # [] (not in memory)
result.load_responses()   # loads from disk

# Custom progress-bar stats
result = await runner.run(
    progress_bar_stats={
        "p99_ttlt": ("time_to_last_token", "p99"),
        "tps": ("time_per_output_token", "p50", "inv"),
        "fail": "failed",
    },
)
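
One plausible reading of the `progress_bar_stats` entries above is sketched below. The semantics are assumed, not taken from the PR: a bare string is treated as a direct stats key, a tuple as `(metric, aggregation)` looked up under a `"<metric>-<aggregation>"` key, and a trailing `"inv"` flag as a request for the reciprocal (for example, turning time per output token into tokens per second). The helper names `resolve_spec` and `snapshot_sketch` are hypothetical.

```python
def resolve_spec(spec, stats):
    """Turn one progress_bar_stats entry into a number, under assumed semantics."""
    if isinstance(spec, str):
        return stats.get(spec)
    metric, agg, *flags = spec
    value = stats.get(f"{metric}-{agg}")  # key layout is an assumption
    if value is None:
        return None
    if "inv" in flags:
        return 1 / value if value else None  # avoid dividing by zero
    return value


def snapshot_sketch(specs, stats):
    """Format each configured entry as a short string for display."""
    out = {}
    for label, spec in specs.items():
        value = resolve_spec(spec, stats)
        out[label] = "n/a" if value is None else f"{value:.3g}"
    return out
```

A dict of short strings like this is the shape that tqdm's `set_postfix` accepts, which is presumably how the live stats reach the progress bar.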

- Add `low_memory` parameter to Runner/run() that writes responses to disk without keeping them in memory, for large-scale test runs.
- Introduce `RunningStats` class that accumulates metrics incrementally (counts, sums, sorted values for percentile computation).
- Replace `_builtin_stats` cached_property on Result with `_preloaded_stats`, populated by RunningStats during the run or from stats.json on load.
- Add `snapshot()` method on RunningStats for live progress-bar display of p50/p90 TTFT, p50/p90 TTLT, median tokens/s, total tokens, and failure count, configurable via the `progress_bar_stats` parameter.
- Add `_compute_stats()` classmethod on Result as fallback for manually constructed Result objects and post-load_responses() recomputation.
- Update tests for the new stats flow.

acere wants to merge 1 commit into awslabs:main from acere:feature/low-memory-running-stats
