feat: low-memory mode, RunningStats, and live progress-bar stats #56

Open
acere wants to merge 1 commit into awslabs:main from acere:feature/low-memory-running-stats

@acere (Collaborator) commented Apr 1, 2026

Closes #55

What

Adds low-memory mode for large-scale test runs, replaces _builtin_stats with incremental RunningStats, and shows live stats on the progress bar during runs.

Changes

llmeter/utils.py

  • New RunningStats class that accumulates metrics incrementally via sorted value lists. Provides:
    • update() — records one response's metrics
    • to_stats() — computes all stats as raw numeric values (single source of truth)
    • snapshot() — formats a configurable subset for tqdm progress-bar display
    • DEFAULT_SNAPSHOT_STATS — default fields: p50/p90 TTFT, p50/p90 TTLT, median tokens/s, total input/output tokens, failure count
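
A minimal sketch of the incremental approach described above, assuming responses arrive as plain dicts. `RunningStatsSketch`, its constructor argument, the nearest-rank percentile rule, and the `<metric>-p<q>` key names are illustrative assumptions, not the actual `RunningStats` code in this PR:

```python
import bisect
import math


class RunningStatsSketch:
    """Hypothetical stand-in for RunningStats; not the llmeter implementation."""

    def __init__(self, metrics):
        self._values = {name: [] for name in metrics}  # one sorted list per metric
        self.count = 0
        self.failed = 0

    def update(self, response):
        """Record one response, given here as a plain dict of metric values."""
        self.count += 1
        if response.get("error") is not None:
            self.failed += 1
        for name, values in self._values.items():
            value = response.get(name)
            if value is not None:
                bisect.insort(values, value)  # keeps the list sorted on insert

    def percentile(self, name, q):
        """Nearest-rank percentile (q in 0..100) read from the sorted list."""
        values = self._values[name]
        if not values:
            return None
        rank = max(1, math.ceil(q * len(values) / 100))
        return values[rank - 1]

    def to_stats(self):
        """Return raw numeric stats; the key names are assumed for illustration."""
        stats = {"total_requests": self.count, "failed_requests": self.failed}
        for name in self._values:
            for q in (50, 90, 99):
                stats[f"{name}-p{q}"] = self.percentile(name, q)
        return stats
```

Because the lists stay sorted, a percentile lookup is an index read, so it is cheap enough to repeat on every progress-bar refresh. Each insert still costs linear time in the worst case, and the metric values themselves stay in memory even when full responses do not.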

llmeter/runner.py

  • New low_memory parameter on _RunConfig/Runner/run(): writes responses to disk without keeping them in memory. Requires output_path.
  • New progress_bar_stats parameter: configurable live stats on the progress bar.
  • RunningStats is always created and fed during _process_results_from_q. In low-memory mode, responses are not appended to _responses.
  • _run() always populates result._preloaded_stats from RunningStats.to_stats().
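
The low-memory flow above can be pictured roughly as follows. This is a sketch under stated assumptions: `process_results`, the `on_response` callback, and the `responses.jsonl` file name are hypothetical and only mirror the described behavior (always feed the stats, always write to disk when a path is given, keep responses in memory only when `low_memory` is off):

```python
import json
from pathlib import Path


def process_results(responses, output_path, low_memory=False, on_response=None):
    """Write each response to disk as JSONL; keep it in memory only if allowed."""
    if low_memory and output_path is None:
        raise ValueError("low_memory=True requires output_path")
    kept = []
    out_file = None
    if output_path is not None:
        Path(output_path).mkdir(parents=True, exist_ok=True)
        # File name is an assumption made for this sketch.
        out_file = (Path(output_path) / "responses.jsonl").open("w")
    try:
        for response in responses:
            if on_response is not None:
                on_response(response)  # e.g. feed a running-stats accumulator
            if out_file is not None:
                out_file.write(json.dumps(response) + "\n")
            if not low_memory:
                kept.append(response)  # skipped entirely in low-memory mode
    finally:
        if out_file is not None:
            out_file.close()
    return kept
```

The up-front check reflects the stated requirement that low-memory mode needs `output_path`: without a place to write responses, they would be lost.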

llmeter/results.py

  • Removed _builtin_stats @cached_property (and functools.cached_property import).
  • New _compute_stats() classmethod as fallback for manually constructed Result objects.
  • stats property now reads from _preloaded_stats first, falls back to _compute_stats().
  • load_responses() recomputes _preloaded_stats from loaded data.
  • Result.load(load_responses=True) computes _preloaded_stats from responses.
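
The lookup order described above might look like the sketch below. `ResultSketch` is a hypothetical stand-in that models only the stats path; the stat key names are assumed, and `load_responses` takes the loaded data as an argument here instead of reading from disk:

```python
import statistics


class ResultSketch:
    """Hypothetical stand-in for Result; only the stats lookup is modeled."""

    def __init__(self, responses=None, preloaded_stats=None):
        self.responses = responses or []
        self._preloaded_stats = preloaded_stats

    @classmethod
    def _compute_stats(cls, responses):
        """Fallback: compute stats from responses held in memory."""
        latencies = [
            r["time_to_last_token"]
            for r in responses
            if r.get("time_to_last_token") is not None
        ]
        return {
            "total_requests": len(responses),
            "time_to_last_token-p50": statistics.median(latencies) if latencies else None,
        }

    @property
    def stats(self):
        # Prefer stats gathered during the run; otherwise compute on demand.
        if self._preloaded_stats is not None:
            return self._preloaded_stats
        return self._compute_stats(self.responses)

    def load_responses(self, loaded):
        """Replace in-memory responses and recompute the preloaded stats."""
        self.responses = list(loaded)
        self._preloaded_stats = self._compute_stats(self.responses)
        return self.responses
```

Recomputing inside `load_responses` keeps `stats` consistent with the responses actually in memory, which is the job the removed cache-invalidation logic used to do.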

Tests

  • Updated test_lazy_load.py: replaced _builtin_stats cache invalidation test with _preloaded_stats recomputation test.
  • Updated test_results.py: replaced _builtin_stats caching assertion.

Usage

# Low-memory mode
result = await runner.run(output_path="/tmp/run", low_memory=True)
result.stats              # works (computed incrementally)
result.responses          # [] (not in memory)
result.load_responses()   # loads from disk

# Custom progress-bar stats
result = await runner.run(
    progress_bar_stats={
        "p99_ttlt": ("time_to_last_token", "p99"),
        "tps": ("time_per_output_token", "p50", "inv"),
        "fail": "failed",
    },
)
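
One way a spec like the one above could be turned into progress-bar text is sketched here. The semantics are assumptions inferred from the example, not confirmed by the PR: a bare string is read as a direct stats key, a tuple as `(metric, aggregation)` looked up under `"<metric>-<aggregation>"`, and `"inv"` as "display the reciprocal" (time per token becomes tokens per second). `resolve_snapshot` is a hypothetical helper:

```python
def resolve_snapshot(spec, stats):
    """Turn a progress_bar_stats-style spec into short display strings."""
    out = {}
    for label, entry in spec.items():
        if isinstance(entry, str):
            value = stats.get(entry)  # bare string: direct stats key (assumed)
        else:
            metric, agg, *flags = entry
            value = stats.get(f"{metric}-{agg}")  # key format is an assumption
            if "inv" in flags and value:
                value = 1.0 / value  # e.g. seconds per token -> tokens per second
        if value is None:
            out[label] = "n/a"
        elif isinstance(value, float):
            out[label] = f"{value:.3g}"
        else:
            out[label] = str(value)
    return out
```

A dict of this shape can be passed to tqdm's `set_postfix()` to show the values next to the bar.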

- Add `low_memory` parameter to Runner/run() that writes responses to
  disk without keeping them in memory, for large-scale test runs.
- Introduce `RunningStats` class that accumulates metrics incrementally
  (counts, sums, sorted values for percentile computation).
- Replace `_builtin_stats` cached_property on Result with `_preloaded_stats`
  populated by RunningStats during the run or from stats.json on load.
- Add `snapshot()` method on RunningStats for live progress-bar display
  of p50/p90 TTFT, p50/p90 TTLT, median tokens/s, total tokens, and
  failure count — configurable via `progress_bar_stats` parameter.
- Add `_compute_stats()` classmethod on Result as fallback for manually
  constructed Result objects and post-load_responses() recomputation.
- Update tests for the new stats flow.