feat: low-memory mode and live progress-bar stats #55

## Summary

For large-scale test runs, keeping every `InvocationResponse` object in memory can be expensive. Most of the cost comes from the `response_text`, `input_payload`, and `input_prompt` fields.

This issue proposes three related features:

### 1. Low-memory mode (`low_memory=True`)

A new `low_memory` parameter on `Runner`/`run()` writes each response to disk as it arrives instead of accumulating it in the in-memory `_responses` list. Stats are computed incrementally. This mode requires `output_path` to be set, and responses can be loaded on demand via `result.load_responses()`.

```python
result = await runner.run(output_path="/tmp/my_run", low_memory=True)
result.stats             # works (computed incrementally)
result.responses         # [] (empty, nothing is kept in memory)
result.load_responses()  # loads from disk when needed
```
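
To make the write-through behaviour concrete, here is a minimal sketch of how responses could be streamed to disk while the in-memory list stays empty. It is an illustration only: `ResponseSink`, the `responses.jsonl` file name, and the JSON Lines format are assumptions made for this example, not part of the proposed API.

```python
import json
import tempfile
from pathlib import Path


class ResponseSink:
    """Hypothetical helper: persists each response as one JSON line."""

    def __init__(self, output_path, low_memory=False):
        # File name and JSONL format are assumptions for this sketch.
        self.path = Path(output_path) / "responses.jsonl"
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text("")  # start each run with an empty file
        self.low_memory = low_memory
        self.responses = []  # stays empty when low_memory is True
        self.count = 0

    def add(self, response: dict) -> None:
        # Always write to disk first, so nothing is lost in low-memory mode.
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(response) + "\n")
        self.count += 1
        if not self.low_memory:
            self.responses.append(response)

    def load_responses(self) -> list:
        # Read everything back only when the caller asks for it.
        with self.path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    sink = ResponseSink(tmp, low_memory=True)
    for i in range(3):
        sink.add({"id": i, "response_text": "x" * 10})
    in_memory = list(sink.responses)
    loaded = sink.load_responses()
```

In this sketch memory use stays flat regardless of run length, at the cost of one extra read from disk when the full responses are needed.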

### 2. `RunningStats` accumulator

A new `RunningStats` class tracks metrics incrementally (counts, sums, and sorted value lists for percentile computation). It replaces the `_builtin_stats` `@cached_property` on `Result`: stats are now always computed during the run and stored as `_preloaded_stats`, which eliminates redundant recomputation.
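
As a rough illustration of the accumulator idea, the sketch below keeps a count, a per-metric sum, and a sorted list per metric, and answers percentile queries with the nearest-rank method. The class name `RunningStatsSketch`, its method names, the choice to skip metrics for failed requests, and the nearest-rank definition are all assumptions for this example; the actual `RunningStats` may differ.

```python
import bisect
import math
from collections import defaultdict


class RunningStatsSketch:
    """Hypothetical incremental accumulator (not the proposed class itself)."""

    def __init__(self):
        self.total = 0
        self.failed = 0
        self.sums = defaultdict(float)
        self.values = defaultdict(list)  # each list is kept sorted

    def update(self, metrics: dict, failed: bool = False) -> None:
        self.total += 1
        if failed:
            # Assumption: failed requests contribute only to the failure count.
            self.failed += 1
            return
        for name, value in metrics.items():
            if value is None:
                continue
            self.sums[name] += value
            bisect.insort(self.values[name], value)  # insert in sorted position

    def mean(self, name):
        vals = self.values.get(name)
        return self.sums[name] / len(vals) if vals else None

    def percentile(self, name, q):
        # Nearest-rank percentile on the already-sorted list.
        vals = self.values.get(name)
        if not vals:
            return None
        rank = max(1, math.ceil(q / 100 * len(vals)))
        return vals[rank - 1]


stats = RunningStatsSketch()
for ttft in (0.3, 0.1, 0.2, 0.4):
    stats.update({"time_to_first_token": ttft})
stats.update({}, failed=True)
```

Because the lists are sorted on insert, a percentile lookup is a single index operation at any point during the run. The sorted lists still grow with the number of responses, but they hold only small numeric values rather than full response bodies.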

### 3. Live progress-bar stats

`RunningStats.snapshot()` formats a configurable subset of live stats for tqdm display during the run: p50/p90 TTFT, p50/p90 TTLT, median output tokens/s, total input/output tokens, and failure count. The subset is selected via the `progress_bar_stats` parameter.

```python
result = await runner.run(
    progress_bar_stats={
        "p99_ttlt": ("time_to_last_token", "p99"),
        "tps": ("time_per_output_token", "p50", "inv"),
        "fail": "failed",
    },
)
```
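
One way such a spec could be turned into progress-bar text is sketched below. The interpretation is an assumption: a tuple is read as `(metric, percentile[, "inv"])`, where `"inv"` is taken to mean the reciprocal (so a median time per output token becomes tokens per second), and a bare string is read as a counter name. The `snapshot` and `percentile` functions here are hypothetical stand-ins that operate on plain dicts, not the proposed `RunningStats.snapshot()`.

```python
import math


def percentile(sorted_vals, q):
    """Nearest-rank percentile of an already-sorted list."""
    if not sorted_vals:
        return None
    rank = max(1, math.ceil(q / 100 * len(sorted_vals)))
    return sorted_vals[rank - 1]


def snapshot(values, counters, spec):
    """Hypothetical formatter: maps display labels to short strings."""
    out = {}
    for label, entry in spec.items():
        if isinstance(entry, str):
            # Bare string: assumed to name a simple counter.
            out[label] = str(counters.get(entry, 0))
            continue
        metric, agg, *flags = entry
        value = percentile(values.get(metric, []), float(agg[1:]))  # "p99" -> 99.0
        if value is not None and "inv" in flags:
            # Assumed meaning of "inv": report the reciprocal of the value.
            value = 1.0 / value if value else None
        out[label] = "n/a" if value is None else f"{value:.3g}"
    return out


values = {
    "time_to_last_token": [0.8, 1.0, 1.5, 2.0],      # seconds, sorted
    "time_per_output_token": [0.01, 0.02, 0.04],     # seconds, sorted
}
counters = {"failed": 2}
spec = {
    "p99_ttlt": ("time_to_last_token", "p99"),
    "tps": ("time_per_output_token", "p50", "inv"),
    "fail": "failed",
}
postfix = snapshot(values, counters, spec)
```

The resulting dict of short strings is the shape that tqdm's `set_postfix()` accepts, so the same spec can drive the live display without touching the stored responses.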

