chore: progress bar and scheduler polish follow-ups from PR #456

Follow-up items from #456 review (Nabin, Greptile, Codex). All non-blocking polish.

## Progress bar / reporting

- [ ] **`ProgressSnapshot` dataclass** - Replace the 8-tuple returned by `ProgressTracker.get_snapshot()` with a named frozen dataclass. Callers currently unpack with positional `_`-prefixed throwaways.
- [ ] **Gate `StickyProgressBar` on reporter existence** - `async_scheduler.py` creates a `StickyProgressBar` even when there are no CELL_BY_CELL columns (no reporter). Skip creation when reporter is `None`.
- [ ] **Type `_make_wrapper` properly** - `_wrapped_handlers` in `StickyProgressBar` is typed as `list[tuple[StreamHandler, object]]`. Replace `object` with `Callable[[logging.LogRecord], None]` for the saved emit reference.
- [ ] **Cache terminal size in `_redraw`** - `shutil.get_terminal_size()` is called on every bar update and usually ends in a system call. Cache the result with a short TTL so high-throughput runs don't query the terminal on each redraw.
- [ ] **`_compute_stats_width` rate overflow** - The sample string used to size the stats column is `9999.9 rec/s`, so a rate of 10,000 rec/s or more would exceed the computed width. Low priority.
- [ ] **Race in `StickyProgressBar.__exit__`** - After `_clear_bars()` releases the lock, `_active` is still `True` and handlers are still wrapped. A concurrent log emit can re-`_redraw()` bars that were just cleared, leaving ghost lines on the terminal. Fix: set `_active = False` inside the lock *and* add an `_active` guard in `_redraw()` so threads that sneak past the lock after teardown don't redraw.
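
A minimal sketch of the `__exit__` race fix described in the last item. `StickyBarSketch`, `_clear_bars_locked`, and `redraw_count` are hypothetical stand-ins, not the real `StickyProgressBar` internals; the only point is that `_active` flips inside the same critical section that clears the bars, and `_redraw()` checks it under the lock.

```python
import threading


class StickyBarSketch:
    """Hypothetical stand-in for StickyProgressBar; only teardown is modelled."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active = False
        self.redraw_count = 0  # number of redraws that actually painted

    def __enter__(self) -> "StickyBarSketch":
        with self._lock:
            self._active = True
        return self

    def __exit__(self, *exc_info: object) -> None:
        with self._lock:
            # Clear and deactivate in one critical section, so no thread can
            # observe "bars cleared" while the bar still counts as active.
            self._clear_bars_locked()
            self._active = False

    def _clear_bars_locked(self) -> None:
        # Placeholder for the terminal writes that erase the bar lines.
        # Assumed to be called with self._lock already held.
        pass

    def _redraw(self) -> None:
        with self._lock:
            # Guard: a log emit that arrives after teardown must not repaint.
            if not self._active:
                return
            self.redraw_count += 1
```

With this shape, a wrapped handler that calls `_redraw()` after `__exit__` has run becomes a no-op rather than leaving ghost lines.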

## Scheduler accounting

- [ ] **Double-counted skips on non-retryable seed failure** (low likelihood) - When a seed task fails non-retryably, `_execute_task_inner_impl` calls `_drop_row_group` which records skips for all CELL_BY_CELL columns via `_record_skipped_tasks_for_row`. Then `_run_seeds_complete_check` fires (because `is_column_complete_for_rg` counts dropped rows as done) and calls `_record_skipped_tasks_for_row` again for the same rows - the `is_complete` guard only checks `_completed`, not `_dropped`, so skips are double-counted. Fix: snapshot dropped rows before `on_seeds_complete` and only record skips for newly-dropped rows. **Unlikely in practice** - seed columns are typically samplers or simple from_scratch generators that rarely hit LLM APIs. Non-retryable errors (validation, parsing, auth) are config bugs caught during development, not runtime transients. The common failure modes (rate limits, timeouts, 500s, connection errors) are all classified as retryable.
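
One way the proposed fix could look, as a sketch under assumptions: `SkipTrackerSketch`, `drop_row_group`, and `run_seeds_complete_check` are illustrative names that only mimic the roles of `_drop_row_group` and `_run_seeds_complete_check`, and the real tracker state is certainly richer. The idea is to snapshot the dropped set before the callback and record skips only for the difference.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class SkipTrackerSketch:
    """Hypothetical, simplified view of the scheduler's skip accounting."""

    dropped: set[int] = field(default_factory=set)
    skip_counts: Counter = field(default_factory=Counter)

    def record_skip(self, row: int) -> None:
        self.skip_counts[row] += 1

    def drop_row_group(self, rows: Iterable[int]) -> None:
        # Mirrors the failure path: dropping rows records their skips once.
        for row in rows:
            if row not in self.dropped:
                self.dropped.add(row)
                self.record_skip(row)


def run_seeds_complete_check(
    tracker: SkipTrackerSketch, on_seeds_complete: Callable[[], None]
) -> None:
    # Snapshot first, so rows dropped earlier (and already counted) are excluded.
    dropped_before = set(tracker.dropped)
    on_seeds_complete()  # assumed to possibly add rows to tracker.dropped
    for row in sorted(tracker.dropped - dropped_before):
        tracker.record_skip(row)
```

Rows dropped by the failure path keep their single skip, and only rows dropped during the callback are recorded by the completion check.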

## Scheduler lifecycle

- [ ] **Straggler logs after early shutdown** - When `_early_shutdown` is triggered, `_main_dispatch_loop` salvages deferred tasks and checkpoints, then breaks. But workers that were already in-flight before the flag was set are never cancelled or awaited - `_cancel_workers()` is only called in the `CancelledError` path, not the normal early-shutdown exit. Those orphaned worker coroutines continue running (finishing HTTP requests, retrying, etc.) and their log calls trickle in after `log_final()` / progress bar teardown. Fix: after `_main_dispatch_loop` returns, call `_cancel_workers()` (or at minimum await remaining workers) before `log_final()` so no stragglers outlive the scheduler.
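
A rough illustration of the ordering the fix asks for, using plain `asyncio`. `cancel_workers` and `run_scheduler_sketch` are hypothetical and much simpler than the real `_cancel_workers()` and dispatch loop; the sketch only shows that cancelling and then awaiting the worker tasks before the final log leaves nothing running afterwards.

```python
import asyncio


async def cancel_workers(workers: set[asyncio.Task]) -> None:
    """Cancel every worker, then wait until each one has actually finished."""
    for task in workers:
        task.cancel()
    # return_exceptions=True collects each CancelledError as a result,
    # so gather itself does not raise here.
    await asyncio.gather(*workers, return_exceptions=True)


async def run_scheduler_sketch(log: list[str]) -> None:
    async def worker(index: int) -> None:
        await asyncio.sleep(10)  # stands in for a long in-flight request
        log.append(f"worker {index} finished")

    workers = {asyncio.create_task(worker(i)) for i in range(3)}
    await asyncio.sleep(0)  # let the workers start running

    # Early shutdown: the dispatch loop has returned at this point.
    await cancel_workers(workers)
    assert all(task.done() for task in workers)
    log.append("final")  # stands in for log_final() and bar teardown
```

If cancelling in-flight requests is too aggressive, awaiting the workers without cancelling gives the same ordering guarantee at the cost of a slower shutdown.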

## Slow request diagnostics

- [ ] **Warn on slow HTTP requests** - Add periodic warnings in `HttpModelClient._apost` when a request has been pending longer than a threshold (e.g. 30s). Useful for diagnosing streaming responses that trickle data without timing out.
- [ ] **Scheduler stall warning** - Add a timeout on `_wake_event.wait()` in the main dispatch loop that logs a warning with in-flight/active/deferred counts when no progress is made for 30s.
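
Both diagnostics can be sketched with standard `asyncio` primitives. The helper names, the logger name, and the parameters below are assumptions for illustration; they are not the actual `HttpModelClient._apost` or dispatch-loop code.

```python
import asyncio
import logging
from typing import Any, Awaitable

logger = logging.getLogger("diagnostics_sketch")  # hypothetical logger name


async def warn_while_pending(awaitable: Awaitable[Any], label: str, interval: float) -> Any:
    """Await `awaitable`, logging a warning every `interval` seconds until it finishes."""
    task = asyncio.ensure_future(awaitable)
    elapsed = 0.0
    try:
        while True:
            done, _ = await asyncio.wait({task}, timeout=interval)
            if done:
                return task.result()
            elapsed += interval
            logger.warning("%s still pending after %.1fs", label, elapsed)
    finally:
        # If the caller is cancelled mid-wait, don't leave the request running.
        if not task.done():
            task.cancel()


async def wait_for_wake(
    wake_event: asyncio.Event, stall_timeout: float,
    in_flight: int, active: int, deferred: int,
) -> bool:
    """Return True if the event fired, False if the stall timeout elapsed first."""
    try:
        await asyncio.wait_for(wake_event.wait(), timeout=stall_timeout)
        return True
    except asyncio.TimeoutError:
        logger.warning(
            "no progress for %.1fs (in_flight=%d active=%d deferred=%d)",
            stall_timeout, in_flight, active, deferred,
        )
        return False
```

In the dispatch loop, a `False` return would simply loop back into another wait, so the warning repeats for as long as the stall lasts.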
