Releases: steipete/summarize
v0.12.0
0.12.0 - 2026-03-11
Features
- Models: add `nvidia/...` provider alias (uses `NVIDIA_API_KEY` + optional `NVIDIA_BASE_URL`) for NVIDIA OpenAI-compatible endpoints.
Fixes
- Transcription: add AssemblyAI as a first-class remote provider across direct media, podcast/RSS, and yt-dlp YouTube fallback; refactor remote fallback ordering, expand config/env support (`ASSEMBLYAI_API_KEY`, legacy `apiKeys.assemblyai`), and add AssemblyAI unit + live coverage (#126).
- X/Twitter: prefer `xurl` for tweet extraction when installed, fall back to `bird`, preserve long-form/article text plus media URLs, add live `xurl` extraction/media coverage, and replace the stale dead-`bird` install tip with a current X CLI recommendation (#70).
- Models: make daemon agent `artifacts` schemas Gemini-safe, improve Google empty-response handling with preview-to-stable fallback, and switch CLI/auto Gemini defaults away from brittle preview behavior (#82, #96).
- Agents: expand model auto-resolution errors with checked models, missing env/CLI setup, and daemon restart guidance (#107).
- Daemon: support multiple saved extension tokens, migrate legacy single-token configs, and accept any configured token for auth (#116).
- Chrome extension: harden side-panel slides so SSE keepalives no longer false-time out, seeded placeholders no longer block pending/cached slide runs, retries can start a fresh summarize+slides run, and reruns replace stale slide state.
- Chrome extension: refactor side-panel navigation/run attachment policy so late summary/slide runs no longer attach to the wrong page after tab or URL switches, and expand headless regression coverage for pending-run resume and slide-mode transitions.
- Chrome extension: default fresh installs to slide mode, keep passive tab navigation out of chat, and align slide cards with CLI `--slides` by preferring per-slide summary text over raw transcript/OCR fallback.
- Chrome extension tests: add stronger YouTube slide E2E coverage for loaded images, summary-backed slide text, and switching between videos mid-analysis without stale slide-summary bleed.
- Chrome extension: isolate slide-summary stream callbacks per run and harden Playwright settings hydration so late events no longer blank slide text when switching videos mid-analysis.
- Transcription: add Gemini audio/video transcription support across direct media, podcast/RSS, and yt-dlp YouTube fallback, including Files API uploads for larger media plus new Gemini live coverage (#89).
- npm packaging: publish CLI with `pnpm publish` so `@steipete/summarize-core` is version-pinned in published metadata (no `workspace:*` in registry package).
- Slides: detect WezTerm as an iTerm-compatible terminal for inline slide images in `--slides` mode (#133, thanks @doodaaatimmy-creator).
- CLI help: surface `summarize refresh-free` in `summarize help` output.
- CLI: report CLI provider timeouts explicitly, including the duration, command, and a `--timeout` hint instead of collapsing them into generic exec failures (#100, thanks @christophsturm).
- Daemon: restrict CORS responses to trusted extension and localhost origins, with regression coverage for allowed and denied `Origin` headers (#108, thanks @sebastiondev).
- Transcription: chunk oversized Groq Whisper uploads with ffmpeg in file mode instead of failing out on files above the 30MB limit (#134, thanks @WinnCook).
- Docs: tighten landing-page mobile layout so hero, cards, code blocks, and nav stay readable on narrow screens (#118, thanks @Acidias).
- Release: build macOS x64 Bun artifacts and add regression coverage for Homebrew formula rewrites during dual-arch releases (#122, thanks @androidshu).
- YouTube: tighten hostname validation across core, slides, and extension helpers so attacker-controlled lookalike hosts are no longer treated as YouTube URLs (#91, thanks @RinZ27).
- Config: honor `zai.baseUrl` config fallback for blank env values and keep Z.AI base URL overrides working outside the summary flow (#102, thanks @liuy).
- Chrome extension: tighten options and sidepanel UI spacing, copy actions, and advanced-controls layout for a cleaner panel experience (#86, thanks @morozRed).
- Slides: warn in summary mode when `--slides` dependencies are missing, and document required local installs for `ffmpeg`, `yt-dlp`, and optional `tesseract`.
- Docs: fix broken docs index links by setting an empty Jekyll `baseurl` (#113, thanks @Youpen-y).
- Models: preserve model id casing after the provider prefix so OpenAI-compatible proxies can route exact names correctly (#128, thanks @WinnCook).
- Cache: give extract entries with unavailable transcripts the same short retry TTL as negative transcript cache entries, so transient Apify failures can recover (#115, thanks @gluneau).
- Daemon: apply the saved env snapshot to `process.env` before `daemon run` starts so child tools inherit the right PATH and API/tool config under launchd/systemd (#99, thanks @heyalchang).
- Chrome automation: require sidepanel arming before debugger-backed native input can run in a tab, and auto-disarm after browser JS execution ends (#129, thanks @omnicoder9).
- Media setup: fix the local whisper.cpp install hint to use the current Homebrew formula name `whisper-cpp` (#92, thanks @zerone0x).
- CLI output: cap markdown render width on very wide terminals by default, with a `--width` override for manual control (#119, thanks @howardpen9).
- Slides: size inline slide images from terminal width instead of keeping them pinned to 32 columns, capped at 2x the previous width while preserving `COLUMNS` fallback behavior (#125, #135, thanks @WinnCook).
- Shell completions: add Fish shell completions for the current CLI flags and option values (#95, thanks @fbehrens).
- Bun fetch: only opt into compressed HTML/YouTube responses when running under Bun, and retry link-preview fetches with `Accept-Encoding: identity` after Bun decompression failures (#105, thanks @maciej).
- Daemon: support `cli/...` models in chat and agent endpoints, including CLI auto-fallback when no API-key transport is available (#109, thanks @jetm).
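
The YouTube hostname tightening (#91) can be illustrated with a minimal sketch. It assumes an exact-match allowlist of hostnames; the `YOUTUBE_HOSTS` set and the `isYouTubeUrl` name are hypothetical and not taken from the project's code.

```typescript
// Hypothetical allowlist; the real project may accept a different set of hosts.
const YOUTUBE_HOSTS = new Set([
  "youtube.com",
  "www.youtube.com",
  "m.youtube.com",
  "music.youtube.com",
  "youtu.be",
]);

// Returns true only when the parsed hostname exactly matches an allowed host.
function isYouTubeUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:" && url.protocol !== "http:") return false;
  // URL already lowercases hostnames for http(s); toLowerCase is defensive.
  return YOUTUBE_HOSTS.has(url.hostname.toLowerCase());
}
```

A substring check such as `raw.includes("youtube.com")` would accept `youtube.com.evil.example`; comparing the parsed hostname against a fixed set avoids that class of lookalike.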
v0.11.1
0.11.1 - 2026-02-14
Fixes
- npm packaging: publish CLI with `pnpm publish` so `@steipete/summarize-core` is version-pinned in published metadata (no `workspace:*` in registry package).
0.11.0 - 2026-02-14
Highlights
- Auto CLI fallback: new controls and persisted last-success provider state (`~/.summarize/cli-state.json`) for no-key/local-CLI workflows.
- Transcription reliability: Groq Whisper is now the preferred cloud transcriber, with custom OpenAI-compatible Whisper endpoint overrides.
- Input reliability: binary-safe stdin handling, local media support in `--extract`, and fixes for local-file hangs/PDF preprocessing on custom OpenAI base URLs.
Features
- CLI: add Cursor Agent provider (`--cli agent`) for CLI-model execution.
- CLI auto mode: add implicit auto CLI fallback controls (`cli.autoFallback`, `--auto-cli-fallback`) and provider priority controls (`cli.providers`, `--cli-priority`), with persisted provider success ordering.
- Transcription: add Groq Whisper as preferred cloud provider (#71, thanks @n0an).
- Transcription: support custom OpenAI-compatible Whisper endpoints via `OPENAI_WHISPER_BASE_URL` (with safe `OPENAI_BASE_URL` fallback) (#65, thanks @toanbot).
- Config: support generic `env` defaults in `~/.summarize/config.json` (fallback for any env var), while keeping legacy `apiKeys` mapping for compatibility (#63, thanks @entropyy0).
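
As a rough illustration of the generic `env` defaults, the sketch below merges config-provided defaults underneath the real environment. The `applyEnvDefaults` name is hypothetical, and treating blank values as unset is an assumption rather than documented behavior.

```typescript
type Env = Record<string, string | undefined>;

// Fill in config defaults only for variables the real environment leaves unset.
// Assumption: a blank (whitespace-only) value counts as unset.
function applyEnvDefaults(env: Env, defaults: Record<string, string>): Env {
  const merged: Env = { ...env };
  for (const [key, value] of Object.entries(defaults)) {
    const current = merged[key];
    if (current === undefined || current.trim() === "") {
      merged[key] = value;
    }
  }
  return merged;
}
```

The input object is copied rather than mutated, so the caller decides whether the merged result replaces the process environment.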
Fixes
- CLI local files: avoid hangs when stream usage never resolves and preprocess PDFs automatically for custom OpenAI-compatible `OPENAI_BASE_URL` endpoints (e.g. non-`api.openai.com`).
- CLI stdin: support binary-safe piping/input temp files to prevent corruption on non-text stdin (#76).
- Extract mode: allow `--extract` for local media files (#72).
- Auto model/daemon fallback: skip model attempts when required API keys are missing and normalize env-key checks in daemon fallback (#67, #78).
- Cache: for auto presets (`auto`/`free`/named auto), prefer preset-level winner cache entries so stale per-candidate cache hits don’t override newer better-model results.
- Media: treat X broadcasts (`/i/broadcasts/...`) as transcript-first media and prefer URL mode.
- YouTube: keep explicit `--youtube apify` working when HTML fetch fails, while preserving duration metadata parity (#64, thanks @entropyy0).
- Transcription: stabilize Groq-first fallback flow (no duplicate Groq retries in file mode), improve terminal error reporting, and surface Groq setup in media guidance (#71, thanks @n0an).
- Media detection: detect more direct media URL extensions including `.ogg`/`.opus` (#65, thanks @toanbot).
- Slides: allow yt-dlp cookies-from-browser via `SUMMARIZE_YT_DLP_COOKIES_FROM_BROWSER` to avoid YouTube 403s.
- Daemon install: resolve symlinked/global bin paths and Windows shims when locating the CLI for install (#57, #62, thanks @entropyy0).
- Extraction: strip hidden HTML + invisible Unicode before summarization or extract output (#61).
- CLI: honor `--lang` for YouTube transcript→Markdown conversion in `--markdown-mode llm` (#56, thanks @entropyy0).
- LLM: map Anthropic bare model ids to versioned aliases (`claude-sonnet-4` → `claude-sonnet-4-0`) (#55, thanks @entropyy0).
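
The invisible-Unicode stripping (#61) might look roughly like the sketch below. The character ranges and the `stripInvisibleUnicode` name are assumptions chosen for illustration; the project may strip a different set.

```typescript
// Assumed set: zero-width spaces/joiners, directional marks, bidi controls,
// word joiner and invisible operators, and the byte-order mark.
const INVISIBLE_CHARS = /[\u200B-\u200F\u202A-\u202E\u2060-\u2064\uFEFF]/g;

// Remove characters that render as nothing but can smuggle hidden text.
function stripInvisibleUnicode(text: string): string {
  return text.replace(INVISIBLE_CHARS, "");
}
```

One trade-off of this particular range: it also removes U+200D (zero-width joiner), which splits joined emoji sequences into their component glyphs.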
Improvements
- Tooling: remove Biome and standardize on `oxfmt` + type-aware `oxlint`; `pnpm check` now enforces `format:check` before lint/tests.
- Dependencies: update workspace dependencies to latest (including `@mariozechner/pi-ai` and `oxlint-tsgolint`).
v0.10.0
Highlights
- Chrome Side Panel: Chat mode with metrics bar, message queue, and improved context (full transcript + summary metadata, jump-to-latest).
- Slides: YouTube slide screenshots + OCR + transcript-aligned cards, timestamped seek, and an OCR/Transcript toggle.
- Media-aware summarization in the Side Panel: Page vs Video/Audio dropdown, automatic media preference on video sites, plus visible word count/duration.
- CLI: robust URL + media extraction with transcript-first workflows and cache-aware streaming.
Features
- Slides: extract slide screenshots + OCR for YouTube/direct video URLs in the CLI + extension (#41, thanks @philippb).
- Slides: top-of-summary slide strip with expand/collapse full-width cards, timestamps, and click-to-seek.
- Slides: slide descriptions without model calls (transcript windowing, OCR fallback) + OCR/Transcript toggle.
- Slides: stream slide extraction status/progress and show a single header progress bar (no duplicate spinners).
- Chrome Side Panel chat: stream agent replies over SSE and restore chat history from daemon cache (#33, thanks @dougvk).
- Chrome Side Panel chat: timestamped transcript context plus clickable `[mm:ss]` links that seek the current media.
- Summaries: when transcript timestamps are available, prompts require timestamped bullet summaries; side panel auto-links `[mm:ss]` in summaries for media.
- Transcripts: `--timestamps` adds segment-level timings (`transcriptSegments` + `transcriptTimedText`) for YouTube, podcasts, and embedded captions.
- Media-aware summarization in the Side Panel: Page vs Video/Audio dropdown, automatic media preference on video sites, plus visible word count/duration.
- CLI: transcribe local audio/video files with mtime-aware transcript cache invalidation (thanks @mvance!).
- Browser extension: add Firefox sidebar build + multi-browser config (#31, thanks @vlnd0).
- Chrome automation: add artifacts tool + REPL helpers for persistent session files (notes/JSON/CSV) and downloads.
- Chrome automation: expand navigate tool with list/switch tab support and return matching skills after navigation.
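
To make the `[mm:ss]` auto-linking concrete, here is a minimal sketch that finds bracketed timestamps and converts them to seek offsets in seconds. The `findTimestamps` name and the optional `[h:mm:ss]` form are assumptions for illustration only.

```typescript
interface TimestampMatch {
  label: string;   // the literal text, e.g. "[01:15]"
  seconds: number; // seek offset derived from the label
}

// Scan text for [mm:ss] (or, as an assumption, [h:mm:ss]) markers.
function findTimestamps(text: string): TimestampMatch[] {
  const pattern = /\[(?:(\d{1,2}):)?(\d{1,2}):(\d{2})\]/g;
  const matches: TimestampMatch[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    const hours = m[1] ? Number(m[1]) : 0;
    const minutes = Number(m[2]);
    const seconds = Number(m[3]);
    matches.push({ label: m[0], seconds: hours * 3600 + minutes * 60 + seconds });
  }
  return matches;
}
```

A UI layer could then replace each `label` with a link whose click handler seeks the media element to `seconds`.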
Fixes
- Prompts: ignore sponsor/ads segments in video and podcast summaries.
- Prompts: enforce no-ads/no-skipped language and italicized standout excerpts (no quotation marks).
- Media: route direct media URLs to the transcription pipeline and raise the local media limit to 2GB (#47, thanks @n0an).
- Slides: render Slide X/Y labels and parse slide markers more robustly in streaming output.
- Slides: ensure slide summary segments start with a title line when missing.
- Slides: progress updates during yt-dlp downloads and OSC progress mirrors slide extraction.
- Slides: reuse the media cache for downloaded videos (even with `--no-cache`).
- Slides: clear slide progress line before the finish summary to avoid stray `Slides x/y` output.
- Slides: parse `Slide N/Total` labels and stabilize title/body extraction.
- CLI: `--no-cache` now bypasses summary caching only; transcript/media caches still apply.
- Chrome Side Panel chat: keep auto-scroll pinned while streaming when you’re already at the bottom.
- Chrome Side Panel: scope streams/state per window so other windows don’t wipe active summaries.
- Chrome Side Panel chat: support JSON agent replies with explicit SSE/JSON negotiation to avoid “stream ended” errors.
- Chrome Side Panel chat: clear streaming placeholders on errors/aborts.
- Chrome Side Panel: add inline error toast above chat composer; errors stay visible when scrolled.
- Chrome Side Panel: clear/hide the inline error toast when no message is present to avoid empty red boxes.
- Cache: include transcript timestamp requests in extract cache keys so timed summaries don’t reuse plain transcript content.
- Extract-only: remove implicit 8k cap; new `--max-extract-characters`/daemon `maxExtractCharacters` allow opt-in limits; resolves transcript truncation.
- Automation: require userScripts (no isolated-world fallback), with improved guidance and in-panel permission notice.
- Daemon: avoid URL flow crashes when url-preference helpers are missing (ReferenceError guard).
- CLI: clear OSC progress on SIGINT/SIGTERM to avoid stuck indicators.
- Slides: detect headline-style first lines and render them as slide titles (no required `Title:` markers).
- YouTube: prefer English caption variants (`en-*`) when selecting caption tracks.
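
The caption-track preference for `en-*` variants could be sketched as follows. The `CaptionTrack` shape, the `languageCode` field name, and the first-track fallback are all assumptions made for the example.

```typescript
interface CaptionTrack {
  languageCode: string; // e.g. "en", "en-GB", "de" (hypothetical field name)
}

// Prefer plain "en", then any regional English variant, then the first track.
function pickCaptionTrack<T extends CaptionTrack>(tracks: T[]): T | undefined {
  const exact = tracks.find((t) => t.languageCode.toLowerCase() === "en");
  if (exact) return exact;
  const variant = tracks.find((t) => t.languageCode.toLowerCase().startsWith("en-"));
  return variant ?? tracks[0];
}
```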
Improvements
- Daemon: emit slides start/progress/done metadata in extended logging for easier debugging.
- Media: refactor routing helpers and size policy (#48, thanks @steipete).
- CLI: show determinate transcription progress percent when duration is known.
- CLI: theme transcription progress lines and mirror part-based progress to OSC when duration is unknown.
- CLI: show determinate OSC progress for transcription/download when totals are known.
- CLI: keep OSC progress determinate when recent percent updates are available.
- CLI: theme tweet/extraction progress lines for consistent loading indicators.
- CLI: theme file/slide spinner labels so all progress lines share the same styling.
- CLI: simplify media download labels (avoid “media, video” duplication).
- Transcription: add auto transcriber selection (default) with ONNX-first when configured + `summarize transcriber setup`.
- Slides: cap auto slide targets at 6 by default for long videos.
- CLI: add themed output (24-bit ANSI), `--theme`, and config/env defaults for a consistent color scheme.
- Cache: add media download caching with TTL/size caps + optional verification, plus `--no-media-cache`.
- Slides: render headline-style first lines as slide titles above the slide marker.
- Prompts: allow straight quotes and encourage 1-2 short exact quotes when relevant.
Docs
- README: 0.10.0 preview layout with clearer install flow, daemon rationale, and prominent Chrome Web Store link.
- README: document ONNX transcriber setup + auto selection.
- README/docs: add UI theme config + ONNX install hints.
v0.9.0
0.9.0 - 2025-12-31
Highlights
- Chrome Side Panel: Chat mode with metrics bar, message queue, and improved context (full transcript + summary metadata, jump-to-latest, smoother auto-scroll).
- Media-aware summarization in the Side Panel: Page vs Video/Audio dropdown, automatic media preference on video sites, plus visible word count/duration.
- Chrome extension: optional hover tooltip summaries for links (advanced setting, default off; experimental) with prompt customization.
Improvements
- PDF + asset handling: send PDFs directly to Anthropic/OpenAI/Gemini when supported; generic PDF attachments and better media URL detection.
- Daemon: `/v1/chat` + `extractOnly`, version in health/status pill, optional JSON log with rotation, and more resilient restart/install health checks.
- Side Panel: advanced model row with “Scan free” (shows top free model after scan), a refresh summary control (cache bypass), plus richer length tooltips.
- Side Panel UX: consolidated advanced layout and typography controls (font size A/AA, line-height), streamlined setup panel with inline copy, clearer status text, and tighter model/length controls.
- Side Panel UX: keep the Auto summarize toggle on one line in Advanced.
- Streaming/metrics polish: faster stream flushes, shorter OpenRouter labels on wrap, and improved extraction metadata in chat.
Fixes
- Auto model selection: OpenRouter fallback now resolves provider-specific ids (dash/dot slug normalization) and skips fallback when no unique match.
- Language auto: default to English when detection is uncertain.
- OpenAI GPT-5: skip `temperature` in streaming requests to avoid 400s for unsupported params.
- Side Panel stability: retryable stream errors, no abort crash, auto-summarize on open/source switch, synced chat toggle state, and caret alignment.
- YouTube duration handling: player API/HTML/yt-dlp fallbacks, transcript metadata propagation, and extension duration fallbacks.
- URL extraction: preserve final redirected URLs so shorteners (t.co) summarize the real destination.
- Hover summaries: proxy localhost daemon calls to avoid Chrome “Local network access” prompts.
- Install: use npm releases for osc-progress/tokentally instead of git deps.
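
The OpenRouter fallback fix (dash/dot slug normalization, skipping when there is no unique match) can be sketched as below. The `resolveOpenRouterId` name, the catalog shape, and the exact normalization rule are assumptions; the ids in the usage note are made up.

```typescript
// Assumed normalization: case-insensitive, with "." treated like "-".
function normalizeSlug(slug: string): string {
  return slug.toLowerCase().replace(/\./g, "-");
}

// Compare only the part after the provider prefix ("provider/model").
function modelPart(id: string): string {
  const parts = id.split("/");
  return parts[parts.length - 1] ?? id;
}

// Return the single catalog id that matches, or null when zero or several match.
function resolveOpenRouterId(modelId: string, catalog: string[]): string | null {
  const wanted = normalizeSlug(modelPart(modelId));
  const matches = catalog.filter((entry) => normalizeSlug(modelPart(entry)) === wanted);
  const [only] = matches;
  return matches.length === 1 && only !== undefined ? only : null;
}
```

With a catalog of `["acme/foo-1-5", "other/bar-2"]`, the id `acme/foo-1.5` resolves to `acme/foo-1-5`; if two catalog entries normalized to the same slug, the function would return `null` so the caller can skip the fallback.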
v0.8.2
Includes 0.8.2, 0.8.1, and 0.8.0 (rolled up).
Fixed
- Packaging: ship CLI runtime deps and verify pack installs core + cli tarballs before publish.
- Packaging: move CLI runtime deps into dependencies so npm installs run cleanly.
Breaking
- ESM-only: `@steipete/summarize` + `@steipete/summarize-core` no longer support CommonJS `require()`; the CLI binary is now ESM.
Highlights
- Chrome: add a real Side Panel extension (MV3) that summarizes the current tab and renders streamed Markdown.
- Daemon: add `summarize daemon …` (localhost server on `127.0.0.1:8787`) for extension ↔ CLI integration.
  - Autostart: macOS LaunchAgent, Linux systemd user service, Windows Scheduled Task
  - Token pairing (shared secret)
  - Streaming over SSE
  - Emit finish-line metrics over SSE (panel footer + hover details)
  - Commands: `install`, `status`, `restart`, `uninstall`, `run`
- Cache: add SQLite cache for transcripts/extractions/summaries with `--no-cache`, `--cache-stats`, `--clear-cache` + config (`cache.enabled`/`maxMb`/`ttlDays`/`path`).
  - Finish line shows “Cached” for summary cache hits (CLI + daemon/extension)
  - Daemon/Chrome stream cache status metadata (`summaryFromCache`)
Features
- YouTube: add `--youtube no-auto` to skip auto-generated captions and prefer creator-uploaded captions; fall back to `yt-dlp` transcription (thanks @dougvk!).
- CLI: add transcript → Markdown formatting via `--extract --format md --markdown-mode llm` (thanks @dougvk!).
- X/Twitter: auto-transcribe tweet videos via `yt-dlp`, using browser cookies (Chrome → Safari → Firefox) when available; set `TWITTER_COOKIE_SOURCE` / `TWITTER_*_PROFILE` to control cookie extraction order.
- Prompt overrides: add `--prompt`, `--prompt-file`, and config `prompt` to replace the default summary instructions.
- Chrome Side Panel: add length + language controls (presets + custom), forwarded to the daemon.
- Daemon API: `mode: "auto"` accepts both `url` + extracted page `text`; daemon picks the best pipeline (YouTube/podcasts/media → URL, otherwise prefer visible page text) with a fallback attempt.
- Daemon/Chrome: stream extra run metadata (`inputSummary`, `modelLabel`) over SSE for richer panel status.
- Core: expose lightweight URL helpers at `@steipete/summarize-core/content/url` (YouTube/Twitter/podcast/direct-media detection).
- Chrome Side Panel: new icon + extension `homepage_url` set to `summarize.sh`.
- Providers: add configurable API base URLs (config + env) for OpenAI/Anthropic/Google/xAI (thanks @bunchjesse for the nudge).
Improvements
- Chrome Side Panel: stream SSE from the panel (no MV3 background stalls), use runtime messaging to avoid “disconnected port” errors, and improve auto-summarize de-dupe.
- Chrome Side Panel UI: working status in header + 1px progress line (no layout jump), full-width subtitle, page title in header, idle subtitle shows `words/chars` (or media duration + words) + model, subtle metrics footer, continuous background, and native highlight/link accents.
- Daemon: prefer the installed env snapshot over launchd’s minimal environment (improves `yt-dlp` / `whisper.cpp` PATH reliability, especially for X/Twitter video transcription).
- X/Twitter: cookie handling now delegates to `yt-dlp --cookies-from-browser` (no sweet-cookie dependency).
- X/Twitter: skip yt-dlp transcript attempts for long-form tweet text (articles).
- Transcripts: show yt-dlp download progress bytes and stabilize totals to prevent bouncing progress bars.
- Finish line: show transcript source labels (`YouTube` / `podcast`) without repeating the label.
- Streaming: stop/clear progress UI before first streamed output and avoid leading blank lines on non-TTY stdout.
- URL flow: propagate `extracted.truncated` into the prompt context so summaries can reflect partial inputs.
- Daemon: unify URL/page summarization with the CLI flows (single code path; keeps extract/cache/model logic in sync).
- Prompts: auto-require Markdown section headings for longer summaries (xl/xxl or large custom lengths).
v0.7.1
Fixed
- Packaging: `@steipete/summarize-core` now ships a CJS build for `require()` consumers (fixes `pnpm dlx @steipete/summarize --help` and the published CLI runtime).
0.7.0 - 2025-12-26
Highlights
- Packages: split into `@steipete/summarize-core` (library) + `@steipete/summarize` (CLI; depends on core). Versions are lockstep.
- Streaming: scrollback-safe Markdown streaming (hybrid: line-by-line + block buffering for fenced code + tables). No cursor control, no full-frame redraws.
- Output: Markdown rendering is automatic on TTY; use `--plain` for raw Markdown/text output.
- Finish line: compact separators (`·`) and no duplicated `… words` when transcript stats are shown.
- YouTube: `--youtube auto` prefers `yt-dlp` transcription when available; Apify is last-last resort.
Fixed
- Streaming: flush newline-bounded output in `--plain` mode to avoid duplication with cumulative stream chunks.
- Website extraction: strip inline CSS before Readability to avoid extremely slow jsdom stylesheet parsing on some pages.
- Twitter/X: rotate Nitter hosts and skip Anubis PoW pages during tweet fallback.
Changed
- CLI: remove `--render`; add `--plain` to keep raw output (no ANSI/OSC rendering).
v0.6.1
0.6.1 - 2025-12-25
Changes
- YouTube: `--youtube auto` now falls back to `yt-dlp` if it’s on `PATH` (or `YT_DLP_PATH` is set) and a Whisper provider is available.
- `--version` now includes a short git SHA when available (build provenance).
- `--extract` now defaults to Markdown output (when `--format` is omitted), preferring Readability input.
- `--extract` no longer spends LLM tokens for Markdown conversion by default (unless `--markdown-mode llm` is used).
- `--format md` no longer forces Firecrawl; use `--firecrawl always` to force it.
- Finish line in `--extract` shows the extraction path (e.g. `markdown via readability`) and omits noisy `via html` output.
- Finish line always includes the model id when an LLM is used (including `--extract --markdown-mode llm`).
- `--extract` renders Markdown in TTY output (same renderer as summaries) when `--render auto|md` (use `--render plain` for raw Markdown).
- Suppress transcript progress/failure messages for non-YouTube / non-podcast URLs.
- Streaming now works with auto-selected models (including `--model free`) when `--stream on|auto`.
- Warn when `--length` is explicitly provided with `--extract` (ignored; no summary is generated).
v0.6.0
Features
- Podcasts (full episodes)
  - Support Apple Podcasts episode URLs via iTunes Lookup + enclosure transcription (avoids slow/blocked HTML).
  - Support Spotify episode URLs via the embed page (`/embed/episode/...`) to avoid recaptcha; fall back to iTunes RSS when embed audio is DRM/missing.
  - Prefer local `whisper.cpp` when installed + model available (no API keys required for transcription).
  - Whisper transcription works for any media URL (audio/video containers), not just YouTube.
- Language
  - Add `--language`/`--lang` (default: `auto`, match source language).
  - Add config support via `output.language` (legacy `language` still supported).
- Progress UI
  - Add two-phase progress for podcasts: media download + Whisper transcription progress.
  - Show transcript phases (YouTube caption/Apify/yt-dlp), provider + model, and media size/duration.
Changes
- Transcription
  - Add lenient ffmpeg transcode fallback for local Whisper when strict decode fails (e.g. Spotify AAC).
- Models
  - Add `zai/...` model alias with Z.AI base URL + chat completions by default.
  - Add `OPENAI_USE_CHAT_COMPLETIONS` + `openai.useChatCompletions` config toggle.
- Metrics / output
  - `--metrics on|detailed`: finish line includes compact transcript stats (… words, …) + media duration (when available); `--metrics detailed`: also prints input/transcript sizes + transcript source/provider/cache; hides `calls=1`.
  - Smarter duration formatting (`1h 13m 4s`, `44s`) and rounded transfer rates.
  - Make Markdown links terminal-clickable by materializing URLs.
  - `--metrics on|detailed` renders a single finish line with a compact transcript block (… words, …) before the model.
- Cost
  - Include OpenAI Whisper transcription estimate (duration-based) in the finish line total (`txcost=…`); configurable via `openai.whisperUsdPerMinute`.
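
The duration style quoted above (`1h 13m 4s`, `44s`) can be reproduced with a short sketch. The `formatDuration` name and the choice to omit zero-valued units are assumptions; only the two sample outputs come from the release notes.

```typescript
// Format a duration in seconds as compact "1h 13m 4s" style text.
// Assumption: zero-valued units are omitted, except that 0 renders as "0s".
function formatDuration(totalSeconds: number): string {
  const whole = Math.max(0, Math.round(totalSeconds));
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  const parts: string[] = [];
  if (hours > 0) parts.push(`${hours}h`);
  if (minutes > 0) parts.push(`${minutes}m`);
  if (seconds > 0 || parts.length === 0) parts.push(`${seconds}s`);
  return parts.join(" ");
}
```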
Docs
- Add `docs/language.md` and document language config + flag usage.
Tests
- Add JSON-LD graph extraction coverage.
- Extend live podcast-host coverage (Podchaser, Spreaker, Buzzsprout).
- Raise global branch coverage threshold to 75% and add regression coverage for podcast/language/progress paths.
v0.5.0
Features
- Model selection & presets
  - Automatic model selection (`--model auto`, now the default):
    - Chooses models based on input kind (website/YouTube/file/image/video/text) and prompt size.
    - Skips candidates without API keys; retries next model on request errors.
    - Adds OpenRouter fallback attempts when `OPENROUTER_API_KEY` is present.
    - Shows the chosen model in the progress UI.
  - Named model presets via config (`~/.summarize/config.json` → `models`), selectable as `--model <preset>`.
  - Built-in preset: `--model free` (OpenRouter `:free` candidates; override via `models.free`).
- OpenRouter free preset maintenance
  - `summarize refresh-free` regenerates `models.free` by scanning OpenRouter `:free` models and testing availability + latency.
  - `summarize refresh-free --set-default` also sets `"model": "free"` in `~/.summarize/config.json` (so free becomes your default).
- CLI models
  - Add `--cli <provider>` flag (equivalent to `--model cli/<provider>`).
  - `--cli` accepts case-insensitive providers and can be used without a provider to enable CLI auto selection.
- Content extraction
  - Website extraction detects video-only pages:
    - YouTube embeds switch to transcript extraction automatically.
    - Direct video URLs can be downloaded + summarized when `--video-mode auto|understand` and a Gemini key is available.
- Env
  - `.env` in the current directory is loaded automatically (so API keys work without exporting env vars).
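
The auto-selection behavior described in this release (skip candidates without API keys, retry the next model on request errors) can be sketched as a simple ordered fallback loop. The `Candidate` shape and `runWithFallback` name are hypothetical, and the sketch is synchronous for brevity, whereas real model calls would be asynchronous.

```typescript
interface Candidate {
  id: string;          // hypothetical model id, e.g. "provider/model"
  requiredEnv: string; // env var that must be set for this candidate
}

// Try candidates in order; skip those missing a key, move on after errors.
function runWithFallback<T>(
  candidates: Candidate[],
  env: Record<string, string | undefined>,
  attempt: (candidate: Candidate) => T,
): T {
  let lastError: unknown = new Error("no usable model candidates");
  for (const candidate of candidates) {
    if (!env[candidate.requiredEnv]) continue; // no API key: skip without calling
    try {
      return attempt(candidate);
    } catch (error) {
      lastError = error; // remember the failure and try the next candidate
    }
  }
  throw lastError;
}
```

If every candidate is skipped or fails, the last error is rethrown so the caller can report why nothing succeeded.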
Changes
- CLI config
  - Auto mode uses CLI models only when `cli.enabled` is set; order follows the list.
  - `cli.enabled` is an allowlist for CLI usage.
- OpenRouter
  - Stop sending extra routing headers.
  - `--model free`: when OpenRouter rejects routing with “No allowed providers”, print the exact provider names to allow and suggest running `summarize refresh-free`.
  - `--max-output-tokens`: when explicitly set, it is also forwarded to OpenRouter calls.
- Refresh Free
  - Default extra runs reduced to 2 (total runs = 1 + runs) to reduce rate-limit pressure.
  - Filter `:free` candidates by recency (default: last 180 days; configurable via `--max-age-days`).
  - Print `ctx`/`out` in `k` units for readability.
- Defaults
  - Default summary length is now `xl`.
Fixes
- LLM / OpenRouter
  - LLM request retries (`--retries`) and clearer timeout errors.
  - `summarize refresh-free`: detect OpenRouter free-model rate limits and back off + retry.
- Streaming
  - Normalize + de-dupe overlapping chunks to prevent repeated sections in live Markdown output.
- YouTube
  - Prefer manual captions over auto-generated when both exist. Thanks @dougvk.
  - Always summarize YouTube transcripts in auto mode (instead of printing the transcript).
- Prompting & metrics
  - Don’t “pad” beyond input length when asking for longer summaries.
  - `--metrics detailed`: fold metrics into finish line and make labels less cryptic.
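
The streaming fix for overlapping chunks can be pictured with a small sketch that merges each incoming chunk into the accumulated text. The `appendChunk` name and the longest-overlap heuristic are assumptions for illustration, not the project's actual algorithm.

```typescript
// Merge a streamed chunk into the text accumulated so far.
// Handles two cases: cumulative chunks (the provider resends everything)
// and chunks whose start repeats the tail of what was already received.
function appendChunk(accumulated: string, chunk: string): string {
  if (chunk.startsWith(accumulated)) return chunk; // cumulative resend
  const maxOverlap = Math.min(accumulated.length, chunk.length);
  for (let overlap = maxOverlap; overlap > 0; overlap--) {
    if (accumulated.endsWith(chunk.slice(0, overlap))) {
      return accumulated + chunk.slice(overlap); // drop the repeated prefix
    }
  }
  return accumulated + chunk; // no overlap: plain append
}
```

A known weakness of this heuristic is that a legitimate repetition at a chunk boundary (for example `"a"` followed by `"a"`) is indistinguishable from an overlap, so a production version would likely need provider-specific knowledge of how chunks are emitted.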
Docs
- Add documentation for presets and Refresh Free.
- Add a “make free the default” quick start for `summarize refresh-free --set-default`.
- Add a manual end-to-end checklist (`docs/manual-tests.md`).
- Add a quick CLI smoke checklist (`docs/smoketest.md`).
- Document CLI ordering and model selection behavior.
Tests
- Add coverage for presets and Refresh Free regeneration.
- Add live coverage for the `free` preset.
- Add regression coverage for YouTube transcript handling and metrics formatting.
v0.4.0
Changes
- Add URL extraction mode via `--extract` (deprecated alias: `--extract-only`) with `--format md|text`.
- Rename HTML→Markdown conversion flag to `--markdown-mode` (deprecated alias: `--markdown`).
- Add `--preprocess off|auto|always` and a `uvx markitdown` fallback for Markdown extraction + unsupported file attachments (when `--format md` is used).
- When `uvx` isn’t available, print an install hint (`brew install uv`).
Tests
- Add coverage for preprocess + markitdown integration paths.