Skip to content

Releases: steipete/summarize

v0.12.0

12 Mar 00:10

Choose a tag to compare

0.12.0 - 2026-03-11

Features

  • Models: add nvidia/... provider alias (uses NVIDIA_API_KEY + optional NVIDIA_BASE_URL) for NVIDIA OpenAI-compatible endpoints.

Fixes

  • Transcription: add AssemblyAI as a first-class remote provider across direct media, podcast/RSS, and yt-dlp YouTube fallback; refactor remote fallback ordering, expand config/env support (ASSEMBLYAI_API_KEY, legacy apiKeys.assemblyai), and add AssemblyAI unit + live coverage (#126).
  • X/Twitter: prefer xurl for tweet extraction when installed, fall back to bird, preserve long-form/article text plus media URLs, add live xurl extraction/media coverage, and replace the stale dead-bird install tip with a current X CLI recommendation (#70).
  • Models: make daemon agent artifacts schemas Gemini-safe, improve Google empty-response handling with preview-to-stable fallback, and switch CLI/auto Gemini defaults away from brittle preview behavior (#82, #96).
  • Agents: expand model auto-resolution errors with checked models, missing env/CLI setup, and daemon restart guidance (#107).
  • Daemon: support multiple saved extension tokens, migrate legacy single-token configs, and accept any configured token for auth (#116).
  • Chrome extension: harden side-panel slides so SSE keepalives no longer false-time out, seeded placeholders no longer block pending/cached slide runs, retries can start a fresh summarize+slides run, and reruns replace stale slide state.
  • Chrome extension: refactor side-panel navigation/run attachment policy so late summary/slide runs no longer attach to the wrong page after tab or URL switches, and expand headless regression coverage for pending-run resume and slide-mode transitions.
  • Chrome extension: default fresh installs to slide mode, keep passive tab navigation out of chat, and align slide cards with CLI --slides by preferring per-slide summary text over raw transcript/OCR fallback.
  • Chrome extension tests: add stronger YouTube slide E2E coverage for loaded images, summary-backed slide text, and switching between videos mid-analysis without stale slide-summary bleed.
  • Chrome extension: isolate slide-summary stream callbacks per run and harden Playwright settings hydration so late events no longer blank slide text when switching videos mid-analysis.
  • Transcription: add Gemini audio/video transcription support across direct media, podcast/RSS, and yt-dlp YouTube fallback, including Files API uploads for larger media plus new Gemini live coverage (#89).
  • npm packaging: publish CLI with pnpm publish so @steipete/summarize-core is version-pinned in published metadata (no workspace:* in registry package).
  • Slides: detect WezTerm as an iTerm-compatible terminal for inline slide images in --slides mode. (#133) — thanks @doodaaatimmy-creator.
  • CLI help: surface summarize refresh-free in summarize help output.
  • CLI: report CLI provider timeouts explicitly, including the duration, command, and a --timeout hint instead of collapsing them into generic exec failures (#100, thanks @christophsturm).
  • Daemon: restrict CORS responses to trusted extension and localhost origins, with regression coverage for allowed and denied Origin headers (#108, thanks @sebastiondev).
  • Transcription: chunk oversized Groq Whisper uploads with ffmpeg in file mode instead of failing out on files above the 30MB limit (#134, thanks @WinnCook).
  • Docs: tighten landing-page mobile layout so hero, cards, code blocks, and nav stay readable on narrow screens (#118, thanks @Acidias).
  • Release: build macOS x64 Bun artifacts and add regression coverage for Homebrew formula rewrites during dual-arch releases (#122, thanks @androidshu).
  • YouTube: tighten hostname validation across core, slides, and extension helpers so attacker-controlled lookalike hosts are no longer treated as YouTube URLs (#91, thanks @RinZ27).
  • Config: honor zai.baseUrl config fallback for blank env values and keep Z.AI base URL overrides working outside the summary flow (#102, thanks @liuy).
  • Chrome extension: tighten options and sidepanel UI spacing, copy actions, and advanced-controls layout for a cleaner panel experience (#86, thanks @morozRed).
  • Slides: warn in summary mode when --slides dependencies are missing, and document required local installs for ffmpeg, yt-dlp, and optional tesseract.
  • Docs: fix broken docs index links by setting an empty Jekyll baseurl (#113, thanks @Youpen-y).
  • Models: preserve model id casing after the provider prefix so OpenAI-compatible proxies can route exact names correctly (#128, thanks @WinnCook).
  • Cache: give extract entries with unavailable transcripts the same short retry TTL as negative transcript cache entries, so transient Apify failures can recover (#115, thanks @gluneau).
  • Daemon: apply the saved env snapshot to process.env before daemon run starts so child tools inherit the right PATH and API/tool config under launchd/systemd (#99, thanks @heyalchang).
  • Chrome automation: require sidepanel arming before debugger-backed native input can run in a tab, and auto-disarm after browser JS execution ends (#129, thanks @omnicoder9).
  • Media setup: fix the local whisper.cpp install hint to use the current Homebrew formula name whisper-cpp (#92, thanks @zerone0x).
  • CLI output: cap markdown render width on very wide terminals by default, with a --width override for manual control (#119, thanks @howardpen9).
  • Slides: size inline slide images from terminal width instead of keeping them pinned to 32 columns, capped at 2x the previous width while preserving COLUMNS fallback behavior (#125, #135, thanks @WinnCook).
  • Shell completions: add Fish shell completions for the current CLI flags and option values (#95, thanks @fbehrens).
  • Bun fetch: only opt into compressed HTML/YouTube responses when running under Bun, and retry link-preview fetches with Accept-Encoding: identity after Bun decompression failures (#105, thanks @maciej).
  • Daemon: support cli/... models in chat and agent endpoints, including CLI auto-fallback when no API-key transport is available (#109, thanks @jetm).

v0.11.1

14 Feb 03:22

Choose a tag to compare

0.11.1 - 2026-02-14

Fixes

  • npm packaging: publish CLI with pnpm publish so @steipete/summarize-core is version-pinned in published metadata (no workspace:* in registry package).

0.11.0 - 2026-02-14

Highlights

  • Auto CLI fallback: new controls and persisted last-success provider state (~/.summarize/cli-state.json) for no-key/local-CLI workflows.
  • Transcription reliability: Groq Whisper is now the preferred cloud transcriber, with custom OpenAI-compatible Whisper endpoint overrides.
  • Input reliability: binary-safe stdin handling, local media support in --extract, and fixes for local-file hangs/PDF preprocessing on custom OpenAI base URLs.

Features

  • CLI: add Cursor Agent provider (--cli agent) for CLI-model execution.
  • CLI auto mode: add implicit auto CLI fallback controls (cli.autoFallback, --auto-cli-fallback) and provider priority controls (cli.providers, --cli-priority), with persisted provider success ordering.
  • Transcription: add Groq Whisper as preferred cloud provider (#71, thanks @n0an).
  • Transcription: support custom OpenAI-compatible Whisper endpoints via OPENAI_WHISPER_BASE_URL (with safe OPENAI_BASE_URL fallback) (#65, thanks @toanbot).
  • Config: support generic env defaults in ~/.summarize/config.json (fallback for any env var), while keeping legacy apiKeys mapping for compatibility (#63, thanks @entropyy0).

Fixes

  • CLI local files: avoid hangs when stream usage never resolves and preprocess PDFs automatically for custom OpenAI-compatible OPENAI_BASE_URL endpoints (e.g. non-api.openai.com).
  • CLI stdin: support binary-safe piping/input temp files to prevent corruption on non-text stdin (#76).
  • Extract mode: allow --extract for local media files (#72).
  • Auto model/daemon fallback: skip model attempts when required API keys are missing and normalize env-key checks in daemon fallback (#67, #78).
  • Cache: for auto presets (auto/free/named auto), prefer preset-level winner cache entries so stale per-candidate cache hits don’t override newer better-model results.
  • Media: treat X broadcasts (/i/broadcasts/...) as transcript-first media and prefer URL mode.
  • YouTube: keep explicit --youtube apify working when HTML fetch fails, while preserving duration metadata parity (#64, thanks @entropyy0).
  • Transcription: stabilize Groq-first fallback flow (no duplicate Groq retries in file mode), improve terminal error reporting, and surface Groq setup in media guidance (#71, thanks @n0an).
  • Media detection: detect more direct media URL extensions including .ogg/.opus (#65, thanks @toanbot).
  • Slides: allow yt-dlp cookies-from-browser via SUMMARIZE_YT_DLP_COOKIES_FROM_BROWSER to avoid YouTube 403s.
  • Daemon install: resolve symlinked/global bin paths and Windows shims when locating the CLI for install (#57, #62, thanks @entropyy0).
  • Extraction: strip hidden HTML + invisible Unicode before summarization or extract output (#61).
  • CLI: honor --lang for YouTube transcript→Markdown conversion in --markdown-mode llm (#56, thanks @entropyy0).
  • LLM: map Anthropic bare model ids to versioned aliases (claude-sonnet-4claude-sonnet-4-0) (#55, thanks @entropyy0).

Improvements

  • Tooling: remove Biome and standardize on oxfmt + type-aware oxlint; pnpm check now enforces format:check before lint/tests.
  • Dependencies: update workspace dependencies to latest (including @mariozechner/pi-ai and oxlint-tsgolint).

v0.10.0

22 Jan 11:53

Choose a tag to compare

Highlights

  • Chrome Side Panel: Chat mode with metrics bar, message queue, and improved context (full transcript + summary metadata, jump-to-latest).
  • Slides: YouTube slide screenshots + OCR + transcript-aligned cards, timestamped seek, and an OCR/Transcript toggle.
  • Media-aware summarization in the Side Panel: Page vs Video/Audio dropdown, automatic media preference on video sites, plus visible word count/duration.
  • CLI: robust URL + media extraction with transcript-first workflows and cache-aware streaming.

Features

  • Slides: extract slide screenshots + OCR for YouTube/direct video URLs in the CLI + extension (#41, thanks @philippb).
  • Slides: top-of-summary slide strip with expand/collapse full-width cards, timestamps, and click-to-seek.
  • Slides: slide descriptions without model calls (transcript windowing, OCR fallback) + OCR/Transcript toggle.
  • Slides: stream slide extraction status/progress and show a single header progress bar (no duplicate spinners).
  • Chrome Side Panel chat: stream agent replies over SSE and restore chat history from daemon cache (#33, thanks @dougvk).
  • Chrome Side Panel chat: timestamped transcript context plus clickable [mm:ss] links that seek the current media.
  • Summaries: when transcript timestamps are available, prompts require timestamped bullet summaries; side panel auto-links [mm:ss] in summaries for media.
  • Transcripts: --timestamps adds segment-level timings (transcriptSegments + transcriptTimedText) for YouTube, podcasts, and embedded captions.
  • Media-aware summarization in the Side Panel: Page vs Video/Audio dropdown, automatic media preference on video sites, plus visible word count/duration.
  • CLI: transcribe local audio/video files with mtime-aware transcript cache invalidation (thanks @mvance!).
  • Browser extension: add Firefox sidebar build + multi-browser config (#31, thanks @vlnd0).
  • Chrome automation: add artifacts tool + REPL helpers for persistent session files (notes/JSON/CSV) and downloads.
  • Chrome automation: expand navigate tool with list/switch tab support and return matching skills after navigation.

Fixes

  • Prompts: ignore sponsor/ads segments in video and podcast summaries.
  • Prompts: enforce no-ads/no-skipped language and italicized standout excerpts (no quotation marks).
  • Media: route direct media URLs to the transcription pipeline and raise the local media limit to 2GB (#47, thanks @n0an).
  • Slides: render Slide X/Y labels and parse slide markers more robustly in streaming output.
  • Slides: ensure slide summary segments start with a title line when missing.
  • Slides: progress updates during yt-dlp downloads and OSC progress mirrors slide extraction.
  • Slides: reuse the media cache for downloaded videos (even with --no-cache).
  • Slides: clear slide progress line before the finish summary to avoid stray Slides x/y output.
  • Slides: parse Slide N/Total labels and stabilize title/body extraction.
  • CLI: --no-cache now bypasses summary caching only; transcript/media caches still apply.
  • Chrome Side Panel chat: keep auto-scroll pinned while streaming when you’re already at the bottom.
  • Chrome Side Panel: scope streams/state per window so other windows don’t wipe active summaries.
  • Chrome Side Panel chat: support JSON agent replies with explicit SSE/JSON negotiation to avoid “stream ended” errors.
  • Chrome Side Panel chat: clear streaming placeholders on errors/aborts.
  • Chrome Side Panel: add inline error toast above chat composer; errors stay visible when scrolled.
  • Chrome Side Panel: clear/hide the inline error toast when no message is present to avoid empty red boxes.
  • Cache: include transcript timestamp requests in extract cache keys so timed summaries don’t reuse plain transcript content.
  • Extract-only: remove implicit 8k cap; new --max-extract-characters/daemon maxExtractCharacters allow opt-in limits; resolves transcript truncation.
  • Automation: require userScripts (no isolated-world fallback), with improved guidance and in-panel permission notice.
  • Daemon: avoid URL flow crashes when url-preference helpers are missing (ReferenceError guard).
  • CLI: clear OSC progress on SIGINT/SIGTERM to avoid stuck indicators.
  • Slides: detect headline-style first lines and render them as slide titles (no required Title: markers).
  • YouTube: prefer English caption variants (en-*) when selecting caption tracks.

Improvements

  • Daemon: emit slides start/progress/done metadata in extended logging for easier debugging.
  • Media: refactor routing helpers and size policy (#48, thanks @steipete).
  • CLI: show determinate transcription progress percent when duration is known.
  • CLI: theme transcription progress lines and mirror part-based progress to OSC when duration is unknown.
  • CLI: show determinate OSC progress for transcription/download when totals are known.
  • CLI: keep OSC progress determinate when recent percent updates are available.
  • CLI: theme tweet/extraction progress lines for consistent loading indicators.
  • CLI: theme file/slide spinner labels so all progress lines share the same styling.
  • CLI: simplify media download labels (avoid “media, video” duplication).
  • Transcription: add auto transcriber selection (default) with ONNX-first when configured + summarize transcriber setup.
  • Slides: cap auto slide targets at 6 by default for long videos.
  • CLI: add themed output (24-bit ANSI), --theme, and config/env defaults for a consistent color scheme.
  • Cache: add media download caching with TTL/size caps + optional verification, plus --no-media-cache.
  • Slides: render headline-style first lines as slide titles above the slide marker.
  • Prompts: allow straight quotes and encourage 1-2 short exact quotes when relevant.

Docs

  • README: 0.10.0 preview layout with clearer install flow, daemon rationale, and prominent Chrome Web Store link.
  • README: document ONNX transcriber setup + auto selection.
  • README/docs: add UI theme config + ONNX install hints.

v0.9.0

31 Dec 00:49

Choose a tag to compare

0.9.0 - 2025-12-31

Highlights

  • Chrome Side Panel: Chat mode with metrics bar, message queue, and improved context (full transcript + summary metadata, jump-to-latest, smoother auto-scroll).
  • Media-aware summarization in the Side Panel: Page vs Video/Audio dropdown, automatic media preference on video sites, plus visible word count/duration.
  • Chrome extension: optional hover tooltip summaries for links (advanced setting, default off; experimental) with prompt customization.

Improvements

  • PDF + asset handling: send PDFs directly to Anthropic/OpenAI/Gemini when supported; generic PDF attachments and better media URL detection.
  • Daemon: /v1/chat + extractOnly, version in health/status pill, optional JSON log with rotation, and more resilient restart/install health checks.
  • Side Panel: advanced model row with “Scan free” (shows top free model after scan), a refresh summary control (cache bypass), plus richer length tooltips.
  • Side Panel UX: consolidated advanced layout and typography controls (font size A/AA, line-height), streamlined setup panel with inline copy, clearer status text, and tighter model/length controls.
  • Side Panel UX: keep the Auto summarize toggle on one line in Advanced.
  • Streaming/metrics polish: faster stream flushes, shorter OpenRouter labels on wrap, and improved extraction metadata in chat.

Fixes

  • Auto model selection: OpenRouter fallback now resolves provider-specific ids (dash/dot slug normalization) and skips fallback when no unique match.
  • Language auto: default to English when detection is uncertain.
  • OpenAI GPT-5: skip temperature in streaming requests to avoid 400s for unsupported params.
  • Side Panel stability: retryable stream errors, no abort crash, auto-summarize on open/source switch, synced chat toggle state, and caret alignment.
  • YouTube duration handling: player API/HTML/yt-dlp fallbacks, transcript metadata propagation, and extension duration fallbacks.
  • URL extraction: preserve final redirected URLs so shorteners (t.co) summarize the real destination.
  • Hover summaries: proxy localhost daemon calls to avoid Chrome “Local network access” prompts.
  • Install: use npm releases for osc-progress/tokentally instead of git deps.

v0.8.2

28 Dec 16:47

Choose a tag to compare

Includes 0.8.2, 0.8.1, and 0.8.0 (rolled up).

Fixed

  • Packaging: ship CLI runtime deps and verify pack installs core + cli tarballs before publish.
  • Packaging: move CLI runtime deps into dependencies so npm installs run cleanly.

Breaking

  • ESM-only: @steipete/summarize + @steipete/summarize-core no longer support CommonJS require(); the CLI binary is now ESM.

Highlights

  • Chrome: add a real Side Panel extension (MV3) that summarizes the current tab and renders streamed Markdown.
  • Daemon: add summarize daemon … (localhost server on 127.0.0.1:8787) for extension ↔ CLI integration.
    • Autostart: macOS LaunchAgent, Linux systemd user service, Windows Scheduled Task
    • Token pairing (shared secret)
    • Streaming over SSE
    • Emit finish-line metrics over SSE (panel footer + hover details)
    • Commands: install, status, restart, uninstall, run
  • Cache: add SQLite cache for transcripts/extractions/summaries with --no-cache, --cache-stats, --clear-cache + config (cache.enabled/maxMb/ttlDays/path).
    • Finish line shows “Cached” for summary cache hits (CLI + daemon/extension)
    • Daemon/Chrome stream cache status metadata (summaryFromCache)

Features

  • YouTube: add --youtube no-auto to skip auto-generated captions and prefer creator-uploaded captions; fall back to yt-dlp transcription (thanks @dougvk!).
  • CLI: add transcript → Markdown formatting via --extract --format md --markdown-mode llm (thanks @dougvk!).
  • X/Twitter: auto-transcribe tweet videos via yt-dlp, using browser cookies (Chrome → Safari → Firefox) when available; set TWITTER_COOKIE_SOURCE / TWITTER_*_PROFILE to control cookie extraction order.
  • Prompt overrides: add --prompt, --prompt-file, and config prompt to replace the default summary instructions.
  • Chrome Side Panel: add length + language controls (presets + custom), forwarded to the daemon.
  • Daemon API: mode: "auto" accepts both url + extracted page text; daemon picks the best pipeline (YouTube/podcasts/media → URL, otherwise prefer visible page text) with a fallback attempt.
  • Daemon/Chrome: stream extra run metadata (inputSummary, modelLabel) over SSE for richer panel status.
  • Core: expose lightweight URL helpers at @steipete/summarize-core/content/url (YouTube/Twitter/podcast/direct-media detection).
  • Chrome Side Panel: new icon + extension homepage_url set to summarize.sh.
  • Providers: add configurable API base URLs (config + env) for OpenAI/Anthropic/Google/xAI (thanks @bunchjesse for the nudge).

Improvements

  • Chrome Side Panel: stream SSE from the panel (no MV3 background stalls), use runtime messaging to avoid “disconnected port” errors, and improve auto-summarize de-dupe.
  • Chrome Side Panel UI: working status in header + 1px progress line (no layout jump), full-width subtitle, page title in header, idle subtitle shows words/chars (or media duration + words) + model, subtle metrics footer, continuous background, and native highlight/link accents.
  • Daemon: prefer the installed env snapshot over launchd’s minimal environment (improves yt-dlp / whisper.cpp PATH reliability, especially for X/Twitter video transcription).
  • X/Twitter: cookie handling now delegates to yt-dlp --cookies-from-browser (no sweet-cookie dependency).
  • X/Twitter: skip yt-dlp transcript attempts for long-form tweet text (articles).
  • Transcripts: show yt-dlp download progress bytes and stabilize totals to prevent bouncing progress bars.
  • Finish line: show transcript source labels (YouTube / podcast) without repeating the label.
  • Streaming: stop/clear progress UI before first streamed output and avoid leading blank lines on non-TTY stdout.
  • URL flow: propagate extracted.truncated into the prompt context so summaries can reflect partial inputs.
  • Daemon: unify URL/page summarization with the CLI flows (single code path; keeps extract/cache/model logic in sync).
  • Prompts: auto-require Markdown section headings for longer summaries (xl/xxl or large custom lengths).

v0.7.1

26 Dec 23:23

Choose a tag to compare

Fixed

  • Packaging: @steipete/summarize-core now ships a CJS build for require() consumers (fixes pnpm dlx @steipete/summarize --help and the published CLI runtime).

0.7.0 - 2025-12-26

Highlights

  • Packages: split into @steipete/summarize-core (library) + @steipete/summarize (CLI; depends on core). Versions are lockstep.
  • Streaming: scrollback-safe Markdown streaming (hybrid: line-by-line + block buffering for fenced code + tables). No cursor control, no full-frame redraws.
  • Output: Markdown rendering is automatic on TTY; use --plain for raw Markdown/text output.
  • Finish line: compact separators (·) and no duplicated … words when transcript stats are shown.
  • YouTube: --youtube auto prefers yt-dlp transcription when available; Apify is last-last resort.

Fixed

  • Streaming: flush newline-bounded output in --plain mode to avoid duplication with cumulative stream chunks.
  • Website extraction: strip inline CSS before Readability to avoid extremely slow jsdom stylesheet parsing on some pages.
  • Twitter/X: rotate Nitter hosts and skip Anubis PoW pages during tweet fallback.

Changed

  • CLI: remove --render; add --plain to keep raw output (no ANSI/OSC rendering).

v0.6.1

25 Dec 14:56

Choose a tag to compare

0.6.1 - 2025-12-25

Changes

  • YouTube: --youtube auto now falls back to yt-dlp if it’s on PATH (or YT_DLP_PATH is set) and a Whisper provider is available.
  • --version now includes a short git SHA when available (build provenance).
  • --extract now defaults to Markdown output (when --format is omitted), preferring Readability input.
  • --extract no longer spends LLM tokens for Markdown conversion by default (unless --markdown-mode llm is used).
  • --format md no longer forces Firecrawl; use --firecrawl always to force it.
  • Finish line in --extract shows the extraction path (e.g. markdown via readability) and omits noisy via html output.
  • Finish line always includes the model id when an LLM is used (including --extract --markdown-mode llm).
  • --extract renders Markdown in TTY output (same renderer as summaries) when --render auto|md (use --render plain for raw Markdown).
  • Suppress transcript progress/failure messages for non-YouTube / non-podcast URLs.
  • Streaming now works with auto-selected models (including --model free) when --stream on|auto.
  • Warn when --length is explicitly provided with --extract (ignored; no summary is generated).

v0.6.0

25 Dec 03:00

Choose a tag to compare

Features

  • Podcasts (full episodes)
    • Support Apple Podcasts episode URLs via iTunes Lookup + enclosure transcription (avoids slow/blocked HTML).
    • Support Spotify episode URLs via the embed page (/embed/episode/...) to avoid recaptcha; fall back to iTunes RSS when embed audio is DRM/missing.
    • Prefer local whisper.cpp when installed + model available (no API keys required for transcription).
    • Whisper transcription works for any media URL (audio/video containers), not just YouTube.
  • Language
    • Add --language/--lang (default: auto, match source language).
    • Add config support via output.language (legacy language still supported).
  • Progress UI
    • Add two-phase progress for podcasts: media download + Whisper transcription progress.
    • Show transcript phases (YouTube caption/Apify/yt-dlp), provider + model, and media size/duration.

Changes

  • Transcription

    • Add lenient ffmpeg transcode fallback for local Whisper when strict decode fails (e.g. Spotify AAC).
  • Models

    • Add zai/... model alias with Z.AI base URL + chat completions by default.
    • Add OPENAI_USE_CHAT_COMPLETIONS + openai.useChatCompletions config toggle.
  • Metrics / output

    • --metrics on|detailed: finish line includes compact transcript stats (… words, …) + media duration (when available); --metrics detailed: also prints input/transcript sizes + transcript source/provider/cache; hides calls=1.
    • Smarter duration formatting (1h 13m 4s, 44s) and rounded transfer rates.
    • Make Markdown links terminal-clickable by materializing URLs.
    • --metrics on|detailed renders a single finish line with a compact transcript block (… words, …) before the model.
  • Cost

    • Include OpenAI Whisper transcription estimate (duration-based) in the finish line total (txcost=…); configurable via openai.whisperUsdPerMinute.

Docs

  • Add docs/language.md and document language config + flag usage.

Tests

  • Add JSON-LD graph extraction coverage.
  • Extend live podcast-host coverage (Podchaser, Spreaker, Buzzsprout).
  • Raise global branch coverage threshold to 75% and add regression coverage for podcast/language/progress paths.

v0.5.0

24 Dec 00:10

Choose a tag to compare

Features

  • Model selection & presets
    • Automatic model selection (--model auto, now the default):
      • Chooses models based on input kind (website/YouTube/file/image/video/text) and prompt size.
      • Skips candidates without API keys; retries next model on request errors.
      • Adds OpenRouter fallback attempts when OPENROUTER_API_KEY is present.
      • Shows the chosen model in the progress UI.
    • Named model presets via config (~/.summarize/config.jsonmodels), selectable as --model <preset>.
    • Built-in preset: --model free (OpenRouter :free candidates; override via models.free).
  • OpenRouter free preset maintenance
    • summarize refresh-free regenerates models.free by scanning OpenRouter :free models and testing availability + latency.
    • summarize refresh-free --set-default also sets "model": "free" in ~/.summarize/config.json (so free becomes your default).
  • CLI models
    • Add --cli <provider> flag (equivalent to --model cli/<provider>).
    • --cli accepts case-insensitive providers and can be used without a provider to enable CLI auto selection.
  • Content extraction
    • Website extraction detects video-only pages:
      • YouTube embeds switch to transcript extraction automatically.
      • Direct video URLs can be downloaded + summarized when --video-mode auto|understand and a Gemini key is available.
  • Env
    • .env in the current directory is loaded automatically (so API keys work without exporting env vars).

Changes

  • CLI config
    • Auto mode uses CLI models only when cli.enabled is set; order follows the list.
    • cli.enabled is an allowlist for CLI usage.
  • OpenRouter
    • Stop sending extra routing headers.
    • --model free: when OpenRouter rejects routing with “No allowed providers”, print the exact provider names to allow and suggest running summarize refresh-free.
    • --max-output-tokens: when explicitly set, it is also forwarded to OpenRouter calls.
  • Refresh Free
    • Default extra runs reduced to 2 (total runs = 1 + runs) to reduce rate-limit pressure.
    • Filter :free candidates by recency (default: last 180 days; configurable via --max-age-days).
    • Print ctx/out in k units for readability.
  • Defaults
    • Default summary length is now xl.

Fixes

  • LLM / OpenRouter
    • LLM request retries (--retries) and clearer timeout errors.
    • summarize refresh-free: detect OpenRouter free-model rate limits and back off + retry.
  • Streaming
    • Normalize + de-dupe overlapping chunks to prevent repeated sections in live Markdown output.
  • YouTube
    • Prefer manual captions over auto-generated when both exist. Thanks @dougvk.
    • Always summarize YouTube transcripts in auto mode (instead of printing the transcript).
  • Prompting & metrics
    • Don’t “pad” beyond input length when asking for longer summaries.
    • --metrics detailed: fold metrics into finish line and make labels less cryptic.

Docs

  • Add documentation for presets and Refresh Free.
  • Add a “make free the default” quick start for summarize refresh-free --set-default.
  • Add a manual end-to-end checklist (docs/manual-tests.md).
  • Add a quick CLI smoke checklist (docs/smoketest.md).
  • Document CLI ordering and model selection behavior.

Tests

  • Add coverage for presets and Refresh Free regeneration.
  • Add live coverage for the free preset.
  • Add regression coverage for YouTube transcript handling and metrics formatting.

v0.4.0

21 Dec 00:09

Choose a tag to compare

Changes

  • Add URL extraction mode via --extract (deprecated alias: --extract-only) with --format md|text.
  • Rename HTML→Markdown conversion flag to --markdown-mode (deprecated alias: --markdown).
  • Add --preprocess off|auto|always and a uvx markitdown fallback for Markdown extraction + unsupported file attachments (when --format md is used).
  • When uvx isn’t available, print an install hint (brew install uv).

Tests

  • Add coverage for preprocess + markitdown integration paths.