scripts/wiki-synthesis: default LLM_MODEL (haiku-4-5, 200k ctx) silently fails on corpora >~200k tokens #316

At commit `151a8d1c922ffadad08399508efe46b207a5894e`:

```js
// scripts/wiki-synthesis/synthesize-wiki.mjs (around L194)
LLM_MODEL: fileEnv.LLM_MODEL || process.env.LLM_MODEL || "anthropic/claude-haiku-4-5",
```

`synthesize-wiki.mjs` serializes all thoughts within a year bucket into one LLM call (per `--topic`). The default model is Haiku 4.5, which has a 200k context window.

For an 8-employee deployment with ~366 thoughts in the corpus, `--topic autobiography` produces a ~203k-token prompt and aborts:

```
LLM 400: ContextWindowExceededError: prompt is too long: 203107 tokens > 200000 maximum
model=anthropic/claude-haiku-4-5
```

This will hit any deployment whose year bucket exceeds ~200k tokens, a threshold this corpus crossed at only ~366 thoughts.
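For a rough sense of scale, the figures from the error above can be turned into a per-thought average. This is a back-of-the-envelope sketch, assuming all ~366 thoughts ended up in the failing prompt and that fixed prompt overhead is folded into the average; the variable names are illustrative and not taken from `synthesize-wiki.mjs`.

```javascript
// Figures from the error message above.
const observedPromptTokens = 203107; // reported by the provider
const observedThoughts = 366;        // approximate corpus size
const contextLimit = 200000;         // Haiku 4.5 context window

// Average tokens per thought, including any fixed prompt overhead.
const tokensPerThought = observedPromptTokens / observedThoughts;

// Largest bucket that would still fit at that average.
// Ignores the room the model also needs for its output.
const maxThoughts = Math.floor(contextLimit / tokensPerThought);

console.log(tokensPerThought.toFixed(0)); // 555
console.log(maxThoughts);                 // 360
```

At that average, a single year bucket stops fitting at roughly 360 thoughts, before any output tokens are reserved.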

## Suggested directions (pick one)

1. **Bump default to a 1M-context model.** `anthropic/claude-sonnet-4-6` with the `anthropic-beta: context-1m-2025-08-07` header gives 1M tokens for ~5x the cost. Topic-wiki runs are typically weekly per topic, so the marginal cost is small in absolute dollars.
2. **Add a `--bucket-size N` flag** to chunk year buckets when the input exceeds a target token count. Keeps Haiku as the default, at the expense of single-call narrative coherence, since each bucket is then synthesized across several calls.
3. **At minimum: README note + early validation.** Sum input tokens before the call, abort with a friendly error pointing at the model-choice trade-off, rather than the current behavior (one wasted LLM call, then a crash).
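Directions 2 and 3 could look roughly like the sketch below. It is not code from `synthesize-wiki.mjs`: the helper names (`estimateTokens`, `assertFitsContext`, `chunkByTokenBudget`), the `reservedForOutput` default, and the 4-characters-per-token heuristic are all assumptions for illustration. A real implementation would probably want the provider's own token count, since the heuristic can be off by a wide margin.

```javascript
// Assumption: ~4 characters per token, a coarse heuristic for English text.
const CHARS_PER_TOKEN = 4;

// Cheap local estimate; not a substitute for the provider's tokenizer.
function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Direction 3: fail before the LLM call if the prompt cannot fit.
// `reservedForOutput` is a hypothetical allowance for the model's reply.
function assertFitsContext(prompt, { model, contextLimit, reservedForOutput = 8000 }) {
  const estimated = estimateTokens(prompt);
  const budget = contextLimit - reservedForOutput;
  if (estimated > budget) {
    throw new Error(
      `Prompt is ~${estimated} tokens but ${model} has room for ~${budget}. ` +
        `Set LLM_MODEL to a larger-context model or split the bucket.`
    );
  }
  return estimated;
}

// Direction 2: greedily pack thoughts, in order, into chunks under a budget.
// A single thought larger than the budget still gets a chunk of its own.
function chunkByTokenBudget(thoughts, maxTokens) {
  const chunks = [];
  let current = [];
  let currentTokens = 0;
  for (const thought of thoughts) {
    const tokens = estimateTokens(thought);
    if (current.length > 0 && currentTokens + tokens > maxTokens) {
      chunks.push(current);
      current = [];
      currentTokens = 0;
    }
    current.push(thought);
    currentTokens += tokens;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Usage: three thoughts of ~100 tokens each with a 250-token budget.
const sampleThoughts = ["a".repeat(400), "b".repeat(400), "c".repeat(400)];
const sampleChunks = chunkByTokenBudget(sampleThoughts, 250);
console.log(sampleChunks.map((chunk) => chunk.length)); // [ 2, 1 ]
```

Packing in order keeps each chunk chronologically contiguous, which seems like the least disruptive split for a narrative topic such as `autobiography`. A later pass would still be needed to stitch the per-chunk outputs together.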

## What I did locally

Patched the default to `anthropic/claude-sonnet-4-6-1m` (a LiteLLM model entry that adds the 1M-context beta header). Topic synthesis then succeeded against the full 366-thought corpus and produced a coherent autobiography wiki.

## Context

Davies Farms self-hosted deployment. Same setup as #313 / #314 / #315. The fix is small and is probably worth landing upstream because the 200k threshold is reachable surprisingly fast.

Happy to PR.
