Chunk batchCreate by byte budget, not just count #65
Merged
The fixed BATCH_CREATE_CHUNK = 1000 cap implies an average memory ≤1 KB to fit under the server's 1 MiB request body limit, which is unrealistic for Claude transcripts: a single assistant turn with a large code block or tool result routinely exceeds that on its own. The visible symptom is HTTP 413 'Request body too large' from the importer on the first big chunk, taking the entire chunk's transaction down with it.

Replace the count-only loop with a generator that cuts chunks on either a soft byte budget (768 KiB by default — leaves ~256 KiB of headroom under the server's 1 MiB cap for the JSON-RPC envelope and headers) or the existing 1000-item count cap, whichever fires first. A single memory larger than the budget still gets its own singleton chunk so the caller can attempt it; if it then fails server-side, the per-chunk catch records the error without affecting siblings.

The chunker is generic and the size function is injected, so the helper can be tested without spinning up an EngineClient.
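For concreteness, here is a minimal sketch of what such a byte-budget chunker could look like. The generator shape, parameter names, and body are illustrative assumptions, not the shipped implementation (which the commits below land as `chunkByBytes` in chunk.ts):

```ts
// Illustrative sketch only; parameter names and exact behavior are assumptions,
// not the shipped chunk.ts implementation.
export function* chunkByBytes<T>(
  items: readonly T[],
  sizeOf: (item: T) => number, // injected size function
  byteBudget: number,          // soft byte budget, e.g. 768 KiB
  countCap: number,            // existing count cap, e.g. 1000
): Generator<T[], void, void> {
  let chunk: T[] = [];
  let chunkBytes = 0;
  for (const item of items) {
    const bytes = sizeOf(item);
    // Cut before adding this item if it would blow the byte budget, or if the
    // count cap is already reached, whichever fires first. Because the cut only
    // happens on a non-empty chunk, a single oversized item still gets its own
    // singleton chunk.
    if (chunk.length > 0 && (chunkBytes + bytes > byteBudget || chunk.length >= countCap)) {
      yield chunk;
      chunk = [];
      chunkBytes = 0;
    }
    chunk.push(item);
    chunkBytes += bytes;
  }
  if (chunk.length > 0) yield chunk;
}
```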
Force-pushed from 5ba07a7 to ff824b9.
Mechanical extraction of the byte-aware chunking primitives into their own file, matching the importers/ directory's one-file-per-concern convention (slug.ts, transcript.ts, uuid.ts, progress.ts, etc.).

Moved out of index.ts into chunk.ts:
- BATCH_CREATE_CHUNK (now exported)
- BATCH_CREATE_BYTES_BUDGET (now exported)
- approxMemoryBytes
- chunkByBytes

Added a one-line wrapper, chunkMemoriesForBatchCreate, that bakes in the importer-shaped defaults so callers can write a tidy `for (const chunk of chunkMemoriesForBatchCreate(toInsert))` instead of repeating the four arguments. The generic chunkByBytes stays exported for callers that need a custom budget. writeSession now uses the wrapper.

The chunk tests move from index.test.ts to chunk.test.ts; dedupByMemoryId tests stay in index.test.ts. No behavior change.
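A sketch of what that wrapper could look like, assuming the exported names listed above (the generic memory type and return shape are illustrative):

```ts
// Illustrative sketch; chunkByBytes, approxMemoryBytes, and the two constants
// are the chunk.ts exports listed above, the body here is an assumption.
export function chunkMemoriesForBatchCreate<M>(
  memories: readonly M[],
): Generator<M[], void, void> {
  return chunkByBytes(memories, approxMemoryBytes, BATCH_CREATE_BYTES_BUDGET, BATCH_CREATE_CHUNK);
}

// Caller side, as described above:
// for (const chunk of chunkMemoriesForBatchCreate(toInsert)) { ... }
```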
The previous implementation returned `JSON.stringify(m).length`, which counts UTF-16 code units rather than wire bytes. For ASCII-only content the two coincide, but Claude transcripts routinely include non-ASCII content (CJK code blocks, emoji, accented Latin) where the undercount is significant:

- 2-byte UTF-8 (Latin-1 supp.): 1 unit, 2 bytes → 50% undercount
- 3-byte UTF-8 (CJK, emoji): 1 unit, 3 bytes → 67% undercount
- 4-byte UTF-8 (supp. plane): 2 units, 4 bytes → 50% undercount

A heavily non-ASCII chunk could have its estimated size be 2-3× lower than the actual wire size, eating into the 256 KiB headroom under the 1 MiB server cap and approaching HTTP 413 territory again.

Switch to `Buffer.byteLength(_, "utf8")`, which returns the exact UTF-8 byte count without allocating an encoded buffer. Idiomatic on both Bun and Node (the importer is CLI-only and runs under Bun, but Buffer is portable across both runtimes). Comment block on BATCH_CREATE_BYTES_BUDGET tightened to say 'UTF-8 wire size' explicitly. New unit test covers the CJK case to lock in that we're sizing in bytes, not characters.
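A small sketch of the difference and of the corrected size function under that assumption (the exact helper body in chunk.ts may differ):

```ts
// UTF-16 code units vs. UTF-8 wire bytes for CJK content (illustrative):
const payload = JSON.stringify({ content: "日本語のコードブロック" });
const codeUnits = payload.length;                     // code units, undercounts the wire size
const wireBytes = Buffer.byteLength(payload, "utf8"); // exact UTF-8 byte count, no buffer allocated

// Assumed shape of the corrected size function:
function approxMemoryBytes(m: unknown): number {
  return Buffer.byteLength(JSON.stringify(m), "utf8");
}
```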
The chunk module was originally created under packages/cli/importers/ because the agent-session importer was the only caller. With `me memory import`, the MCP `me_memory_import` tool, and `me pack install` all about to adopt chunking too, the importers/ subpath is misleading. Move chunk.ts and chunk.test.ts up one level to sit alongside other CLI-wide utilities (client.ts, output.ts, util.ts). Update the importer's import path from `./chunk.ts` to `../chunk.ts`. No code or test changes.
The chunk loop + per-chunk try/catch + accumulator pattern is about to
appear in three more places (me memory import, the MCP me_memory_import
tool, me pack install). Extract the orchestration into a reusable
helper so all four call sites share one source of truth.
batchCreateChunked(client, memories):
- Iterates chunkMemoriesForBatchCreate(memories) sequentially.
- Catches per-chunk failures, records them in `errors`, accumulates
each failed chunk's explicit ids in `failedIds` (and exposes them
per-error via `errors[].ids` for callers that need attribution).
- Returns { insertedIds, failedIds, errors }.
- Structurally typed BatchCreateClient so tests can pass a stub.
Refactored writeSession to use it. Behavior preserved: each failed
chunk contributes chunk.length to outcome.failed, and each id in the
failed chunk gets a row in outcome.errors with the chunk's error
message — matching the previous per-message error attribution.
Six new unit tests for batchCreateChunked covering single-chunk,
two-chunk accumulation, partial failure, total failure, post-#64
shorter-than-input ids, and empty input.
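A sketch of the helper's shape as described above; the client method and the result fields come from the commit message, while the body, types, and the id extraction are assumptions:

```ts
// Illustrative sketch, not the shipped implementation.
// chunkMemoriesForBatchCreate is the chunk.ts wrapper sketched earlier.
interface BatchCreateClient {
  memory: {
    batchCreate(memories: { id?: string }[]): Promise<{ ids: string[] }>;
  };
}

async function batchCreateChunked(
  client: BatchCreateClient,
  memories: { id?: string }[],
): Promise<{ insertedIds: string[]; failedIds: string[]; errors: { ids: string[]; error: string }[] }> {
  const insertedIds: string[] = [];
  const failedIds: string[] = [];
  const errors: { ids: string[]; error: string }[] = [];

  for (const chunk of chunkMemoriesForBatchCreate(memories)) {
    try {
      const res = await client.memory.batchCreate(chunk);
      insertedIds.push(...res.ids);
    } catch (err) {
      // One failed chunk never takes down its siblings: record it and keep going.
      const ids = chunk.flatMap((m) => (m.id ? [m.id] : []));
      failedIds.push(...ids);
      errors.push({ ids, error: err instanceof Error ? err.message : String(err) });
    }
  }
  return { insertedIds, failedIds, errors };
}
```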
Replace the single `engine.memory.batchCreate` call with a chunked
sequence via `batchCreateChunked`. Large imports — particularly those
with non-trivial per-memory content like exported transcripts or
heavy-meta records — no longer trip HTTP 413 on the way in.
Behavior changes worth knowing:
- Partial success is now possible: a single failed chunk doesn't
take down its siblings, so a run can report imported > 0 AND
failed > 0.
- JSON output's `failed` count and `errors` array now reflect
per-chunk attribution: `source: "chunk N (K items)", error: ...`
instead of the single `source: "server"` row produced by the
pre-chunking server-throw path.
- Skipped-vs-failed disambiguation: ids in failed chunks never
reached the server, so they aren't classified as "skipped — id
already exists." computeSkippedIds is filtered against failedIds
to keep the two categories separate.
Re-running an import after partial failure self-heals: the
already-inserted ids are skipped server-side via ON CONFLICT, and
the missing ids are filled in.
Docs updated to describe the new partial-failure semantics.
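Under those semantics the skipped-vs-failed separation amounts to filtering the skip computation against the failed set; a sketch, assuming a call shape for `computeSkippedIds` that may not match the real signature:

```ts
// Illustrative only; computeSkippedIds' real signature may differ.
const { insertedIds, failedIds, errors } = await batchCreateChunked(engine, toInsert);
const failed = new Set(failedIds);
// Ids in failed chunks never reached the server, so they are not
// "skipped — id already exists"; keep the two buckets disjoint.
const skippedIds = computeSkippedIds(toInsert, insertedIds).filter((id) => !failed.has(id));
```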
Replace the single client.memory.batchCreate call with a chunked
sequence via batchCreateChunked. Large imports — particularly those
with non-trivial per-memory content — no longer trip HTTP 413.
Behavior:
- Returns partial-success detail for mixed outcomes: `imported`,
`skipped`, `failed`, `ids`, `skippedIds`, `errors`. The
`failed` and `errors` fields are new and always present (may be
0/empty) so the schema is stable for agent consumers.
- Throws only on total failure (every chunk failed). For partial
failure the agent gets the full breakdown to react with.
- Failed-chunk ids are excluded from `skippedIds` — they never
reached the server, so they aren't "skipped due to id collision."
idempotentHint stays true. The annotation comment is expanded to
note that retrying after partial failure is safe: already-inserted
ids skip via ON CONFLICT, ids in failed chunks are re-attempted,
and the final state converges to "all submitted ids present."
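A sketch of the tool's partial-versus-total-failure handling under those rules (field names come from the commit message; the surrounding variables and checks are assumed):

```ts
// Illustrative sketch, not the shipped me_memory_import handler.
const result = await batchCreateChunked(client, toInsert);
declare const skippedIds: string[]; // computed as in the importer: failed-chunk ids excluded

// Throw only on total failure: every submitted memory sat in a chunk that errored.
if (toInsert.length > 0 && result.failedIds.length === toInsert.length) {
  throw new Error(result.errors.map((e) => e.error).join("; "));
}

// Partial or full success: stable schema, `failed` and `errors` always present.
const response = {
  imported: result.insertedIds.length,
  skipped: skippedIds.length,
  failed: result.failedIds.length,   // may be 0
  ids: result.insertedIds,
  skippedIds,
  errors: result.errors,             // may be empty
};
```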
Replace the single engine.memory.batchCreate call with a chunked
sequence via batchCreateChunked. Large packs no longer trip HTTP 413,
and a single failed chunk doesn't take down its siblings.
classifySkips extended with an optional failedIds parameter so ids
in failed chunks aren't mis-classified as conflicts (they never
reached the server, so there's no skip to classify — they're tracked
separately under the new `failed` bucket).
JSON output additions (existing fields unchanged):
- failed: count of memories in failed chunks
- failedIds: array of failed ids (only when failed > 0)
- errors: per-chunk error detail (only when failed > 0)
Text output additions:
- `└ N failed (chunk error — re-run to retry)` line in the success
block when failed > 0
- clack.log.error block listing per-chunk errors with a hint to
re-run install (re-running self-heals: idempotent skips for the
inserted ids, fresh inserts for the failed ones)
Two new classifySkips tests cover the failedIds parameter.
Also rolls in incidental linter formatting fixes (import order in
importers/index.ts; collapsed chained .map().filter() in chunk.test.ts)
that biome flagged on its second pass after the new code landed.
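An illustrative shape for the extended `classifySkips` (only the optional `failedIds` parameter is new in this PR; the rest of the signature and the body are assumptions):

```ts
// Illustrative only, not the shipped classifySkips.
function classifySkips(
  submittedIds: string[],
  insertedIds: string[],
  failedIds: string[] = [],   // new, optional: ids from chunks that errored
): { skippedIds: string[] } {
  const inserted = new Set(insertedIds);
  const failed = new Set(failedIds);
  // A skip means the server saw the id and hit ON CONFLICT; ids in failed chunks
  // never reached the server, so they belong in the separate `failed` bucket.
  return { skippedIds: submittedIds.filter((id) => !inserted.has(id) && !failed.has(id)) };
}
```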
Summary
The fixed `BATCH_CREATE_CHUNK = 1000` cap implies an average memory ≤1 KB to fit under the server's 1 MiB request body limit, which is unrealistic for Claude transcripts: a single assistant turn with a large code block or tool result routinely exceeds that on its own. The visible symptom is HTTP 413 `Request body too large` from the importer on the first big chunk, taking the entire chunk's transaction down with it.
Fix
Replace the count-only loop with a generator that cuts chunks on either:
- a soft byte budget (768 KiB by default — leaves ~256 KiB of headroom under the server's 1 MiB cap for the JSON-RPC envelope and headers), or
- the existing 1000-item count cap,

whichever fires first. A single memory larger than the budget still gets its own singleton chunk so the caller can attempt it; if it then fails server-side, the per-chunk catch records the error without affecting siblings.
The chunker (`chunkByBytes`) is generic and the size function is injected (`approxMemoryBytes` is a thin `JSON.stringify(...).length` wrapper), so the helper can be tested without spinning up an `EngineClient`.
Pairs naturally with #66 (env-configurable `MAX_REQUEST_BODY_BYTES`) — operators who raise the server cap can also bump the importer's budget accordingly.
Test plan