feat(run-history): RunEntry v2 (model/cwd/tool_calls/error_excerpt) + lock-free rotation; extend BuiltinAgentOverrideConfig (output/defaultReads/defaultProgress)#231
Open
JayceFreeman wants to merge 2 commits into
Conversation
… lock-free rotation; extend BuiltinAgentOverrideConfig (output/defaultReads/defaultProgress)
Two related changes on the same agents.ts / run-history.ts surface.
## RunEntry v2 + lock-free rotation (run-history.ts + 4 call sites)
RunEntry gains optional model / cwd / tool_calls / error_excerpt so the
history is useful for telemetry, not just agent/status/duration. Sourced at
each of the 4 recordRun call sites from the values actually used to run the
child:
- model <- SingleResult.model
- cwd <- the child's executed cwd (taskCwd / stepCwd /
resolveParallelTaskCwd / effectiveCwd), never the
parent ctx.cwd
- tool_calls <- progressSummary.toolCount (preserves 0; not dropped by
truthiness)
- error_excerpt<- SingleResult.error, truncated to 300 chars, only on error
Rotation is now lock-free. The previous read-path trim raced under
concurrent processes. New design: recordRun appends one JSON line via
O_APPEND (kernel-atomic across processes on a local POSIX fs; entries are
small, single write()), then best-effort maybeRotate trims to ROTATE_KEEP
by writing a per-process-unique temp and atomic-renaming it into place. The
file is a complete valid snapshot either way — no duplication, no
corruption. The only loss this admits is an append landing inside a
rotate's read->rename window; acceptable for telemetry. No new dependency,
no lock file. A 16-process x 150-write storm test guards no-corruption,
no-double-write, and bounded size.
## BuiltinAgentOverrideConfig: output / defaultReads / defaultProgress
These three fields were honored on AgentConfig but silently dropped by the
override schema, parser, applier, AND the write path — so a settings.json
override that set them was a no-op (a lying contract). Threaded through all
of: the schema (BuiltinAgentOverrideBase + Config), parseBuiltinOverrideEntry
(with typed validation errors), applyBuiltinOverride (false sentinel clears
the builtin default), and the save path (cloneOverrideBase / cloneOverrideValue
/ buildBuiltinOverrideConfig) so values round-trip through save+reload instead
of being dropped on persist.
Tests: +6 run-history unit tests (incl. multi-process storm) and +7
agent-overrides tests (apply, false-sentinel clear, malformed-value
rejection, builder round-trip, save+reload persistence). Full suite: 477/477.
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR extends run history logging with additional v2 metadata (model/cwd/tool calls/error excerpt), moves log rotation to the write path with a lock-free/atomic-rename strategy, and expands builtin agent overrides to support output/defaultReads/defaultProgress.
Changes:
- Add v2 fields to
RunEntryand pass them through from foreground execution paths. - Implement best-effort log rotation on write (append + optional trim/rename) and adjust loading logic accordingly.
- Add support (and tests) for overriding builtin agent
output,defaultReads, anddefaultProgress, includingfalsesentinels for clearing defaults.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| test/unit/run-history.test.ts | Adds unit coverage for v2 fields, rotation behavior, and multi-process concurrency invariants. |
| test/unit/agent-overrides.test.ts | Adds tests for new builtin override fields and persistence behavior. |
| src/runs/shared/run-history.ts | Extends run history schema, adds write-path rotation, and changes read-path parsing. |
| src/runs/foreground/subagent-executor.ts | Passes model/cwd/tool_calls/error_excerpt into recordRun for subagent runs. |
| src/runs/foreground/chain-execution.ts | Passes model/cwd/tool_calls/error_excerpt into recordRun for chain steps; minor cwd refactor. |
| src/agents/agents.ts | Adds parsing/applying/building/saving support for output/defaultReads/defaultProgress overrides. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+31
to
+37
| function maybeRotate(historyPath: string): void { | ||
| const lines = fs.readFileSync(historyPath, "utf-8").split("\n").filter((l) => l.trim().length > 0); | ||
| if (lines.length <= ROTATE_READ_THRESHOLD) return; | ||
| const tmp = `${historyPath}.${process.pid}.${Date.now()}.tmp`; | ||
| fs.writeFileSync(tmp, `${lines.slice(-ROTATE_KEEP).join("\n")}\n`); | ||
| fs.renameSync(tmp, historyPath); | ||
| } |
Comment on lines
+83
to
+93
| return raw | ||
| .split("\n") | ||
| .map((line) => line.trim()) | ||
| .filter((line) => line.length > 0) | ||
| .map((line) => { | ||
| try { | ||
| return JSON.parse(line) as RunEntry; | ||
| } catch { | ||
| return undefined; | ||
| } | ||
| }) |
| `); | ||
|
|
||
| const procs = Array.from({ length: PROCS }, (_, k) => new Promise<void>((resolve, reject) => { | ||
| const p = spawn(process.execPath, ["--experimental-strip-types", childFile, pathToFileURL(SRC).href, dir, `p${k}`, String(PER_PROC)], { stdio: "ignore" }); |
The storm test spawned children with a hardcoded `--experimental-strip-types`. That drifts if Node renames the flag and diverges from the integration runner (`--experimental-transform-types --import ...`). Inheriting all of execArgv was rejected: it would leak `--test*`/`--inspect*`/profiler flags into 16 children (orphaned space-form values, debugger-port collisions, silent hangs). Instead, whitelist the TS-loader flags from the parent's execArgv (both `--flag=value` and space-form `--flag value`); unrecognized tokens are simply not copied, so test-runner/debugger/profiler flags and their orphaned values can never reach the child. Byte-identical to the old behavior on the current unit runner; robust if the TS mechanism changes. Full unit suite 477/477.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Two related changes on the same
agents.ts/run-history.tssurface. 477/477 unit tests pass (incl. the new ones).RunEntry v2 + lock-free rotation
RunEntrygains optionalmodel/cwd/tool_calls/error_excerpt, sourced at each of the 4recordRuncall sites from the values actually used to run the child — never the parentctx.cwd:modelSingleResult.modelcwdtaskCwd(parallel chain),stepCwd(sequential chain),resolveParallelTaskCwd(...)(parallel path),effectiveCwd(single path)tool_callsprogressSummary.toolCount(preserves0; not dropped by truthiness)error_excerptSingleResult.error, truncated to 300 chars, only on errorRotation is now lock-free. The old read-path trim raced under concurrent processes. New design:
recordRunappends one JSON line viaO_APPEND(kernel-atomic across processes on a local POSIX fs; entries are small → singlewrite()), then a best-effortmaybeRotatetrims toROTATE_KEEPby writing a per-process-unique temp and atomic-renaming it into place. The file is a complete valid snapshot either way — no duplication, no corruption. The only loss this admits is an append landing inside a rotates read→rename window; acceptable for telemetry. No new dependency, no lock file. A 16-process × 150-write storm test guards no-corruption / no-double-write / bounded-size.BuiltinAgentOverrideConfig: output / defaultReads / defaultProgress
These three fields are honored on
AgentConfigbut were silently dropped by the override schema, parser, applier, and the write path — so asettings.jsonoverride setting them was a no-op (a lying contract). Now threaded through all of: schema (BuiltinAgentOverrideBase+Config),parseBuiltinOverrideEntry(typed validation errors),applyBuiltinOverride(thefalsesentinel clears the builtin default), and the save path (cloneOverrideBase/cloneOverrideValue/buildBuiltinOverrideConfig) so values round-trip through save+reload rather than being dropped on persist.Tests
+6 run-history unit tests (incl. multi-process storm) and +7 agent-overrides tests (apply,
false-sentinel clear, malformed-value rejection, builder round-trip, save+reload persistence).