use snapshot query #194
📝 Walkthrough
This PR refactors the segment management and work-fetching architecture. kickMainLoop no longer returns segment values; it now performs state transitions and schedules internal.loop.main. Callers (enqueue, cancel, complete) await kickMainLoop and then read the segment via getCurrentSegment() when inserting pending rows. A new runSnapshotQuery helper runs snapshot-style internal queries. main now uses a new getPending snapshot query to fetch completions, cancelations, and candidate starts in one read, processes pre-fetched docs, and advances cursors only when work occurs. Tests and examples are updated accordingly.
Sequence Diagram(s)
sequenceDiagram
participant Client as enqueue/cancel/complete
participant Kick as kickMainLoop
participant Segment as getCurrentSegment
participant DB as pending tables
participant Snapshot as runSnapshotQuery(getPending)
participant Main as internal.loop.main
Client->>Kick: await kickMainLoop(...)
Kick-->>Client: void
Client->>Segment: getCurrentSegment()
Segment-->>Client: segment
Client->>DB: insert pending* with segment
Main->>Snapshot: runSnapshotQuery(getPending)
Snapshot-->>Main: {completions, cancelations, allStarts}
Main->>Main: slice starts -> handleStart / handleCancelation / handleCompletion
Main->>DB: update internalState cursors / schedule or idle
3d023b6 to
8fc93f7
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/component/complete.ts (1)
199-207: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Keep pendingCompletion.segment strictly ahead of the scanned cursor.
Line 201 now writes completions at getCurrentSegment(). With the new cursor semantics (“where we've scanned to”), that can place a completion on an already-scanned segment boundary while main is still running, so the row can be skipped indefinitely if the completion query is cursor-exclusive. Please keep new completion rows strictly after the current scan position instead of reusing the current segment.
Suggested fix
 import {
-  getCurrentSegment,
+  getNextSegment,
   type OnCompleteArgs,
   type RunResult,
   vResult,
 } from "./shared.js";
@@
 if (pendingCompletions.length > 0) {
   await kickMainLoop(ctx, "complete");
-  const segment = getCurrentSegment();
+  const segment = getNextSegment();
   await Promise.all(
     pendingCompletions.map((completion) =>
       ctx.db.insert("pendingCompletion", {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/component/complete.ts` around lines 199 - 207, The pending completion rows are being written with segment = getCurrentSegment(), which can equal the scanner's current position and cause those rows to be skipped; change the insertion to use a strictly-ahead segment (e.g., compute nextSegment = getCurrentSegment() + 1 or call an existing getNextSegment()/advanceSegment helper) and use that nextSegment when calling ctx.db.insert for pendingCompletions; update uses in the pendingCompletions map and any related tests to ensure the segment type/overflow semantics match the scanner (handle numeric vs. bigint or boundary conditions consistently).
🧹 Nitpick comments (1)
src/component/lib.ts (1)
115-118: ⚡ Quick win
Use getNextSegment() for pending queue writes to ensure visibility on next scan cycle.
These inserts currently use getCurrentSegment(). When new pendingStart/pendingCancelation rows are written at the current segment, they may not be picked up until the next main loop scan cycle, causing processing latency. Using getNextSegment() instead guarantees these rows land ahead of the cursor position and will be included in the next scan, improving responsiveness.
Suggested fix
 import {
   boundScheduledTime,
   vConfig,
   fnType,
-  getCurrentSegment,
+  getNextSegment,
   max,
   vOnCompleteFnContext,
   retryBehavior,
   status as statusValidator,
   toSegment,
@@
 await ctx.db.insert("pendingStart", {
   workId,
-  segment: max(toSegment(runAt), getCurrentSegment()),
+  segment: max(toSegment(runAt), getNextSegment()),
 });
@@
 await ctx.db.insert("pendingCancelation", {
   workId: id,
-  segment: getCurrentSegment(),
+  segment: getNextSegment(),
 });
@@
-const segment = getCurrentSegment();
+const segment = getNextSegment();
Also applies to: 147–150, 177–186
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/component/lib.ts` around lines 115 - 118, The pending queue inserts currently compute segment using getCurrentSegment(), which can place new pendingStart/pendingCancelation rows at the cursor and delay pickup; update the INSERTs that call ctx.db.insert("pendingStart", { workId, segment: max(toSegment(runAt), getCurrentSegment()) }) and the similar pendingCancelation inserts to use getNextSegment() instead of getCurrentSegment() (i.e., segment: max(toSegment(runAt), getNextSegment())) so new rows land ahead of the scan cursor and are visible on the next cycle; apply this change to all occurrences (the blocks creating "pendingStart" and "pendingCancelation", including the other mentioned locations).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/component/future.ts`:
- Around line 9-37: The function runSnapshotQuery currently depends on the
undocumented Convex.asyncSyscall with udfType: "snapshotQuery" (see
Convex.asyncSyscall, udfType: "snapshotQuery", getFunctionAddress, convexToJson,
jsonToConvex), which is unsupported; replace this internal syscall usage by
wiring into the documented runQuery API: change runSnapshotQuery to
accept/obtain a Convex execution context (the ctx used in mutations) and call
ctx.runQuery(query, ...args) (or call the public client runQuery helper) instead
of constructing syscallArgs and calling Convex.asyncSyscall, remove
getFunctionAddress/convexToJson/jsonToConvex usage, and return the ctx.runQuery
result cast to FunctionReturnType<Query> so behavior remains snapshot-consistent
using the supported API.
In `@src/component/loop.ts`:
- Around line 143-151: Rebuild the queryArgs object immediately before the
confirm read (i.e., after calling handleRecovery and right before the
confirm/getPendingWork call) so runningCount reflects the post-recovery state;
specifically, recreate queryArgs (using state.segmentCursors,
CURSOR_BUFFER_SEGMENTS, globals.maxParallelism and state.running.length) instead
of reusing the earlier captured one in the didWork === false path (also apply
the same change where queryArgs is used around the other occurrences referenced
near lines with similar blocks), ensuring getPendingWork computes correct
startLimit based on the updated runningCount.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: get-convex/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ea3f2867-4b07-4a46-b466-b5db083b631c
⛔ Files ignored due to path filters (2)
- package-lock.json is excluded by !**/package-lock.json
- src/component/_generated/api.ts is excluded by !**/_generated/**
📒 Files selected for processing (10)
- package.json
- src/component/complete.ts
- src/component/future.ts
- src/component/kick.test.ts
- src/component/kick.ts
- src/component/lib.ts
- src/component/loop.test.ts
- src/component/loop.ts
- src/component/schema.ts
- src/component/stateMachine.test.ts
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/component/loop.ts (1)
143-151: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Rebuild queryArgs after recovery before the confirm read.
queryArgs.runningCount is captured before handleRecovery, but recovery can shrink state.running. In the didWork === false path, the confirm query can still compute startLimit = 0 and miss pendingStart rows that became runnable only because recovery freed capacity.
Suggested fix
-  const queryArgs = {
+  const buildQueryArgs = () => ({
     completionCursor: state.segmentCursors.completion - CURSOR_BUFFER_SEGMENTS,
     cancelationCursor: state.segmentCursors.cancelation - CURSOR_BUFFER_SEGMENTS,
     incomingCursor: state.segmentCursors.incoming - CURSOR_BUFFER_SEGMENTS,
     maxParallelism: globals.maxParallelism,
     runningCount: state.running.length,
-  };
+  });
+  const queryArgs = buildQueryArgs();
 ...
-  const confirm = await ctx.runQuery(internal.loop.getPending, queryArgs);
+  const confirm = await ctx.runQuery(
+    internal.loop.getPending,
+    buildQueryArgs(),
+  );
Also applies to: 173-176, 236-247
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/component/loop.ts` around lines 143 - 151, Recompute the queryArgs after running handleRecovery so the runningCount reflects any threads freed by recovery: move or recreate the queryArgs object (the one with completionCursor, cancelationCursor, incomingCursor, maxParallelism, runningCount) to after handleRecovery is called (and before the confirm/read that uses startLimit/pendingStart) so that runningCount uses the updated state.running; ensure similar updates where queryArgs was built earlier (notably the other occurrences around the didWork === false path and the regions referenced) so confirm reads won't miss rows that became runnable due to recovery.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/component/loop.ts`:
- Around line 208-222: The cursor updates can be moved backwards by
buffered/late rows; fix by never decreasing any cursor when assigning from
completions.at(-1), cancelations.at(-1) or pending.at(-1). Replace direct
assignments in the loop that set state.segmentCursors.completion, .cancelation
and .incoming with logic that takes the max of the existing cursor and the
candidate segment (e.g., use Math.max(existingCursor, candidateSegment) for
completion and cancelation, and likewise for incoming when using pending.at(-1)
or segment), so cursors only advance or stay the same; keep the existing branch
behavior around actualCapacity unchanged.
⛔ Files ignored due to path filters (1)
- package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (4)
- CHANGELOG.md
- package.json
- src/component/loop.test.ts
- src/component/loop.ts
✅ Files skipped from review due to trivial changes (2)
- CHANGELOG.md
- package.json
🚧 Files skipped from review as they are similar to previous changes (1)
- src/component/loop.test.ts
| // Advance cursors to skip tombstones on next scan. Only do this when | ||
| // we actually did work — the cursor doubles as the cooldown signal | ||
| // ("how long since we last processed something"). | ||
| const didWork = | ||
| completions.length > 0 || cancelations.length > 0 || pending.length > 0; | ||
| if (didWork) { | ||
| state.segmentCursors.completion = completions.at(-1)?.segment ?? segment; | ||
| state.segmentCursors.cancelation = | ||
| cancelations.at(-1)?.segment ?? segment; | ||
| if (pending.length > 0) { | ||
| state.segmentCursors.incoming = pending.at(-1)!.segment; | ||
| } else if (actualCapacity > 0) { | ||
| // We have no more pending work, update to now | ||
| state.segmentCursors.incoming = segment; | ||
| } |
Don't let buffered rows move the cursors backwards.
These queries intentionally read behind the cursor. If a late commit returns an older segment here, writing completions.at(-1)?.segment / cancelations.at(-1)?.segment back into state.segmentCursors rewinds the cooldown marker. The next no-work pass can then go idle immediately after processing real work, which is exactly the window this buffer is supposed to protect.
Suggested fix
if (didWork) {
- state.segmentCursors.completion = completions.at(-1)?.segment ?? segment;
- state.segmentCursors.cancelation =
- cancelations.at(-1)?.segment ?? segment;
+ if (completions.length > 0) {
+ state.segmentCursors.completion = segment;
+ }
+ if (cancelations.length > 0) {
+ state.segmentCursors.cancelation = segment;
+ }
if (pending.length > 0) {
- state.segmentCursors.incoming = pending.at(-1)!.segment;
+ state.segmentCursors.incoming = segment;
} else if (actualCapacity > 0) {
- // We have no more pending work, update to now
+ // We have no more pending work, update to now.
state.segmentCursors.incoming = segment;
}
}
@coderabbitai can you clarify what you mean by this? what is a "no-work pass"?
@reeceyang Sure! Let me trace through the exact scenario.
What is a "no-work pass"?
The main loop runs repeatedly. Each iteration checks didWork:
const didWork =
  completions.length > 0 || cancelations.length > 0 || pending.length > 0;
- Work pass: didWork === true → cursors are advanced, and the loop reschedules itself immediately.
- No-work pass: didWork === false → the loop does an OCC-confirming re-read and then decides whether to go idle, stay in cooldown, or schedule for the future.
The cooldown mechanism
The cooldown lives inside the no-work pass:
const { incoming, completion, cancelation } = state.segmentCursors;
const latestCursor = fromSegment(max(incoming, max(completion, cancelation)));
if (Date.now() - latestCursor < STATUS_COOLDOWN) {
  // stay running a bit longer, check again
}
The idea is: "if we processed real work recently (within STATUS_COOLDOWN), don't go idle yet; more work might arrive." The freshness is measured by how large the cursor segment values are (a segment encodes a timestamp via toSegment/fromSegment).
How the cursor rewind breaks this
Here's the problematic scenario step by step:
- Pass N (work pass): The snapshot query reads pendingCompletion with a buffer (completionCursor - CURSOR_BUFFER_SEGMENTS), so it may return a late-committed row whose .segment is an old value (e.g., 30 seconds in the past). completions.at(-1)?.segment returns that old segment, and the code writes it into state.segmentCursors.completion.
- Pass N+1 (no-work pass): No new work is found. The cooldown check derives latestCursor, via fromSegment, from the old segment that was just written. Because that segment corresponds to a timestamp ~30 seconds ago, Date.now() - latestCursor >= STATUS_COOLDOWN is already true, so the cooldown window is skipped and the loop goes idle, even though real work was just processed moments ago.
The fix (always setting the cursor to segment = getCurrentSegment() when work is processed) ensures the cooldown clock always reflects "right now," so the loop stays alive for a full STATUS_COOLDOWN window after the last batch of real work.
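The rewind scenario can be reproduced in isolation. The following is a simplified sketch, not the workpool's code: it assumes segments are plain numbers, and SEGMENT_MS, STATUS_COOLDOWN_MS, and the local toSegment/fromSegment helpers are illustrative stand-ins whose real definitions and values may differ.

```typescript
// Illustrative values only; the real segment size and cooldown may differ.
const SEGMENT_MS = 100;
const STATUS_COOLDOWN_MS = 2_000;

// Assumed encoding: a segment is a timestamp bucketed into SEGMENT_MS slices.
const toSegment = (ms: number): number => Math.floor(ms / SEGMENT_MS);
const fromSegment = (segment: number): number => segment * SEGMENT_MS;

type Cursors = { incoming: number; completion: number; cancelation: number };

// Mirrors the no-work-pass check: stay alive while the newest cursor is recent.
function inCooldown(cursors: Cursors, now: number): boolean {
  const latest = Math.max(
    cursors.incoming,
    cursors.completion,
    cursors.cancelation,
  );
  return now - fromSegment(latest) < STATUS_COOLDOWN_MS;
}

const now = 1_000_000;
// A late-committed row whose segment lies 30 seconds in the past.
const lateRowSegment = toSegment(now - 30_000);

// All three cursors end up holding old segments (the saturated, rewound case).
const rewound: Cursors = {
  incoming: lateRowSegment,
  completion: lateRowSegment,
  cancelation: lateRowSegment,
};

// Same state, but the completion cursor is set to the current segment instead.
const fixed: Cursors = { ...rewound, completion: toSegment(now) };

const rewoundStaysAlive = inCooldown(rewound, now);
const fixedStaysAlive = inCooldown(fixed, now);
```

Under these assumptions the rewound state falls out of cooldown immediately, while a single fresh cursor is enough to keep the loop alive, which matches the point that all three cursors have to be stale for the problem to show up.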
🧠 Learnings used
Learnt from: ianmacartney
Repo: get-convex/workpool PR: 170
File: src/component/danger.ts:36-43
Timestamp: 2026-02-19T08:37:48.834Z
Learning: In Convex (v1.31.0+), ctx.db.delete supports both two-argument form (ctx.db.delete(tableName, id)) and single-argument form (ctx.db.delete(id)). Both are valid; prefer the two-argument form for clarity when deleting by table and id, and use the single-argument form only when the intent is purely to delete by id without explicit table context. Ensure code consistently follows one form across a module or project where appropriate.
@coderabbitai it does seem like this could be an issue, but only in the rare case that incoming, completion, and cancelation are all rewound.
@ianmacartney I wonder if it would be simpler to keep track of the cooldown signal separately instead of using the cursors here?
@reeceyang That's a good refinement — you're right that the bug requires the maximum of all three cursors to end up old, not just one.
Looking more carefully at the incoming cursor: if actualCapacity > 0 and no pending starts are found, it's set to segment (the current time), which would keep the max cursor fresh. So the narrowest scenario where all three rewind simultaneously is when:
- The system is fully saturated (actualCapacity <= 0), so incoming is not updated to the current segment, and
- Both the last completions and last cancelations entries come entirely from buffered (late-committed) rows with old segment values.
That combination — saturation plus late rows on both queues in the same pass — is indeed uncommon, so the practical risk is lower than my initial description implied.
On the separate cooldown signal idea: it would cleanly decouple two concerns that are currently entangled:
- Cursor purpose: "where to start scanning next time" (skip tombstones, avoid re-processing)
- Cooldown purpose: "did we process real work recently enough to stay alive?"
A dedicated lastWorkTs: number (wall-clock timestamp) stored in internalState would make the intent obvious and eliminate the dependency on segment-to-timestamp conversion for cooldown. The trade-off is a small extra write to internalState on each work pass. Given that internalState is already being patched every work pass anyway, the overhead is negligible. That said, it's a design call best made by @ianmacartney since it touches the invariants the state machine relies on.
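A rough sketch of what the separate signal could look like, under stated assumptions: lastWorkTs is the hypothetical field proposed above, the InternalState shape is reduced to what the example needs, and COOLDOWN_MS is an illustrative value rather than the real STATUS_COOLDOWN.

```typescript
// Illustrative cooldown; the real STATUS_COOLDOWN value may differ.
const COOLDOWN_MS = 2_000;

// Reduced, hypothetical shape: only the fields this sketch needs.
interface InternalState {
  segmentCursors: { incoming: number; completion: number; cancelation: number };
  lastWorkTs: number; // hypothetical wall-clock timestamp of the last work pass
}

// Called on a work pass. Cursors can be set however scanning requires;
// the cooldown signal only ever moves forward.
function recordWorkPass(state: InternalState, now: number): void {
  state.lastWorkTs = Math.max(state.lastWorkTs, now);
}

// Called on a no-work pass, independent of cursor values.
function shouldStayAlive(state: InternalState, now: number): boolean {
  return now - state.lastWorkTs < COOLDOWN_MS;
}

// Cursors are deliberately stale here to show they no longer matter.
const state: InternalState = {
  segmentCursors: { incoming: 1, completion: 1, cancelation: 1 },
  lastWorkTs: 0,
};
recordWorkPass(state, 1_000_000);

const aliveShortlyAfter = shouldStayAlive(state, 1_000_500);
const aliveMuchLater = shouldStayAlive(state, 1_003_000);
```

With this split, a late row can rewind a cursor without shortening the cooldown window, at the cost of one more field to keep in sync.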
| .withIndex("segment", (q) => q.gte("segment", cancelationCursor)) | ||
| .take(CANCELLATION_BATCH_SIZE); | ||
| // Available slots after we process this batch's completions, plus 1 | ||
| // for the +1 trick (detect overflow vs. a future-scheduled retry). |
we fetch one extra document below to know whether there is more work to be done, and also know the next time we'd need to be scheduled for.
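For illustration, the lookahead can be sketched as below. This is a guess at the shape, not the code in loop.ts: splitWithLookahead and the PendingStart type are hypothetical, segments are assumed to be plain numbers, and fetched is assumed to be the result of taking capacity + 1 rows in ascending segment order.

```typescript
interface PendingStart {
  workId: string;
  segment: number;
}

// Splits a (capacity + 1)-sized fetch into the rows to start now and the
// single lookahead row that tells us what comes next.
function splitWithLookahead(
  fetched: PendingStart[],
  capacity: number,
  currentSegment: number,
) {
  const toStart = fetched.slice(0, capacity);
  const extra: PendingStart | undefined = fetched[capacity];
  return {
    toStart,
    // More work is ready right now (overflow), so run again immediately.
    hasMoreNow: extra !== undefined && extra.segment <= currentSegment,
    // Otherwise this is the next segment we would need to be scheduled for.
    nextSegment: extra?.segment,
  };
}

const overflow = splitWithLookahead(
  [
    { workId: "a", segment: 10 },
    { workId: "b", segment: 11 },
    { workId: "c", segment: 12 },
  ],
  2,
  20,
);

const futureRetry = splitWithLookahead(
  [
    { workId: "a", segment: 10 },
    { workId: "b", segment: 11 },
    { workId: "c", segment: 50 },
  ],
  2,
  20,
);
```

In the first case the extra row is already due, so there is more work than capacity; in the second it only marks when the loop next needs to wake up.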
| }, | ||
| }); | ||
| // Nothing found in snapshot. Re-read with a real dependency (same args | ||
| // for cache-hit efficiency) so a concurrent insert forces an OCC retry. |
Since we are running the query through ctx.runQuery, it won't be cached, so cache-hit efficiency isn't applicable here. I also think coderabbit's comment above about using a stale running count when reusing queryArgs might be a real issue here
Maybe didWork should also include whether handleRecovery did anything?
yeah I keep thinking we'll add sub-query caching via ctx.runQuery but I think the index caching will be the solution there for the foreseeable future.
Agreed on the stale running count. I'm less sure about didWork for recovery - since didWork will cause main to re-execute immediately, whereas recovery is an async process, so the next iteration of main isn't expected to see anything different - only once it finishes
| it("doesn't lose work when re-checking before going idle", async () => { | ||
| // Snapshot-then-confirm safety net: even if the snapshot shows no | ||
| // work, the runQuery confirmation should pick up data committed | ||
| // before this iteration started. |
It looks like this test enqueues the work strictly before running the main loop, so I'm not sure if this is actually tested.
good call - going to nix this test, I think it's redundant
| describe("snapshot semantics", () => { | ||
| it("the snapshot read does not see the calling mutation's pending writes", async () => { | ||
| // Verifies the prototype's distinguishing feature: |
I think this is just testing that snapshotQuery works correctly in the convex-test environment? which might not be necessary to include
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@example/convex/test/scenarios/overhead.ts`:
- Around line 26-32: The insert call to ctx.db.insert("tasks", ...) uses an
unnecessary "as never" cast on workId; remove the cast so workId is simply
"overhead-test" (since the schema's vWorkId accepts strings), or if you
deliberately need to bypass types for this test replace with "as any"—update the
workId property in the ctx.db.insert(...) object accordingly.
⛔ Files ignored due to path filters (2)
- example/convex/_generated/api.d.ts is excluded by !**/_generated/**
- package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (7)
- example/convex/convex.config.ts
- example/convex/test/scenarios/overhead.ts
- example/convex/test/scenarios/sustained.ts
- example/convex/test/scenarios/throughput.ts
- package.json
- src/component/loop.test.ts
- src/component/loop.ts
✅ Files skipped from review due to trivial changes (1)
- package.json
🚧 Files skipped from review as they are similar to previous changes (2)
- src/component/loop.test.ts
- src/component/loop.ts
| await ctx.db.insert("tasks", { | ||
| runId: args.runId, | ||
| workId: "overhead-test" as never, | ||
| type: "mutation", | ||
| endTime: Date.now(), | ||
| enqueuedAt: args.enqueuedAt, | ||
| }); |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify the tasks table schema and workId field type expectations
ast-grep --pattern $'table("tasks", {
$$$
})'
Repository: get-convex/workpool
Length of output: 45
🏁 Script executed:
#!/bin/bash
# Search for tasks table schema definition
rg -A 10 'table.*tasks' --type ts | head -50
Repository: get-convex/workpool
Length of output: 45
🏁 Script executed:
#!/bin/bash
# Find schema files in the convex directory
find . -type f -name '*schema*' -o -name '*convex*' | grep -E '\.(ts|js)$' | head -20
Repository: get-convex/workpool
Length of output: 183
🏁 Script executed:
#!/bin/bash
# Search for workId type definition or field usage
rg 'workId' --type ts -B 2 -A 2 | head -60
Repository: get-convex/workpool
Length of output: 3186
🏁 Script executed:
#!/bin/bash
# Read the example/convex schema file
cat -n example/convex/schema.ts
Repository: get-convex/workpool
Length of output: 953
🏁 Script executed:
#!/bin/bash
# Also check the main schema file
cat -n src/component/schema.ts | head -100
Repository: get-convex/workpool
Length of output: 3837
🏁 Script executed:
#!/bin/bash
# Search for vWorkId definition
rg 'vWorkId' --type ts -B 3 -A 3
Repository: get-convex/workpool
Length of output: 3301
🏁 Script executed:
#!/bin/bash
# Check what vWorkId is exported from the workpool package
grep -r 'export.*vWorkId' --include='*.ts' src/
Repository: get-convex/workpool
Length of output: 137
Remove the unnecessary as never cast on workId.
The schema expects workId: vWorkId, which is a string validator that accepts any string value. The string literal "overhead-test" is valid without any cast. The as never cast is unconventional and unnecessarily suppresses type checking. Remove the cast entirely, or use as any if there's a deliberate reason to bypass type safety for test data.
ianmacartney
left a comment
I'm also adding a stacked branch with a dashboard with a self-hosted scenario runner
| @@ -1,10 +1,12 @@ | |||
| import { defineApp } from "convex/server"; | |||
| import workpool from "@convex-dev/workpool/convex.config"; | |||
| import workpoolOld from "@convex-dev/workpool-old/convex.config"; | |||
FYI I installed an old version of the workpool so we could do side-by-side comparisons
| "convex-helpers": "^0.1.94" | ||
| }, | ||
| "devDependencies": { | ||
| "@convex-dev/workpool-old": "npm:@convex-dev/workpool@0.4.6", |
installed as an npm alias here
TL;DR
Introduces a snapshot-then-confirm read pattern to eliminate OCC conflicts.
What changed?
- loop.ts: Uses runSnapshotQuery(getPendingWork, ...) to fetch completions, cancelations, and pending starts without taking read dependencies, then confirms with a real ctx.runQuery only when the snapshot finds nothing to do. Cursors advance only when work was done (didWork), so the cursor doubles as a cooldown signal. main is scheduled directly rather than going through updateRunStatus.
- kick.ts: kickMainLoop no longer returns a segment value. Callers that previously used the return value to set pendingStart/pendingCancelation segments now call getCurrentSegment() directly after the kick.
- lib.ts and complete.ts: Updated to match the new kickMainLoop void signature.
- loop.test.ts: Fully rewritten around observable behavior (work item status, runStatus state, pending table contents) rather than internal implementation details like cursor positions and segment values. Tests are organized into focused describe blocks: forward progress, capacity, retry, cancellation, lifecycle, saturated state, recovery, generation safety, snapshot semantics, and backwards compatibility.
- stateMachine.test.ts: Updated segment defaults from getNextSegment() to getCurrentSegment() so pending entries are immediately eligible in the same iteration, and added an explicit time advance between retry completion and the follow-up runMain call.
How to test?
Run the existing test suite:
The rewritten loop.test.ts covers the key behavioral contracts. Pay particular attention to the snapshot semantics suite, which directly verifies that runSnapshotQuery does not observe the calling mutation's own pending writes while ctx.runQuery does, and the backwards compatibility suite, which confirms that legacy segment-bearing scheduled calls and updateRunStatus calls still route correctly after the upgrade.
Why make this change?
The previous design split the main loop across two mutations (main → updateRunStatus), which introduced unnecessary scheduling latency and complexity. More importantly, the loop's reads created OCC conflicts under concurrent load because every read of the pending tables took a dependency.
The new design uses a snapshot read (no dependency) to do the bulk of the work, then takes a single dependency-bearing confirmation read only when the snapshot finds nothing (the common idle case). This eliminates the OCC retry storm under high concurrency while preserving correctness: if a concurrent transaction commits new work between the snapshot and the confirmation, the confirmation's read dependency causes a retry, ensuring no work is missed.
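As a rough illustration of the snapshot-then-confirm idea, here is a toy in-memory model. It is a sketch under simplifying assumptions, not the Convex runtime: PendingTable, Transaction, and mainPass are hypothetical names, and conflict detection is reduced to a single version counter on one table.

```typescript
// Toy model of optimistic concurrency control (OCC). None of these
// classes are Convex APIs; they only illustrate the read pattern.
class PendingTable {
  rows: string[] = [];
  version = 0; // bumped on every write

  insert(row: string): void {
    this.rows.push(row);
    this.version += 1;
  }
}

class Transaction {
  table: PendingTable;
  trackedVersion: number | undefined = undefined;

  constructor(table: PendingTable) {
    this.table = table;
  }

  // Snapshot read: returns data without recording a read dependency.
  snapshotRead(): string[] {
    return [...this.table.rows];
  }

  // Tracked read: records the version so a later write invalidates us.
  trackedRead(): string[] {
    this.trackedVersion = this.table.version;
    return [...this.table.rows];
  }

  // A commit conflicts only if something we depended on has changed.
  wouldConflict(): boolean {
    return (
      this.trackedVersion !== undefined &&
      this.trackedVersion !== this.table.version
    );
  }
}

function mainPass(tx: Transaction): "process" | "idle" {
  if (tx.snapshotRead().length > 0) return "process"; // busy case: no dependency taken
  if (tx.trackedRead().length > 0) return "process"; // snapshot missed something
  return "idle"; // a concurrent insert now forces a retry
}

// Busy case: work exists, so a concurrent enqueue does not conflict.
const busy = new PendingTable();
busy.insert("work-1");
const busyTx = new Transaction(busy);
const busyDecision = mainPass(busyTx);
busy.insert("work-2");
const busyConflict = busyTx.wouldConflict();

// Idle case: nothing found, so a concurrent enqueue invalidates the pass.
const idle = new PendingTable();
const idleTx = new Transaction(idle);
const idleDecision = mainPass(idleTx);
idle.insert("work-3");
const idleConflict = idleTx.wouldConflict();
```

In this model the busy pass never takes a dependency, so concurrent enqueues cannot invalidate it, while the idle pass does take one, so work committed before the pass finishes triggers a retry instead of being missed.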