perf: reduce DB bandwidth ~1.5 TB/day#572

Open
sethconvex wants to merge 7 commits into openclaw:main from
sethconvex:perf/reduce-db-bandwidth-v2

Conversation

@sethconvex
Contributor

Summary

Production DB reads are ~2 TB/day. This PR reduces bandwidth by ~75% (~1.5 TB/day) through indexed queries, denormalization, and query restructuring.

  • Phase 1: Add by_moderation compound index and rewrite 8 cron query functions from .filter().collect() full table scans to .withIndex() (~6 GB/day saved)
  • Phase 2: Denormalize isSuspicious onto skills table with 6 compound indexes so listPublicPageV2 filters at the index level instead of paginating the entire table (~1 TB/day saved)
  • Phase 3: Add latestVersionSummary denormalization to avoid reading full ~6.4 KB skillVersions docs on list pages (~500 GB/day saved)
  • Phase 4: Split trending leaderboard query to one day at a time (avoids 32K doc limit). Reduce global stats recount from hourly to daily (~400 MB/day saved)
  • Phase 5: Add "Convex Query & Bandwidth Rules" section to AGENTS.md
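The bandwidth win in Phase 1 comes from where the filtering happens. The following is a hypothetical in-memory model (the table shape and field values are illustrative, not the real schema): a `.filter().collect()` reads every document before filtering, so bandwidth scales with table size, while a compound index reads only the rows under the matching key.

```typescript
// Illustrative model of the Phase 1 rewrite. Field names mirror the PR's
// `by_moderation` index, but the data and shapes here are assumptions.
interface Skill {
  moderationStatus: string
  moderationReason?: string
}

const table: Skill[] = [
  { moderationStatus: 'active', moderationReason: 'scanner.vt.pending' },
  { moderationStatus: 'active' },
  { moderationStatus: 'removed' },
  { moderationStatus: 'active', moderationReason: 'scanner.vt.pending' },
]

// Old path: scan the whole table, filter in memory. docsRead = table size.
function fullScan(): { rows: Skill[]; docsRead: number } {
  const rows = table.filter(
    (s) => s.moderationStatus === 'active' && s.moderationReason === 'scanner.vt.pending',
  )
  return { rows, docsRead: table.length }
}

// New path: a compound index keyed on (moderationStatus, moderationReason)
// lets the query touch only matching rows. docsRead = result size.
const byModeration = new Map<string, Skill[]>()
for (const s of table) {
  const key = `${s.moderationStatus}|${s.moderationReason ?? ''}`
  byModeration.set(key, [...(byModeration.get(key) ?? []), s])
}

function indexed(): { rows: Skill[]; docsRead: number } {
  const rows = byModeration.get('active|scanner.vt.pending') ?? []
  return { rows, docsRead: rows.length }
}
```

Both paths return the same rows; only the number of documents read (and thus billed bandwidth) differs.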

Post-deploy steps

Run backfills after deploying:

bunx convex run maintenance:backfillIsSuspiciousInternal
bunx convex run maintenance:backfillLatestVersionSummaryInternal

No client changes required — all new schema fields are v.optional() and query return shapes are unchanged.

Test plan

  • bunx convex typecheck — no new type errors (26 pre-existing in unrelated files)
  • bun run test — all 604 tests pass
  • Deploy to dev, run backfills
  • Verify no bytesReadLimit/documentsReadLimit errors in Convex insights
  • Browse site on dev — list pages, filters, trending all work
  • Deploy to prod, run backfills, monitor bandwidth for 24h

🤖 Generated with Claude Code

…and query rewrites

Phase 1: Add `by_moderation` compound index and rewrite 8 cron query
functions to use `.withIndex()` instead of `.filter().collect()` full
table scans (~6 GB/day saved).

Phase 2: Denormalize `isSuspicious` onto skills table with 6 compound
indexes so `listPublicPageV2` can filter at the index level instead of
paginating the entire table (~1 TB/day saved). Includes backfill
mutation and write-path updates across all moderation mutations.

Phase 3: Add `latestVersionSummary` denormalization to avoid reading
full ~6.4 KB `skillVersions` docs on list pages (~500 GB/day saved).

Phase 4: Split trending leaderboard query to one day at a time to stay
under 32K doc limit. Reduce global stats recount from hourly to daily
since delta tracking handles real-time accuracy (~400 MB/day saved).

Phase 5: Add "Convex Query & Bandwidth Rules" section to AGENTS.md.

Backfill commands (run after deploy):
  bunx convex run maintenance:backfillIsSuspiciousInternal
  bunx convex run maintenance:backfillLatestVersionSummaryInternal

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vercel
Contributor

vercel bot commented Mar 1, 2026

@sethconvex is attempting to deploy a commit to the Amantus Machina Team on Vercel.

A member of the Team first needs to authorize it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 53f901a40e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

convex/skills.ts Outdated
// Graceful backfill handling: during backfill, isSuspicious is undefined for
// un-patched docs, which won't match eq('isSuspicious', false). If the indexed
// query returns an unexpectedly empty first page, retry with the old path + JS filter.
if (useNonsuspiciousIndex && filteredPage.length === 0 && !result.isDone && !initialCursor) {


P1 Badge Remove impossible guard from non-suspicious fallback

The backfill fallback for nonSuspiciousOnly is gated by !result.isDone, but when the new eq('isSuspicious', false) index has no rows (the common pre-backfill state), Convex returns page=[] with isDone=true. That means this branch never runs and listPublicPageV2 returns an empty page even though many clean skills still exist with isSuspicious unset, which can blank first-page non-suspicious views until the backfill completes.
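The guard logic Codex flags can be modeled with a small predicate (names are illustrative, not the real code): when an index has no matching rows, Convex returns an empty page with `isDone: true`, so a fallback gated on `!isDone` can never fire in the pre-backfill state.

```typescript
// Minimal model of the P1 finding. PageResult mimics the shape of a
// Convex pagination result; the guard functions are hypothetical.
interface PageResult {
  page: unknown[]
  isDone: boolean
}

// Original guard: requires !isDone, which is false when the index is empty.
function fallbackWithIsDoneGuard(r: PageResult, initialCursor: string | null): boolean {
  return r.page.length === 0 && !r.isDone && initialCursor === null
}

// Fixed guard: drop the !isDone check so the empty-index case falls back.
function fallbackWithoutGuard(r: PageResult, initialCursor: string | null): boolean {
  return r.page.length === 0 && initialCursor === null
}

// Pre-backfill state: the new index exists but holds no rows yet.
const emptyIndexPage: PageResult = { page: [], isDone: true }
```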


Contributor Author

Fixed in e352123 and aefaf1c — removed the !result.isDone guard and made the fallback fire on any page (not just the first) so cursors stay on the same index path.

Comment on lines +580 to +582
const latestVersion = hasSummary
? toPublicSkillListVersionFromSummary(skill.latestVersionSummary!, skill.latestVersionId)
: toPublicSkillListVersion(latestVersionDoc)


P2 Badge Keep version summary and latestVersionId consistent

This path now prefers skill.latestVersionSummary over loading skill.latestVersionId, but other mutations can retarget latestVersionId without rewriting the summary (for example updateTags in convex/skills.ts updates latestVersionId only). After a retag of latest, list responses can therefore return _id for one version while version/changelog still come from the previous summary.


Contributor Author

Fixed in e352123 — updateTags now writes latestVersionSummary when the latest tag is repointed to a different version.

convex/skills.ts Outdated
.withIndex('by_moderation', (q) =>
q.eq('moderationStatus', 'active').eq('moderationReason', 'scanner.vt.pending'),
)
.take(limit),


P2 Badge Avoid pre-limiting stale-reason candidates before VT check

Each moderation-reason bucket is truncated with .take(limit) before verifying whether the referenced version has vtAnalysis, so the query can return fewer than limit even when additional stale rows exist beyond that first slice. Because syncModerationReasons treats a short page as completion, this can cause the sync job to stop early and leave stale moderation reasons unsynced.
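The underfill Codex describes is easy to reproduce with a toy model (illustrative, not the repo's code): `take(limit)` before a post-filter can return fewer than `limit` survivors even when more exist further down the index, and over-fetching leaves headroom for filtered-out rows.

```typescript
// Model of the P2 finding. `passesVtCheck[i]` stands in for "the version
// referenced by candidate i still lacks vtAnalysis" — purely illustrative.
function staleCandidates(passesVtCheck: boolean[], limit: number, overFetch: number): number {
  // .take(limit * overFetch), then the post-filter
  const slice = passesVtCheck.slice(0, limit * overFetch)
  return slice.filter(Boolean).length
}

// 7 index rows, 4 of which would survive the VT check.
const rows = [false, true, false, true, true, false, true]
```

With `limit = 3` and no over-fetch, only 1 survivor comes back and the caller misreads the short page as completion; with `overFetch = 2` the page fills.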


Contributor Author

Fixed in e352123 — each bucket now takes limit * 2 to account for rows filtered out by the VT check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@greptile-apps
Contributor

greptile-apps bot commented Mar 1, 2026

Greptile Summary

This PR delivers a multi-phase bandwidth reduction (~1.5 TB/day) across the Convex backend by converting full-table scans to indexed queries, denormalizing two computed fields (isSuspicious, latestVersionSummary) onto the skills table, splitting the trending leaderboard query to avoid the 32K-doc limit, and reducing the global stats cron to daily.

Key changes:

  • Schema: adds isSuspicious (boolean) and latestVersionSummary (nested object) optional fields plus seven new compound indexes on skills.
  • listPublicPageV2: uses new by_nonsuspicious_* indexes when nonSuspiciousOnly is set; includes a graceful fallback to the old full-scan path during backfill.
  • 8 cron query functions: rewritten from .filter().collect() to .withIndex() using the new by_moderation compound index.
  • buildPublicSkillEntries: skips the full ~6 KB skillVersions doc read when latestVersionSummary is present on the skill.
  • Leaderboard: splits 7-day range .collect() into one equality query per day to stay under the 32K doc limit.
  • Backfills: two cursor-paginated self-scheduling mutations (backfillIsSuspiciousInternal, backfillLatestVersionSummaryInternal) to populate existing rows.
  • All isSuspicious mutation sites appear correctly updated to call computeIsSuspicious when moderationFlags or moderationReason is written.

Confidence Score: 4/5

  • Safe to merge with the recommended post-deploy backfills; no client-breaking changes and the backfill fallback guards the public list page during the transition window.
  • All new schema fields are optional, queries are correctly rewritten with proper index definitions, and mutation sites that write moderationFlags/moderationReason have been updated to keep isSuspicious in sync. The only non-critical concerns are a minor _creationTime approximation in toPublicSkillListVersionFromSummary, serial (vs parallel) daily leaderboard queries, and the backfill skip-if-present guard. No logic errors or data-loss risks identified.
  • convex/skills.ts — the toPublicSkillListVersionFromSummary function uses summary.createdAt for _creationTime; and the pagination fallback for the nonsuspicious index only activates on the first page, so clients paginating during backfill may see unexpectedly empty continuation pages.

Last reviewed commit: 417012d

@greptile-apps greptile-apps bot left a comment

7 files reviewed, 3 comments

Edit Code Review Agent Settings | Greptile

if (!latestVersionId) return null
return {
_id: latestVersionId,
_creationTime: summary.createdAt,

_creationTime sourced from createdAt, not the real Convex field

_creationTime is Convex's internal insertion timestamp (set automatically by the DB), while createdAt is the application-set field stored in latestVersionSummary. In practice these are nearly identical since both are set to now during the same transaction, but they are semantically distinct fields — toPublicSkillListVersion correctly reads version._creationTime, whereas this path substitutes summary.createdAt.

Suggested change
_creationTime: summary.createdAt,
_creationTime: summary.createdAt, // approximates _creationTime (see: version.createdAt ≈ version._creationTime)

No behavioral impact expected, but worth a comment to make the approximation explicit and prevent future confusion if the two ever diverge.


Contributor Author

Fixed in e352123 — added clarifying comment noting the approximation.

Comment on lines 34 to 45
for (let day = startDay; day <= endDay; day++) {
const rows = await ctx.db
.query('skillDailyStats')
.withIndex('by_day', (q) => q.eq('day', day))
.collect()
for (const row of rows) {
const current = totals.get(row.skillId) ?? { installs: 0, downloads: 0 }
current.installs += row.installs
current.downloads += row.downloads
totals.set(row.skillId, current)
}
}

Sequential daily queries could be parallelized

The 7 daily queries are issued serially with await inside the for loop, adding 7 round-trips instead of 1. Since there are no cross-day dependencies, these can be parallelized with Promise.all for better latency in the cron job:

Suggested change
for (let day = startDay; day <= endDay; day++) {
const rows = await ctx.db
.query('skillDailyStats')
.withIndex('by_day', (q) => q.eq('day', day))
.collect()
for (const row of rows) {
const current = totals.get(row.skillId) ?? { installs: 0, downloads: 0 }
current.installs += row.installs
current.downloads += row.downloads
totals.set(row.skillId, current)
}
}
const dayKeys = Array.from({ length: endDay - startDay + 1 }, (_, i) => startDay + i)
const perDayRows = await Promise.all(
dayKeys.map((day) =>
ctx.db
.query('skillDailyStats')
.withIndex('by_day', (q) => q.eq('day', day))
.collect(),
),
)
for (const rows of perDayRows) {
for (const row of rows) {
const current = totals.get(row.skillId) ?? { installs: 0, downloads: 0 }
current.installs += row.installs
current.downloads += row.downloads
totals.set(row.skillId, current)
}
}

This is low-priority for a cron job, but it's an easy win.


Contributor Author

Fixed in e352123 — parallelized with Promise.all.


let patched = 0
for (const skill of page) {
if (skill.latestVersionSummary) continue

Backfill skips skills with existing latestVersionSummary

The guard if (skill.latestVersionSummary) continue means any skill that already has a summary (e.g. one published after this PR is deployed but before the backfill completes) will be silently skipped and never re-evaluated by the backfill. This is the intended behavior for an initial one-shot populate, but it means if a summary was somehow written with wrong data, re-running the backfill won't fix it. Worth noting in the JSDoc so future operators know to clear latestVersionSummary first if a re-backfill is ever needed.
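The two backfill strategies can be contrasted with a sketch (the field shapes are assumptions — in particular, it presumes the summary records which version it was built from): the skip-if-present guard never repairs a stale summary, while a reconcile check re-derives whenever the summary no longer matches the skill's current latestVersionId.

```typescript
// Hypothetical row shape; the real latestVersionSummary fields may differ.
interface SkillRow {
  latestVersionId: string
  latestVersionSummary?: { versionId: string }
}

// Original guard: `if (skill.latestVersionSummary) continue`
function needsPatchSkipStyle(s: SkillRow): boolean {
  return s.latestVersionSummary === undefined
}

// Reconcile: patch when the summary is missing OR built from a version
// other than the current latestVersionId.
function needsPatchReconcile(s: SkillRow): boolean {
  return s.latestVersionSummary?.versionId !== s.latestVersionId
}

// A skill whose latest tag was repointed after the summary was written.
const stale: SkillRow = { latestVersionId: 'v2', latestVersionSummary: { versionId: 'v1' } }
```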


Contributor Author

Fixed in e352123 — backfill now reconciles against the current latestVersionId instead of skipping. Updated JSDoc accordingly.

- Replace ArkType `[inferred]` type alias for ClawdisSkillMetadata with
  an explicit interface so TS can see envVars/dependencies/author/links
- Extract listBySkillHandler from comments.ts so tests can call it
  directly without accessing private _handler property
- Rebuild packages/schema dist output

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b904504176



let patched = 0
for (const skill of page) {
if (skill.latestVersionSummary) continue


P2 Badge Reconcile existing summaries during latest-version backfill

backfillLatestVersionSummaryInternal skips any row that already has latestVersionSummary, so it cannot repair skills where latestVersionId was later repointed (for example by tag updates) and the denormalized summary is now stale. In that scenario list responses can keep returning _id for one version and summary fields from another indefinitely because this backfill will never patch those rows; the migration should validate/rewrite summary data against the current latestVersionId instead of unconditionally continuing.


Contributor Author

Fixed in e352123 — same fix as above, backfill always re-evaluates against current latestVersionId.

sethconvex and others added 2 commits March 1, 2026 11:22
The explicit ClawdisSkillMetadata interface (needed because ArkType's
[inferred] doesn't resolve all fields) now has a keyof-based type guard
that triggers a compile error if the interface keys drift from the schema.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix P1: remove !result.isDone guard from listPublicPageV2 backfill
  fallback so it fires when the nonsuspicious index is empty (isDone=true)
- Fix updateTags to update latestVersionSummary when repointing latest tag
- Parallelize leaderboard daily queries with Promise.all
- Over-fetch stale-reason candidates (2x limit) before VT filtering
- Reconcile existing latestVersionSummary in backfill instead of skipping
- Add _creationTime approximation comment
- Rebuild schema dist to include author field on ClawdisSkillMetadata

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e3521239b6


convex/skills.ts Outdated
// query returns an unexpectedly empty first page, retry with the old path + JS filter.
// Note: when the index has no matching rows, Convex returns isDone=true, so we
// intentionally omit the !result.isDone guard here to ensure the fallback fires.
if (useNonsuspiciousIndex && filteredPage.length === 0 && !initialCursor) {


P2 Badge Keep pagination cursor on the same index path

When nonSuspiciousOnly hits the first-page fallback (!initialCursor), this branch switches to fallbackPaginate and returns a cursor generated from SORT_INDEXES. On the next page request, runPaginate goes back to NONSUSPICIOUS_SORT_INDEXES, so that cursor is incompatible, paginateWithStaleCursorRecovery resets to page 1 (runPaginate(null)), and users can no longer paginate past the first page during backfill windows where isSuspicious is still mostly unset.
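The cursor incompatibility can be modeled with a toy resume function (names and cursor shape are illustrative): a pagination cursor is only valid for the index path that produced it, so if page 1 is served from the fallback path but page 2 decodes the cursor against the non-suspicious path, stale-cursor recovery resets to the first page.

```typescript
// Toy model of the P2 cursor-path mismatch; the real cursor encoding differs.
type IndexPath = 'SORT_INDEXES' | 'NONSUSPICIOUS_SORT_INDEXES'

interface Cursor {
  path: IndexPath
  offset: number
}

// Stale-cursor recovery: an incompatible cursor restarts pagination at 0.
function resumeOffset(cursor: Cursor, activePath: IndexPath): number {
  return cursor.path === activePath ? cursor.offset : 0
}

// A cursor minted by the fallback (SORT_INDEXES) path on page 1.
const fromFallback: Cursor = { path: 'SORT_INDEXES', offset: 50 }
```

Decoding that cursor against the non-suspicious path lands back on page 1, which is why the fix keeps every page on the same index path.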


Contributor Author

Fixed in aefaf1c — fallback now fires on any page and reuses the client cursor via paginateWithStaleCursorRecovery, keeping cursors on the same SORT_INDEXES path across paginations.

The nonsuspicious index fallback now fires on any page (not just the
first) and reuses the client's cursor via stale-cursor recovery. This
prevents pagination from breaking when a SORT_INDEXES cursor is sent
back to the NONSUSPICIOUS_SORT_INDEXES path on page 2+.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aefaf1c798


// un-patched docs, which won't match eq('isSuspicious', false). Fall back to
// SORT_INDEXES + JS filter. This fires on ANY page (not just the first) so that
// cursors stay on the same index path across client paginations.
if (useNonsuspiciousIndex && filteredPage.length === 0) {


P1 Badge Fall back when non-suspicious index page is underfilled

In listPublicPageV2, the backfill fallback only runs when filteredPage.length === 0, but during isSuspicious backfill many non-suspicious rows still have isSuspicious unset and are therefore absent from the eq('isSuspicious', false) index. If a page contains even a few indexed rows, this branch is skipped, so the query can return a partial page and even isDone: true while additional eligible rows still exist outside that index slice; clients will stop paginating and silently miss valid results until backfill completes.


Contributor Author

Acknowledged — this is a known transient edge case during the backfill window. The partial pages contain correct data (no false results), just fewer items than expected. The backfill self-schedules and completes in minutes, after which the index is fully populated and pages are full-sized. We considered always falling back when underfilled, but that would negate the index benefit for any page that happens to have fewer results than numItems (e.g. the last page). The brief underfill during backfill is an acceptable trade-off vs. adding complexity for a one-time migration.
