Add database indexes for common stream query patterns by dolaoluwa574-source · Pull Request #616 · ritik4ever/stellar-stream

dolaoluwa574-source · 2026-06-26T13:04:23Z

Closes #361

Problem

Every filtered query against the streams table performs a full table scan. At small row counts this is invisible, but as the database grows the four most common access patterns become the dominant bottleneck:

Endpoint	Filter column
GET /api/streams?sender=… / GET /api/senders/:id/streams	sender
GET /api/streams?recipient=… / GET /api/recipients/:id/streams	recipient
GET /api/streams?status=…	canceled_at, completed_at, paused_at
Scheduled / active window queries	start_at
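
For reference, the four filter columns above could be covered by a migration along these lines. This is only a sketch, not the PR's actual `0002_add_stream_indexes` file: the concrete index names (following the `idx_streams_*` pattern), the choice of one composite index over the three status columns, and the `ExecCapable` interface are all assumptions made for illustration.

```typescript
// Hypothetical structural type: anything with an exec(sql) method works here.
// better-sqlite3's Database object has such a method.
interface ExecCapable {
  exec(sql: string): unknown;
}

// Assumed index names and column choices; the real migration may differ.
// IF NOT EXISTS makes each statement safe to run more than once.
const STREAM_INDEX_DDL: readonly string[] = [
  "CREATE INDEX IF NOT EXISTS idx_streams_sender ON streams (sender)",
  "CREATE INDEX IF NOT EXISTS idx_streams_recipient ON streams (recipient)",
  "CREATE INDEX IF NOT EXISTS idx_streams_status ON streams (canceled_at, completed_at, paused_at)",
  "CREATE INDEX IF NOT EXISTS idx_streams_start_at ON streams (start_at)",
];

// Applies every index statement in order.
function up(db: ExecCapable): void {
  for (const ddl of STREAM_INDEX_DDL) {
    db.exec(ddl + ";");
  }
}
```

Because every statement uses `IF NOT EXISTS`, calling `up(db)` twice should be a no-op the second time, which is the idempotency property listed in the testing checklist.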

Note on range queries: measured wall-clock speedup for range predicates depends on result-set selectivity. When a range covers a large fraction of the table, the SQLite query planner may prefer a sequential scan regardless of index presence. EXPLAIN QUERY PLAN is the authoritative check — both range queries confirm USING INDEX in the output above.

Run the benchmark yourself:

npx ts-node scripts/benchmark-indexes.ts

Testing checklist

EXPLAIN QUERY PLAN output reviewed (see Verification section above)
npm run dev:backend starts without errors after up() is wired into db.ts
Existing API responses unchanged (indexes are transparent to callers)
scripts/benchmark-indexes.ts runs to completion with no assertion failures
Migration is idempotent: running it twice produces no error

No breaking changes

The indexes are additive schema changes. They have no effect on API contracts, existing query results, or the Soroban contract layer.

Summary by CodeRabbit

Performance
- Improved stream-related query performance, especially for lookups by sender, recipient, status, and start time.
- Faster filtering and time-based browsing should make stream lists feel more responsive.
Chores
- Added a database migration to apply the new indexes safely and repeatedly.
- Added benchmarking checks to validate query speed improvements.

vercel · 2026-06-26T13:04:28Z

@dolaoluwa574-source is attempting to deploy a commit to the ritik4ever's projects Team on Vercel.

A member of the Team first needs to authorize it.

drips-wave · 2026-06-26T13:04:36Z

@dolaoluwa574-source Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀

Learn more about application limits

coderabbitai · 2026-06-26T13:04:37Z

📝 Walkthrough

Adds SQLite indexes for common streams query patterns and a benchmark script that builds temporary databases, verifies planner output, measures cold-query timings, and enforces speedup checks.

Changes

SQLite stream indexes and benchmark

Layer / File(s)	Summary
Migration definitions and runner `backend/src/migrations/0002_add_stream_indexes.sql`, `backend/src/migrations/0002_add_stream_indexes.ts`	Defines the four `streams` indexes, exports `up(db)`, and adds the direct-run SQLite entrypoint.
Benchmark setup and synthetic database build `scripts/benchmark-indexes.ts`	Adds the benchmark script docs, imports, synthetic data generator, schema/index DDL, temp DB builder, timing helper, and query cases.
Benchmark execution and checks `scripts/benchmark-indexes.ts`	Runs `EXPLAIN QUERY PLAN`, benchmarks each query against both databases, enforces the speedup threshold, and deletes temporary files.

Sequence Diagram

sequenceDiagram
  participant main as main()
  participant buildDb as buildDb(withIndexes)
  participant coldBench as coldBench(dbPath, sql, params, iters)
  participant sqlite as better-sqlite3
  participant fs as fs

  main->>buildDb: create scan-only temp DB
  main->>buildDb: create indexed temp DB
  buildDb->>sqlite: create schema, insert 100,000 rows, add indexes
  main->>sqlite: EXPLAIN QUERY PLAN for each benchmark query
  main->>coldBench: time each query on both DBs
  coldBench->>sqlite: open read-only, prepare, execute, close
  main->>fs: delete temporary database files

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A bunny hopped by the SQLite tree,
and found new indexes glinting merrily.
“Sender, recipient, and start_at too—
now my little queries zoom right through!” 🐇

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	It is concise and accurately describes the main change: adding SQLite indexes for common stream query patterns.
Linked Issues check	✅ Passed	The PR adds all four requested indexes via migration and includes EXPLAIN QUERY PLAN plus benchmark checks for usage and speedup.
Out of Scope Changes check	✅ Passed	The benchmark script and migration are directly tied to the requested indexing work and do not introduce unrelated changes.


coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/benchmark-indexes.ts`:
- Around line 8-10: The script header documents a DB_PATH mode that is not
actually supported, so either implement reading process.env.DB_PATH in
benchmark-indexes.ts or remove that usage example from the comment. Update the
script entry path handling around the benchmark-indexes.ts setup so the
documented invocation matches the behavior, and keep the usage text aligned with
the actual CLI options.
- Around line 87-111: The benchmark setup in buildDb and the transaction that
populates streams should use the same synthetic rows for both database variants
instead of generating fresh random start_at and total_amount values per call.
Extract row generation from the insertion loop into a shared dataset, then have
buildDb(false) and buildDb(true) insert that identical row set so the benchmark
in scripts/benchmark-indexes.ts compares indexes on vs off against the same
data.
- Around line 175-186: The EXPLAIN QUERY PLAN section in benchmark-indexes.ts
only prints planner output and then later reports success unconditionally, so
add a real assertion in the indexed DB loop by parsing the returned `detail`
rows from `db.prepare(...).all(...)` and verifying each indexed query shows the
expected `idx_streams_*` index name, or at minimum `SEARCH streams USING INDEX`,
before reaching the success summary. Use the existing `QUERIES` loop and
`EXPLAIN QUERY PLAN` output handling to fail fast when a query falls back to
`SCAN streams`, and only print the “all indexes are used” message after those
checks pass.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ae8c1d9a-2080-4979-b907-12fb7fd50887

📥 Commits

Reviewing files that changed from the base of the PR and between 47bb804 and 87c480a.

📒 Files selected for processing (3)

backend/src/migrations/0002_add_stream_indexes.sql
backend/src/migrations/0002_add_stream_indexes.ts
scripts/benchmark-indexes.ts

coderabbitai · 2026-06-26T13:09:42Z

+ * Usage (from repo root):
+ *   npx ts-node scripts/benchmark-indexes.ts
+ *   DB_PATH=backend/data/streams.db npx ts-node scripts/benchmark-indexes.ts


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Remove or implement the documented DB_PATH mode.

The header says DB_PATH=backend/data/streams.db npx ts-node scripts/benchmark-indexes.ts, but the script never reads process.env.DB_PATH. That usage string is misleading as written.
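
If the `DB_PATH` mode is kept rather than removed, one possible shape for it is sketched below. The helper name `resolveDbPath` and the convention of returning `null` to mean "build temporary synthetic databases" are assumptions for illustration, not something the script currently does.

```typescript
type Env = Record<string, string | undefined>;

// Returns the caller-supplied database path when DB_PATH is set and non-empty.
// Returns null otherwise, which the caller can treat as
// "fall back to building temporary synthetic databases".
function resolveDbPath(env: Env): string | null {
  const raw = env["DB_PATH"];
  if (raw === undefined) return null;
  const trimmed = raw.trim();
  return trimmed.length > 0 ? trimmed : null;
}
```

In the script this would be called as `resolveDbPath(process.env)`. Taking the environment as a parameter keeps the helper easy to test without touching the real process environment.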


coderabbitai · 2026-06-26T13:09:42Z

+  const now = Math.floor(Date.now() / 1000);
+  const insert = db.prepare(`
+    INSERT INTO streams
+      (id, sender, recipient, asset_code, total_amount,
+       start_at, duration_sec, canceled_at, completed_at, paused_at)
+    VALUES (?,?,?,?,?,?,?,?,?,?)
+  `);
+
+  db.transaction(() => {
+    for (let i = 0; i < ROW_COUNT; i++) {
+      const ca = i % 20 === 0 ? now - 3600 : null;
+      const cp = ca === null && i % 15 === 0 ? now - 1800 : null;
+      const pa = ca === null && cp === null && i % 30 === 0 ? now - 600 : null;
+      insert.run(
+        `s${String(i).padStart(7, "0")}`,
+        stellarId("U", "0", i),      // unique sender per row (high cardinality)
+        stellarId("R", "0", i),      // unique recipient per row
+        ASSETS[i % ASSETS.length],
+        Math.round(Math.random() * 10_000 * 100) / 100,
+        now - Math.floor(Math.random() * 30 * 86_400),
+        3_600,
+        ca, cp, pa
+      );
+    }
+  })();


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Benchmark both databases from the same synthetic dataset.

buildDb(false) and buildDb(true) each generate fresh random start_at/total_amount values, so the benchmark is comparing different tables instead of isolating “indexes on vs off”. That makes the measured speedups noisy and can hide regressions or create false wins, especially for the range query. Generate the rows once and load the identical row set into both databases.
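
A minimal sketch of the suggested fix, assuming a seeded generator is acceptable in place of `Math.random()`: rows are produced once from a fixed seed, and the same array is then inserted into both databases. The names `SyntheticRow`, `mulberry32`, and `generateRows` are hypothetical, and only the columns that were randomized in the snippet above are shown.

```typescript
interface SyntheticRow {
  id: string;
  totalAmount: number;
  startAt: number; // Unix seconds
}

// mulberry32: a small deterministic PRNG returning floats in [0, 1).
// The same seed always yields the same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Builds the dataset once; callers insert this same array into both
// the scan-only and the indexed database.
function generateRows(count: number, now: number, seed: number): SyntheticRow[] {
  const rand = mulberry32(seed);
  const rows: SyntheticRow[] = [];
  for (let i = 0; i < count; i++) {
    rows.push({
      id: "s" + ("0000000" + i).slice(-7),
      totalAmount: Math.round(rand() * 10_000 * 100) / 100,
      startAt: now - Math.floor(rand() * 30 * 86_400),
    });
  }
  return rows;
}
```

With this shape, `buildDb` would take the row array as an argument instead of generating values inside its insertion loop, so any timing difference between the two databases comes from the indexes rather than from the data.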


coderabbitai · 2026-06-26T13:09:42Z

+  // ── EXPLAIN QUERY PLAN ──────────────────────────────────────────────────
+  console.log("─".repeat(70));
+  console.log("EXPLAIN QUERY PLAN (indexed DB)");
+  console.log("─".repeat(70));
+  {
+    const db = new Database(idxDb, { readonly: true });
+    for (const q of QUERIES) {
+      const plan = db.prepare(`EXPLAIN QUERY PLAN ${q.sql}`).all(...q.params) as Array<{ detail: string }>;
+      console.log(`\n  ▸ ${q.label}`);
+      for (const row of plan) console.log(`    ${row.detail}`);
+    }
+    db.close();


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Fail when EXPLAIN QUERY PLAN does not show the expected index.

Right now this only prints the planner output, then Line 223 unconditionally says all four indexes are used. If SQLite falls back to SCAN streams, the script still reports success as long as the equality timing check passes. Parse the detail rows and assert the expected idx_streams_* name (or at least SEARCH streams USING INDEX) for each indexed case before printing the success summary.

Also applies to: 223-226
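
One way the assertion could look, as a sketch only: the helper names `usesIndex` and `assertIndexUsed` are hypothetical, and the pattern assumes planner detail strings of the form `SEARCH streams USING INDEX ...` (older SQLite versions print `SEARCH TABLE streams USING INDEX ...`, which the pattern also accepts).

```typescript
interface PlanRow {
  detail: string;
}

// True when at least one plan row is an index-backed SEARCH and, if an
// index name is given, that row mentions it. A plain "SCAN streams" fails.
function usesIndex(plan: readonly PlanRow[], expectedIndex?: string): boolean {
  const indexedSearch = /\bSEARCH\b.*\bUSING (?:COVERING )?INDEX\b/;
  return plan.some((row) => {
    if (!indexedSearch.test(row.detail)) return false;
    return expectedIndex === undefined || row.detail.indexOf(expectedIndex) !== -1;
  });
}

// Throws with the observed plan so the benchmark fails fast instead of
// printing an unconditional success summary.
function assertIndexUsed(label: string, plan: readonly PlanRow[], expectedIndex?: string): void {
  if (!usesIndex(plan, expectedIndex)) {
    const seen = plan.map((r) => r.detail).join(" | ");
    throw new Error(label + ": expected an index search, got: " + seen);
  }
}
```

Inside the existing `QUERIES` loop this would be called right after the plan is printed, for example `assertIndexUsed(q.label, plan)`, with the success message printed only once the loop finishes without throwing.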


Add database indexes for common stream query patterns

87c480a

dolaoluwa574-source wants to merge 1 commit into ritik4ever:main from dolaoluwa574-source:Add-SQLite-indexes-for-common-query-patterns
