
Conversation

@shesek (Collaborator) commented on Dec 1, 2025

This PR implements reorg handling that actively removes DB entries originating from stale blocks as the reorg occurs instead of discarding them at read time, plus several optimizations made possible by it. Some notes:

  1. The original motivation for this PR - optimizing TxHistory lookups - already made a DB reindex/migration necessary (to clean up stale entries), so I ended up bundling in a couple of further related optimizations that required DB schema changes, for the TxEdge and TxConf indexes.

  2. This implementation makes the distinction that entries created by stale blocks in the history DB should get undone during reorgs, while txstore DB entries are kept.

    This is partly because txstore entries from stale blocks could be useful for archival purposes (they're not all fully available through the APIs, but could be made to be) and are cheap to keep around (there aren't many, and they don't get in the way of history index lookups). More importantly, it's because non-unique keys cannot be 'undone' by deletion when handling reorgs, and keeping them in the txstore seemed easier than making an exception for them.

    Because of this (and because a migration was already needed..), the address prefix search index (A) was moved to the txstore DB, while the confirmations index (C) was moved to the history DB.

  3. Care was taken to ensure consistency of the public APIs - it is never possible for blocks to be visible (e.g. in GET /blocks/tip or GET /block/:hash) without the corresponding history entries from those blocks being indexed and visible too (e.g. in GET /address/:address/txs or GET /tx/:txid/:vout/outspend), or vice-versa.

    The consistency is guaranteed by ensuring that in-progress DB writes always refer to heights that don't yet exist in (or were already removed from) the HeaderList, which keeps partial in-progress writes publicly 'invisible' until processing is completed (see the sketch after this list). This also means that the visible chain tip will temporarily drop down to the common ancestor while the reorg is being processed, which was not the case previously.

    (Preserving both a monotonic chain tip and full consistency under the new design is possible but would require holding an exclusive lock on the HeaderList for the entire reorg processing duration, which seems undesirable.)

  4. The DB schema version was bumped from 1 to 2, with a migration script available as a `db-migrate-v1-to-v2` binary. Example use:

    cargo run --bin db-migrate-v1-to-v2 -- -vvv --network mainnet --db-dir db

    Migration of Elements DBs is supported too, using --features liquid.
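
The ordering described in point 3 can be sketched minimally as follows (all type and function names here are hypothetical stand-ins, not the actual electrs types): history rows for a height are written before the height is published to the in-memory header list, and deleted only after the height has been removed from it, so readers never observe a partially indexed block.

    use std::collections::{BTreeMap, BTreeSet};

    // Hypothetical stand-ins for the real HeaderList and history DB.
    struct Headers { visible_heights: BTreeSet<u32> }
    struct HistoryDb { rows: BTreeMap<(u32, Vec<u8>), Vec<u8>> } // (height, key) -> value

    fn apply_block(db: &mut HistoryDb, headers: &mut Headers, height: u32, rows: Vec<(Vec<u8>, Vec<u8>)>) {
        // Write the block's history rows while the height is still invisible...
        for (key, value) in rows {
            db.rows.insert((height, key), value);
        }
        // ...and only then publish the height, making the block and its rows visible.
        headers.visible_heights.insert(height);
    }

    fn undo_block(db: &mut HistoryDb, headers: &mut Headers, height: u32) {
        // Unpublish the height first, so the stale block disappears from the APIs...
        headers.visible_heights.remove(&height);
        // ...and only then delete its history rows; readers never see them half-deleted.
        db.rows.retain(|(h, _), _| *h != height);
    }

In the real code the rows are RocksDB entries and the header list is the shared HeaderList, but the write-then-publish / unpublish-then-delete ordering is what keeps partial writes invisible, per point 3 above.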

It makes more sense there, since it doesn't depend on any of the data
added to the txstore in the first `add` stage.

And it needs to be there, for the followup commit that assumes all
entries in the history db can be safely deleted when undoing blocks.
Prior to this change, history index entries created by stale blocks
would remain in the history DB and only get discarded at read time.

This change explicitly removes history entries when a reorg occurs,
so we can assume all indexed entries correspond to blocks currently
still part of the best chain.

This enables optimizing some db lookups (in the followup commits),
since readers no longer need to account for stale entries.

(Note: schema.md was only corrected to match the existing schema; 'D' rows were already being kept for both the history and txstore dbs.)
Iterating history db entries now involves a single sequential db scan (plus lookups into the in-memory HeaderList), without the per-tx random-access db reads that were previously needed to verify confirmation status.
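
A rough sketch of the new read path (illustrative names and types, not the actual electrs code): one forward scan over a script's history rows, resolving each row's height against the in-memory header list, with no per-tx random-access DB read to check that the confirming block is still on the best chain.

    // Hypothetical row and header types; the real code iterates RocksDB rows.
    struct HistoryRow { txid: [u8; 32], height: u32 }
    struct Header { hash: [u8; 32] }
    struct Headers { by_height: Vec<Header> } // stand-in for the in-memory HeaderList

    fn confirmed_history(
        rows: impl Iterator<Item = HistoryRow>, // one sequential scan of the index
        headers: &Headers,
    ) -> Vec<([u8; 32], [u8; 32])> {
        rows.map(|row| {
            // In-memory lookup only; under V1 this required a random-access DB
            // read to verify the entry wasn't left over from a stale block.
            let header = &headers.by_height[row.height as usize];
            (row.txid, header.hash)
        })
        .collect()
    }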
@shesek force-pushed the 202511-undo-reorgs branch from 6b04e50 to 992d996 on December 1, 2025 05:06
Changed from an index of `txid -> Set<blockhash>` to `txid -> blockheight`

- Instead of a list of blocks seen to include the txid (including
  stale blocks), map the txid directly to the single block that
  confirmed it and is still part of the best chain.

- Identify blocks by their height instead of their hash. Previously it
  was necessary to keep the hash to ensure it is still part of the best
  chain, but now we can assume that it is.

- Move the index from the txstore db to the history db, so that its
  entries will get undone during reorgs.
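
Roughly, the TxConf row change can be pictured like this (the 'C' prefix comes from the PR description; the byte layout is an illustrative assumption, not the exact schema encoding):

    // V1 (txstore db): one row per (txid, confirming block), stale blocks included:
    //   key: b'C' || txid || blockhash
    // V2 (history db): a single row per confirmed txid:
    //   key: b'C' || txid        value: confirming height

    // Hypothetical V2 TxConf row encoding:
    fn txconf_row_v2(txid: &[u8; 32], height: u32) -> (Vec<u8>, Vec<u8>) {
        let mut key = Vec::with_capacity(1 + 32);
        key.push(b'C');
        key.extend_from_slice(txid);
        (key, height.to_be_bytes().to_vec())
    }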
Changed from an index of `funding_txid:vout -> Set<spending_txid:vin>`
to `funding_txid:vout -> spending_txid:vin||spending_height`

- Instead of a list of inputs seen to spend the outpoint, map the
  outpoint directly to the single spending input that is still part of
  the best chain.

- Keep the height of the spending transaction, too. This reduces the
  number of db reads per spend lookup from 2 to 1.
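
A corresponding sketch for the TxEdge row (the prefix byte and encoding are illustrative assumptions): a single get on the outpoint key now yields both the spending input and its height.

    // Hypothetical V2 TxEdge row encoding:
    //   key:   prefix || funding_txid || vout
    //   value: spending_txid || vin || spending_height
    fn txedge_row_v2(
        funding_txid: &[u8; 32],
        vout: u32,
        spending_txid: &[u8; 32],
        vin: u32,
        spending_height: u32,
    ) -> (Vec<u8>, Vec<u8>) {
        let mut key = Vec::with_capacity(1 + 32 + 4);
        key.push(b'S'); // illustrative prefix byte
        key.extend_from_slice(funding_txid);
        key.extend_from_slice(&vout.to_be_bytes());

        let mut value = Vec::with_capacity(32 + 4 + 4);
        value.extend_from_slice(spending_txid);
        value.extend_from_slice(&vin.to_be_bytes());
        value.extend_from_slice(&spending_height.to_be_bytes());
        (key, value)
    }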
Now possible with the V2 schema, since the exact TxEdge row key can be
derived from the funding_txid:vout alone (previously the key also
included the spending_txid, requiring a prefix scan for each lookup).
Now possible with the V2 schema, since the exact TxConf row key can be
derived from the txid alone (previously the key also included the block,
requiring a prefix scan for each lookup).
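
For concreteness, a hedged sketch of the TxConf point lookup this enables, using rust-rocksdb's DB::get (the b'C' prefix and big-endian height encoding are illustrative assumptions, not the exact schema bytes):

    use rocksdb::DB;

    fn confirming_height(history_db: &DB, txid: &[u8; 32]) -> Result<Option<u32>, rocksdb::Error> {
        // Under V2 the full row key is derivable from the txid alone.
        let mut key = Vec::with_capacity(1 + 32);
        key.push(b'C');
        key.extend_from_slice(txid);

        // One point read instead of a prefix scan over b'C' || txid || *.
        Ok(history_db.get(&key)?.map(|value| {
            let mut height = [0u8; 4];
            height.copy_from_slice(&value[..4]);
            u32::from_be_bytes(height)
        }))
    }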

This isn't used anywhere yet, but will be used in a followup commit for
the DB migration script (and could potentially be used for a new public
API endpoint).

Exposed as a standalone function so that it can be used directly with a
`DB`, without having to construct the full `ChainQuery` with a `Daemon`.
- Change lookup_txns to use MultiGet

- Use lookup_txns for block transactions and reconstruction too
  (GET /block/:hash/txs and GET /block/:hash/raw)

(This was already possible with the V1 schema, but it is related to and builds upon the other V2 changes.)

Plus some related changes:

- Remove expensive sanity check assertion in lookup_txn (involved txid
  computation and wasn't really necessary)

- Add test for raw block reconstruction
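
A hedged sketch of what the MultiGet change looks like with rust-rocksdb (the b'T' txstore prefix is an illustrative assumption, and error handling is simplified):

    use rocksdb::DB;

    fn lookup_raw_txns(txstore_db: &DB, txids: &[[u8; 32]]) -> Vec<Option<Vec<u8>>> {
        let keys: Vec<Vec<u8>> = txids
            .iter()
            .map(|txid| {
                let mut key = Vec::with_capacity(1 + 32);
                key.push(b'T');
                key.extend_from_slice(txid);
                key
            })
            .collect();

        // One batched read for e.g. a whole block's transactions, instead of
        // one get() per txid.
        txstore_db
            .multi_get(keys)
            .into_iter()
            .map(|res| res.expect("db read failed"))
            .collect()
    }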
Previously each key/value read during iteration was getting duplicated 😮

(This doesn't strictly belong to the PR it's included in, but it will greatly benefit the DB migration script.)
@RCasatta (Collaborator) commented on Dec 1, 2025

I still need to review the code... After reading the description, I wonder if you considered moving to a single db with multiple column families, since we are migrating.
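
For reference, a rough sketch of what the single-DB-with-column-families layout could look like with rust-rocksdb (the column family names are illustrative, not a concrete proposal):

    use rocksdb::{ColumnFamilyDescriptor, DB, Options};

    fn open_single_db(path: &str) -> Result<DB, rocksdb::Error> {
        let mut opts = Options::default();
        opts.create_if_missing(true);
        opts.create_missing_column_families(true);

        // What are currently separate DBs become column families of one instance.
        let cfs = ["txstore", "history", "cache"]
            .iter()
            .map(|name| ColumnFamilyDescriptor::new(*name, Options::default()));

        DB::open_cf_descriptors(&opts, path, cfs)
    }

    // Usage sketch: reads/writes go through the per-CF handle, e.g.
    //   let history = db.cf_handle("history").expect("missing column family");
    //   db.put_cf(history, key, value)?;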

    -pub fn lookup_tx_spends(&self, tx: Transaction) -> Vec<Option<SpendingInput>> {
    +pub fn lookup_tx_spends(&self, tx: &Transaction) -> Vec<Option<SpendingInput>> {
         let txid = tx.compute_txid();
         let outpoints = tx

nit: Consider creating optional outpoints for every output at the start (where None = unspendable) to avoid duplicate allocation when re-creating outpoints for spendable outputs
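
A sketch of the suggestion (not the PR's actual code; `is_op_return()` stands in for whatever unspendability check the real code uses): build the Option<OutPoint> list once, with None marking unspendable outputs, and reuse it instead of re-creating OutPoints for the spendable subset.

    use bitcoin::{OutPoint, Transaction};

    fn spendable_outpoints(tx: &Transaction) -> Vec<Option<OutPoint>> {
        let txid = tx.compute_txid();
        tx.output
            .iter()
            .enumerate()
            .map(|(vout, txout)| {
                if txout.script_pubkey.is_op_return() {
                    None // unspendable: no spend lookup needed
                } else {
                    Some(OutPoint::new(txid, vout as u32))
                }
            })
            .collect()
    }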
