Perf/stage 2 adaptive daa by gg582 · Pull Request #4 · gosuda/bitcoin-rs

gg582 · 2026-05-25T15:10:45Z

Performance check for stage 2

Measured date: 2026-05-25
Branch: perf/stage-2-adaptive-daa

Review fix included in this measurement:
BlockLocalUtxoView now receives a reusable BlockLocalOverlay for one block apply.
The overlay is reset at the start of each validation pass, so script verification, coinbase maturity, and BIP68 checks still simulate independent block-local state.
The underlying HashMap allocation can retain capacity across those passes.
BlockPrevoutCache::lookup still avoids holding the RefCell borrow while UtxoSet::get_entry may take shard locks.

Node apply benchmark:
Measured path: apply::consensus_rule_tests::apply_block_persists_non_empty_filter_for_valid_same_block_spend
Binary: target/debug/deps/bitcoin_rs_node-362def564f321c5a
Command shape: run the compiled node lib-test binary repeatedly with the exact test filter and --quiet
Duration target: 60 seconds

Node apply result:
iterations=10495
elapsed_ns=59578545695
elapsed_s=59.578546
iter_per_s=176.154014

Interpretation:
This is a local fixed-duration throughput check of one apply_block regression path.
It exercises same-block spend handling, script verification through bitcoinconsensus, UTXO commit, and compact-filter generation.
It is not a full IBD benchmark and not a Criterion distribution.

Adaptive difficulty controller microbenchmark:
Harness location: tools/perf/adaptive-difficulty-harness
Harness commit policy: local-only benchmark harness, excluded by .git/info/exclude, do not commit
Measured path: DifficultyController::push_timestamp plus DifficultyController::next_target

Adaptive difficulty samples:
sample=1 iterations=512000000 elapsed_ns=15337535551 ns_per_iter=29.956 iter_per_sec=33382155.712
sample=2 iterations=512000000 elapsed_ns=14883090637 ns_per_iter=29.069 iter_per_sec=34401456.827
sample=3 iterations=512000000 elapsed_ns=14742131068 ns_per_iter=28.793 iter_per_sec=34730392.617

Adaptive difficulty summary:
samples=3
iterations_per_sample=512000000
mean_ns_per_iter=29.273
best_ns_per_iter=28.793
worst_ns_per_iter=29.956
mean_iter_per_sec=34171335.052
best_iter_per_sec=34730392.617
worst_iter_per_sec=33382155.712

…tion overlays - Implement fixed-point adaptive difficulty adjustment in crates/chain. - Reuse BlockLocalOverlay capacity across validation passes in crates/node to reduce allocations. - Add adaptive-difficulty-harness to tools/perf for local performance measurement. Verification: - cargo fmt --all -- --check - cargo test -p bitcoin-rs-chain adaptive_difficulty - cargo test -p bitcoin-rs-node apply:: --features bitcoinconsensus - cargo clippy -p bitcoin-rs-chain -p bitcoin-rs-node --features bitcoinconsensus

gemini-code-assist

Code Review

This pull request introduces a fixed-point adaptive difficulty adjustment mechanism and optimizes block validation performance by implementing a local UTXO caching layer (BlockPrevoutCache and BlockLocalOverlay). Feedback focuses on a critical continuity bug in the difficulty multiplier calculation where mixing base-2 and base-e approximations causes non-monotonic jumps. The review also recommends replacing the custom U256 implementation with a standard library, precomputing constants in the LSM slope calculation to avoid redundant work, and optimizing the cache to avoid expensive cloning of output data.

gemini-code-assist · 2026-05-25T15:14:13Z

+        let error = next_ewma - FIXED_ONE;
+        let shift_factor = error >> SCALE_BITS;
+        let fractional_part = error & (FIXED_ONE - 1);
+        let target_factor =
+            FIXED_ONE + fractional_part + ((fractional_part * fractional_part) >> (SCALE_BITS + 1));


The difficulty adjustment multiplier calculation contains a significant continuity bug. The code decomposes the error into a base-2 shift_factor and a fractional_part, but then applies a Taylor series approximation for $e^x$ ($1 + x + x^2/2$) to the fractional part.

Because $e^1 \approx 2.718$ while $2^1 = 2$, the multiplier will jump non-monotonically at integer boundaries of the error. For example, as error approaches $2^{16}-1$, the multiplier approaches $\approx 2.5$. When error reaches $2^{16}$, the shift_factor increments and fractional_part resets to 0, causing the multiplier to jump down to $2.0$. This will cause severe oscillations in difficulty.

If the intention is an ASERT-like DAA (base-2), use a continuous approximation for $2^f$ where $f \in [0, 1)$, such as a linear approximation ($1 + f$) or a specifically tuned polynomial that satisfies $poly(1) = 2 \cdot poly(0)$.

gemini-code-assist · 2026-05-25T15:14:13Z

+/// Little-endian 256-bit unsigned integer used by the adaptive target engine.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
+pub struct U256(pub [u64; U256_LIMBS]);
+
+impl U256 {
+    /// Returns zero.
+    #[must_use]
+    pub const fn zero() -> Self {
+        Self([0; U256_LIMBS])
+    }
+}


The custom U256 implementation is redundant and potentially fragile. The bitcoin-rs-chain crate already depends on the ruint workspace dependency, which provides a highly optimized and thoroughly tested Uint<256, 4>. Using a standard library for big integers is safer, especially for consensus-critical code like a DAA. The current Mul implementation is also a truncating multiplication which could overflow if used with larger targets or multipliers.

gemini-code-assist · 2026-05-25T15:14:13Z

+        let mut sum_x = 0_i64;
+        let mut sum_y = 0_i64;
+        let mut sum_xx = 0_i64;
+        let mut sum_xy = 0_i64;
+
+        let start_height = self
+            .total_blocks
+            .saturating_sub(u64::try_from(WINDOW_SIZE).unwrap_or(0));
+
+        for x in 0..n {
+            let actual_idx = usize::try_from(
+                start_height + u64::try_from(x).unwrap_or_else(|_| unreachable!("x fits u64")),
+            )
+            .unwrap_or_else(|_| unreachable!("usize supports required block index width"))
+                & WINDOW_MASK;
+            let y = i64::from(self.timestamps[actual_idx]);
+
+            sum_x += x;
+            sum_y += y;
+            sum_xx += x * x;
+            sum_xy += x * y;
+        }
+
+        let sum_x_squared = sum_x * sum_x;
+        let denominator = n * sum_xx - sum_x_squared;
+        if denominator == 0 {
+            return Some(FIXED_ONE);
+        }
+
+        let sum_xy_projection = sum_x * sum_y;
+        let numerator = n * sum_xy - sum_xy_projection;
+        let actual_slope = (numerator << SCALE_BITS) / denominator;
+        Some(actual_slope / TARGET_BLOCK_TIME)


In calculate_lsm_slope, the values for sum_x, sum_xx, and the resulting denominator are constant because WINDOW_SIZE is fixed and x always ranges from 0..WINDOW_SIZE. These should be precomputed as constants to avoid redundant calculations in the hot path. Additionally, performing the division by TARGET_BLOCK_TIME separately from the denominator division introduces unnecessary rounding error; it is more precise to divide by the product of both.

References

Avoid redundant calculations in performance-sensitive loops by precomputing constants. ^(link)

gemini-code-assist · 2026-05-25T15:14:13Z

+    fn lookup(&self, set: &UtxoSet, outpoint: bitcoin::OutPoint) -> Option<LiveOutput> {
+        if let Some(entry) = self.entries.borrow().get(&outpoint) {
+            return entry.clone();
+        }
+        let entry = set.get_entry(&internal_outpoint(&outpoint));
+        self.entries.borrow_mut().insert(outpoint, entry.clone());
+        entry
+    }
+}


The BlockPrevoutCache::lookup method clones the LiveOutput on every cache hit. Since LiveOutput contains a TxOut which includes an owned Vec<u8> for the script_pubkey, this can lead to significant overhead and memory pressure during block validation if scripts are large. Consider storing Rc<LiveOutput> or Arc<LiveOutput> in the cache to make hits cheap, or returning a reference where possible.

…-harness

…reuse

gg582 · 2026-05-26T11:42:51Z

Performance check for stage 2

Measured date: 2026-05-25
Branch: perf/stage-2-adaptive-daa

Review fix included in this measurement:
BlockLocalUtxoView now receives a reusable BlockLocalOverlay for one block apply.
The overlay is reset at the start of each validation pass, so script verification, coinbase maturity, and BIP68 checks still simulate independent block-local state.
The underlying HashMap allocation can retain capacity across those passes.
BlockPrevoutCache::lookup still avoids holding the RefCell borrow while UtxoSet::get_entry may take shard locks.

Node apply benchmark:
Measured path: apply::consensus_rule_tests::apply_block_persists_non_empty_filter_for_valid_same_block_spend
Binary: target/debug/deps/bitcoin_rs_node-362def564f321c5a
Command shape: run the compiled node lib-test binary repeatedly with the exact test filter and --quiet
Duration target: 60 seconds

Node apply result:
iterations=10495
elapsed_ns=59578545695
elapsed_s=59.578546
iter_per_s=176.154014

Interpretation:
This is a local fixed-duration throughput check of one apply_block regression path.
It exercises same-block spend handling, script verification through bitcoinconsensus, UTXO commit, and compact-filter generation.
It is not a full IBD benchmark and not a Criterion distribution.

Adaptive difficulty controller microbenchmark:
Harness location: tools/perf/adaptive-difficulty-harness
Harness commit policy: local-only benchmark harness, excluded by .git/info/exclude, do not commit
Measured path: DifficultyController::push_timestamp plus DifficultyController::next_target

Adaptive difficulty samples:
sample=1 iterations=512000000 elapsed_ns=15337535551 ns_per_iter=29.956 iter_per_sec=33382155.712
sample=2 iterations=512000000 elapsed_ns=14883090637 ns_per_iter=29.069 iter_per_sec=34401456.827
sample=3 iterations=512000000 elapsed_ns=14742131068 ns_per_iter=28.793 iter_per_sec=34730392.617

Adaptive difficulty summary:
samples=3
iterations_per_sample=512000000
mean_ns_per_iter=29.273
best_ns_per_iter=28.793
worst_ns_per_iter=29.956
mean_iter_per_sec=34171335.052
best_iter_per_sec=34730392.617
worst_iter_per_sec=33382155.712

metaphorics · 2026-06-02T15:14:24Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 75a8ad0dab

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-02T15:17:31Z

+const U256_BITS: usize = U256_LIMBS * LIMB_BITS;
+
+/// Little-endian 256-bit unsigned integer used by the adaptive target engine.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]


Implement numeric ordering for little-endian U256

Because U256 stores limbs little-endian, the derived Ord compares the least-significant limb first rather than the most-significant limb. For example, U256([u64::MAX, 0, 0, 0]) sorts greater than U256([0, 1, 0, 0]) even though it is numerically smaller; next_target relies on these comparisons for its min/max clamp, so targets that differ across a 64-bit limb can be bounded in the wrong direction.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-02T15:17:31Z

+        let error = next_ewma - FIXED_ONE;
+        let shift_factor = error >> SCALE_BITS;
+        let fractional_part = error & (FIXED_ONE - 1);
+        let target_factor =
+            FIXED_ONE + fractional_part + ((fractional_part * fractional_part) >> (SCALE_BITS + 1));


Fix negative fractional target scaling

When the EWMA is just below FIXED_ONE, error >> SCALE_BITS floors to -1 while error & (FIXED_ONE - 1) turns the negative fraction into a value near FIXED_ONE; for a window of 599-second blocks this produces roughly (2.5 / 2) * current_target, increasing the target even though faster blocks should make it harder. This makes the controller react in the wrong direction for common slightly-fast windows.

Useful? React with 👍 / 👎.

Copilot

Pull request overview

This PR focuses on stage-2 performance work by reusing per-block UTXO lookup state during block application, and it introduces a new fixed-point adaptive difficulty controller along with a microbenchmark harness.

Changes:

Add bitcoin_rs_chain::adaptive_difficulty (fixed-point DifficultyController + U256) and export it from the chain crate.
Introduce a tools/perf/adaptive-difficulty-harness workspace crate to microbenchmark adaptive difficulty operations.
Rework node block-apply paths to reuse a BlockPrevoutCache and BlockLocalOverlay across script verification, coinbase maturity, BIP68 checks, and BIP158 filter generation.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
`crates/node/src/state.rs`	Store reusable per-block caches in `NodeState` and propagate them through `ApplyHandles`.
`crates/node/src/apply.rs`	Reset/reuse prevout cache + overlay across validation passes; route filter generation through cached prevout lookups.
`crates/chain/src/lib.rs`	Export new `adaptive_difficulty` module.
`crates/chain/src/adaptive_difficulty.rs`	Add fixed-point adaptive difficulty primitives (`DifficultyController`, `U256`) and tests.
`tools/perf/adaptive-difficulty-harness/src/main.rs`	Add a standalone benchmark harness for adaptive difficulty hot-path timing.
`tools/perf/adaptive-difficulty-harness/Cargo.toml`	Define the harness crate and depend on `bitcoin-rs-chain`.
`Cargo.toml`	Add the harness crate to the workspace members list.
`Cargo.lock`	Include the new harness crate in the lockfile.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+/// Little-endian 256-bit unsigned integer used by the adaptive target engine.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
+pub struct U256(pub [u64; U256_LIMBS]);
+
+impl U256 {
+    /// Returns zero.
+    #[must_use]
+    pub const fn zero() -> Self {
+        Self([0; U256_LIMBS])
+    }
+}


    "crates/node",
    "bin/bitcoin-rs",
+    "tools/perf/adaptive-difficulty-harness",
 ]


+    handles.prevout_cache.reset();
+    handles.local_overlay.reset();


gg582 added 2 commits May 25, 2026 23:25

perf(node): cache prevout lookups during block apply

aeaaa15

gemini-code-assist Bot reviewed May 25, 2026

View reviewed changes

gg582 added 2 commits May 26, 2026 19:25

fix(perf): inherit upstream workspace metadata in adaptive-difficulty…

5ddb3e4

…-harness

perf(node): eliminate block apply heap allocation overhead via cache …

75a8ad0

…reuse

metaphorics requested a review from Copilot June 2, 2026 15:14

Copilot started reviewing on behalf of metaphorics June 2, 2026 15:14 View session

chatgpt-codex-connector Bot reviewed Jun 2, 2026

View reviewed changes

Copilot AI reviewed Jun 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Perf/stage 2 adaptive daa#4

Perf/stage 2 adaptive daa#4
gg582 wants to merge 4 commits into
mainfrom
perf/stage-2-adaptive-daa

gg582 commented May 25, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 25, 2026

Uh oh!

gemini-code-assist Bot May 25, 2026

Uh oh!

gemini-code-assist Bot May 25, 2026

Uh oh!

gemini-code-assist Bot May 25, 2026

Uh oh!

gg582 commented May 26, 2026

Uh oh!

metaphorics commented Jun 2, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 2, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 2, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		handles.prevout_cache.reset();
		handles.local_overlay.reset();

Uh oh!

Conversation

gg582 commented May 25, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

gg582 commented May 26, 2026

Uh oh!

metaphorics commented Jun 2, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants