feat: add sliding block history to executor#1047
feat: add sliding block history to executor#1047bmuddha merged 3 commits intobmuddha/epic/replication-servicefrom
Conversation
|
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking. |
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThe PR adds a bounded per-slot block history to the executor: Assessment against linked issues
Out-of-scope changes
Suggested reviewers
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@magicblock-processor/src/executor/mod.rs`:
- Around line 175-177: The code currently calls
transition_to_slot(transaction.slot) and proceeds to execute the transaction
even if the slot transition failed; modify transition_to_slot (or its caller) so
it returns a clear success/failure indicator (e.g., Result<(), SlotError> or
bool) and update the call site in the executor loop to check that return value
and early-return/skip execution when the transition fails (do not call
execute_transaction or any downstream logic). Ensure you update all call sites
noted (the block around where transaction.slot != self.processor.slot and the
nearby ranges referenced) to handle the failure case consistently and preserve
existing logging from transition_to_slot.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0c01457c-e664-4bda-bd60-3db752062f14
📒 Files selected for processing (2)
magicblock-ledger/src/lib.rsmagicblock-processor/src/executor/mod.rs
There was a problem hiding this comment.
♻️ Duplicate comments (1)
magicblock-processor/src/executor/mod.rs (1)
171-177:⚠️ Potential issue | 🔴 CriticalPrevent stale-context execution when slot transition is unavailable.
Line 175 currently proceeds even if
transition_to_slotcannot resolve the requested slot (Line 213), andbiasedselection prioritizes transactions over pending block updates (Line 171), which can trigger this path under normal timing. This can execute transactions with the wrong slot/blockhash/sysvars.Proposed fix
tokio::select! { biased; + _ = block_updated.recv() => { + // Unlock to allow global ops (snapshots), then + // register the new block for future transactions + RwLockReadGuard::unlock_fair(guard); + self.register_new_block(); + guard = sync.read(); + } txn = self.rx.recv() => { let Some(transaction) = txn else { break }; - if transaction.slot != self.processor.slot { - self.transition_to_slot(transaction.slot); + if transaction.slot != self.processor.slot + && !self.transition_to_slot(transaction.slot) + { + warn!( + requested_slot = transaction.slot, + current_slot = self.processor.slot, + "cannot execute transaction: slot context unavailable" + ); + let _ = self.ready_tx.try_send(self.id); + continue; } match transaction.txn.mode { TransactionProcessingMode::Execution(_) => { self.execute(transaction, None); } @@ - _ = block_updated.recv() => { - // Unlock to allow global ops (snapshots), then - // register the new block for future transactions - RwLockReadGuard::unlock_fair(guard); - self.register_new_block(); - guard = sync.read(); - } else => break, } @@ - fn transition_to_slot(&mut self, slot: Slot) { + fn transition_to_slot(&mut self, slot: Slot) -> bool { let Some(block) = self.block_history.get(&slot) else { // should never happen in practice warn!(slot, "tried to transition to slot which wasn't registered"); - return; + return false; }; self.environment.blockhash = block.blockhash; self.processor.slot = block.slot; self.set_sysvars(block); + true }Also applies to: 191-197, 212-217
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-processor/src/executor/mod.rs` around lines 171 - 177, The code currently proceeds to handle a received transaction immediately after calling transition_to_slot(transaction.slot), which can fail and leave execution using stale slot/blockhash/sysvars; update the txn handling paths that call transition_to_slot (the tokio::select branch reading from self.rx.recv and the other similar branches around the txn handling at lines 191-197 and 212-217) to check the result of transition_to_slot and abort/skipping processing when it fails: after calling transition_to_slot(transaction.slot) verify its return (Result/Option/boolean) and if it indicates failure, log or drop the transaction and continue (do not execute the transaction or call downstream processing using the old context); apply the same guard to every branch that consumes a transaction from self.rx.recv so no transaction is processed unless transition_to_slot succeeded.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@magicblock-processor/src/executor/mod.rs`:
- Around line 171-177: The code currently proceeds to handle a received
transaction immediately after calling transition_to_slot(transaction.slot),
which can fail and leave execution using stale slot/blockhash/sysvars; update
the txn handling paths that call transition_to_slot (the tokio::select branch
reading from self.rx.recv and the other similar branches around the txn handling
at lines 191-197 and 212-217) to check the result of transition_to_slot and
abort/skipping processing when it fails: after calling
transition_to_slot(transaction.slot) verify its return (Result/Option/boolean)
and if it indicates failure, log or drop the transaction and continue (do not
execute the transaction or call downstream processing using the old context);
apply the same guard to every branch that consumes a transaction from
self.rx.recv so no transaction is processed unless transition_to_slot succeeded.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 722e34ae-51a5-4d53-a1a0-e6131dafee9f
📒 Files selected for processing (1)
magicblock-processor/src/executor/mod.rs
d59919b to
9743ae6
Compare
edf70ca to
1e97a73
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
magicblock-processor/src/executor/mod.rs (2)
156-157:⚠️ Potential issue | 🟠 MajorAvoid
.expect()in production code.While runtime creation failure is likely fatal, the coding guidelines require proper error handling for
.expect()calls.As per coding guidelines: "Treat any usage of
.unwrap()or.expect()in production Rust code as a MAJOR issue."Proposed fix - propagate error or log and abort cleanly
pub(super) fn spawn(self) { std::thread::spawn(move || { - let runtime = Builder::new_current_thread() + let Ok(runtime) = Builder::new_current_thread() .thread_name(format!("txn-executor-{}", self.id)) .build() - .expect("Failed to build executor runtime"); + else { + tracing::error!(executor_id = self.id, "Failed to build executor runtime"); + return; + }; runtime.block_on(tokio::task::unconstrained(self.run())); }); }Alternatively, if
spawncan return aResult, propagate the error to the caller.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-processor/src/executor/mod.rs` around lines 156 - 157, The .expect() on the executor runtime construction (the call ending in .build().expect("Failed to build executor runtime")) must be replaced with proper error handling: change the code that calls .build() to capture the Result, then either return/propagate the error to the caller (e.g., convert to a suitable Error/Result from the surrounding function) or log the error via your logger and exit cleanly; ensure you refer to the executor runtime build call and any surrounding function (the function that spawns or initializes the executor) when adding the Result handling so the failure path is handled without panicking.
225-225:⚠️ Potential issue | 🟠 MajorHandle potential lock poisoning instead of using
.unwrap().The
.unwrap()on the sysvar cache lock violates the coding guideline for this path. While lock poisoning is rare, production code should handle it gracefully.As per coding guidelines: "Treat any usage of
.unwrap()or.expect()in production Rust code as a MAJOR issue."Proposed fix
- let mut cache = self.processor.writable_sysvar_cache().write().unwrap(); + let Ok(mut cache) = self.processor.writable_sysvar_cache().write() else { + warn!("sysvar cache lock poisoned, skipping sysvar update"); + return; + };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-processor/src/executor/mod.rs` at line 225, The current call let mut cache = self.processor.writable_sysvar_cache().write().unwrap(); uses unwrap() and must handle PoisonError instead; replace the unwrap with explicit handling of the Result from write() on the RwLock returned by writable_sysvar_cache(): match or map the Result, log the poisoning (using the existing logger or process context available on self.processor) and recover with poison_error.into_inner() if safe to continue, or return/propagate a descriptive error from the enclosing function; update any code that assumes cache to use the recovered guard variable. Ensure you reference writable_sysvar_cache(), the write() call, and the local cache variable when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@magicblock-processor/src/executor/mod.rs`:
- Around line 156-157: The .expect() on the executor runtime construction (the
call ending in .build().expect("Failed to build executor runtime")) must be
replaced with proper error handling: change the code that calls .build() to
capture the Result, then either return/propagate the error to the caller (e.g.,
convert to a suitable Error/Result from the surrounding function) or log the
error via your logger and exit cleanly; ensure you refer to the executor runtime
build call and any surrounding function (the function that spawns or initializes
the executor) when adding the Result handling so the failure path is handled
without panicking.
- Line 225: The current call let mut cache =
self.processor.writable_sysvar_cache().write().unwrap(); uses unwrap() and must
handle PoisonError instead; replace the unwrap with explicit handling of the
Result from write() on the RwLock returned by writable_sysvar_cache(): match or
map the Result, log the poisoning (using the existing logger or process context
available on self.processor) and recover with poison_error.into_inner() if safe
to continue, or return/propagate a descriptive error from the enclosing
function; update any code that assumes cache to use the recovered guard
variable. Ensure you reference writable_sysvar_cache(), the write() call, and
the local cache variable when making the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: d579261d-e3ed-4fd2-81d2-0c1ddfab8eff
📒 Files selected for processing (2)
magicblock-ledger/src/lib.rsmagicblock-processor/src/executor/mod.rs
9743ae6 to
d927cc0
Compare
1e97a73 to
b2051bf
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
magicblock-processor/src/executor/mod.rs (2)
156-157:⚠️ Potential issue | 🟠 MajorConsider handling runtime creation failure gracefully.
The
.expect()on runtime creation violates the coding guidelines. While failing to create an executor runtime is arguably unrecoverable, the guideline requires proper error handling or explicit justification.Proposed fix
If this is intentionally a fatal error, add a comment documenting the invariant:
let runtime = Builder::new_current_thread() .thread_name(format!("txn-executor-{}", self.id)) .build() - .expect("Failed to build executor runtime"); + // Runtime creation failure is unrecoverable - executor cannot function + // without a tokio runtime, so panicking here is acceptable. + .expect("executor runtime creation failed: cannot proceed");Alternatively, propagate the error to the caller if startup failures should be handled:
pub(super) fn spawn(self) -> std::io::Result<()> { std::thread::spawn(move || { let runtime = Builder::new_current_thread() .thread_name(format!("txn-executor-{}", self.id)) .build()?; runtime.block_on(tokio::task::unconstrained(self.run())); Ok(()) }); Ok(()) }As per coding guidelines: "Treat any usage of
.unwrap()or.expect()in production Rust code as a MAJOR issue."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-processor/src/executor/mod.rs` around lines 156 - 157, The call to Builder::new_current_thread().build().expect(...) in the spawn method should not use expect; either propagate the build error from spawn (e.g., make pub(super) fn spawn(...) -> Result<()> and return the error) or, if this is genuinely fatal, add a clear comment documenting the invariant that runtime construction cannot fail and why panic is acceptable; update the spawn closure around thread::spawn, reference runtime creation (Builder::new_current_thread / .build()) and the run method, and ensure any returned error is handled or propagated instead of unwrapping.
225-225:⚠️ Potential issue | 🟠 MajorReplace
.unwrap()with proper error handling.The
.unwrap()on the sysvar cache write lock violates the coding guidelines. While lock poisoning is rare, the guideline requires proper error handling or explicit justification.Proposed fix
- let mut cache = self.processor.writable_sysvar_cache().write().unwrap(); + let Ok(mut cache) = self.processor.writable_sysvar_cache().write() else { + warn!("sysvar cache lock poisoned, skipping sysvar update"); + return; + };Alternatively, if lock poisoning is considered an unrecoverable state in this context, document the invariant:
- let mut cache = self.processor.writable_sysvar_cache().write().unwrap(); + // SAFETY: Lock poisoning indicates a prior panic during sysvar update, + // which should not occur under normal operation. Panic is acceptable here. + let mut cache = self.processor.writable_sysvar_cache().write() + .expect("sysvar cache lock poisoned");As per coding guidelines: "Treat any usage of
.unwrap()or.expect()in production Rust code as a MAJOR issue."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-processor/src/executor/mod.rs` at line 225, The line using self.processor.writable_sysvar_cache().write().unwrap() must not call unwrap; replace it with proper handling of the RwLock write result from writable_sysvar_cache() (the PoisonError from write()). Update the enclosing function to propagate a Result or map the PoisonError into your crate's error type (or return an explicit error/log and early-return) so the variable cache still becomes a RwLockWriteGuard on success; if you truly consider a poisoned lock unrecoverable, replace unwrap with a clear panic!() or expect() that documents the invariant and reason for unrecoverability and references writable_sysvar_cache and the cache variable so reviewers can see the intent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@magicblock-processor/src/executor/mod.rs`:
- Around line 156-157: The call to
Builder::new_current_thread().build().expect(...) in the spawn method should not
use expect; either propagate the build error from spawn (e.g., make pub(super)
fn spawn(...) -> Result<()> and return the error) or, if this is genuinely
fatal, add a clear comment documenting the invariant that runtime construction
cannot fail and why panic is acceptable; update the spawn closure around
thread::spawn, reference runtime creation (Builder::new_current_thread /
.build()) and the run method, and ensure any returned error is handled or
propagated instead of unwrapping.
- Line 225: The line using
self.processor.writable_sysvar_cache().write().unwrap() must not call unwrap;
replace it with proper handling of the RwLock write result from
writable_sysvar_cache() (the PoisonError from write()). Update the enclosing
function to propagate a Result or map the PoisonError into your crate's error
type (or return an explicit error/log and early-return) so the variable cache
still becomes a RwLockWriteGuard on success; if you truly consider a poisoned
lock unrecoverable, replace unwrap with a clear panic!() or expect() that
documents the invariant and reason for unrecoverability and references
writable_sysvar_cache and the cache variable so reviewers can see the intent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 966ed573-ffbe-48d7-bada-a5ecef1ffe6c
📒 Files selected for processing (2)
magicblock-ledger/src/lib.rsmagicblock-processor/src/executor/mod.rs
d927cc0 to
3b575fd
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@magicblock-processor/src/executor/mod.rs`:
- Around line 212-220: transition_to_slot currently copies the slot-hash sysvar
from the prior context and then only adds the new slot, which leaves later slot
hashes around when you move backwards; modify transition_to_slot (and/or
set_sysvars) so that when switching to a given slot it reconstructs or prunes
the SlotHashes sysvar from block_history for that exact target slot (i.e., clear
any existing slot-hashes entries and populate SlotHashes only with hashes <= the
requested block.slot derived from block_history), ensuring environment.blockhash
/ self.processor.slot reflect the slot and the SlotHashes cache contains no
future hashes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 26771f16-7541-4e50-a1cf-841492c0cf28
📒 Files selected for processing (2)
magicblock-ledger/src/lib.rsmagicblock-processor/src/executor/mod.rs
3b575fd to
cb838e8
Compare
45018ff to
eefdb62
Compare
cb838e8 to
732611b
Compare
eefdb62 to
98304e5
Compare
732611b to
d177f49
Compare
d177f49 to
bfbc0ac
Compare
98304e5 to
39d041e
Compare
Block history can be used to set block related environment on a per transaction basis.
bfbc0ac to
fe95c1b
Compare
39d041e to
c0c77aa
Compare
thlorenz
left a comment
There was a problem hiding this comment.
LGTM with one question.
There is a blockhash sysvar as well in Solana. Where is this updated and how is that synchronized with the update here?
5e18b78
into
bmuddha/epic/replication-service

Summary
Keep limited block history in transaction executor. This allows for the executor to set the environment per each transaction individually.
Compatibility
Checklist
Summary by CodeRabbit