fix: replication state mismatch #1118
bmuddha wants to merge 9 commits into bmuddha/replicator/auth
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior is configurable.
Use the following commands to manage reviews, or the checkboxes below for quick actions:
📝 Walkthrough
This PR removes faucet keypair handling and mint-related genesis inputs, disables the free airdrop RPC, and deletes related helpers. Ledger and scheduler APIs were refactored to introduce and use LatestBlockInner. Chainlink no longer accepts a faucet pubkey, its blacklist/reset logic is simplified, and a StubbedChainlink type was added. JsonRpcServer now records/exposes bound HTTP and WS addresses. The replicator integrates StubbedChainlink, reduces snapshot size limits, adds monotonic snapshot-watching with open retries, and adjusts leader/standby behavior and logging. AccountsDb predicate APIs changed to …
Suggested reviewers
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite. This stack of pull requests is managed by Graphite.
Force-pushed from 8f539a7 to b961ff9
Force-pushed from 69b2741 to 9b54125
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
magicblock-processor/src/scheduler/state.rs (1)
109-114: ⚠️ Potential issue | 🟠 Major — Inconsistent error handling: runtime_v2 uses .expect() while runtime_v1 uses graceful fallback.
The runtime_v1 path (lines 95-102) gracefully handles registration failure with .unwrap_or_else(), returning a default loader. However, runtime_v2 will panic on failure. Consider applying consistent error handling:
🛠️ Suggested fix for consistent error handling
```diff
 let runtime_v2 = {
     let mut runtime = create_program_runtime_environment_v2(
         &Default::default(),
         false,
     );
-    runtime
-        .register_function("sol_matmul_i8", SyscallMatmulI8::vm)
-        .expect(
-            "failed to register sol_matmul_i8 in runtime environment v2",
-        );
+    if let Err(e) = runtime.register_function("sol_matmul_i8", SyscallMatmulI8::vm) {
+        tracing::error!("failed to register sol_matmul_i8 in runtime environment v2: {e}");
+    }
     runtime
 };
```
As per coding guidelines: "Treat any usage of .unwrap() or .expect() in production Rust code as a MAJOR issue."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-processor/src/scheduler/state.rs` around lines 109 - 114: The runtime_v2 registration currently uses .expect(), which will panic on failure; change it to match the runtime_v1 pattern by handling errors gracefully (use .unwrap_or_else or matching on the Result) when calling runtime.register_function("sol_matmul_i8", SyscallMatmulI8::vm) so that a failed registration returns the same default/fallback loader, or logs and continues instead of panicking; reference the runtime_v2 block and make its error path consistent with the runtime_v1 unwrap_or_else handling around register_function.
magicblock-aperture/src/state/mod.rs (1)
54-75: 🧹 Nitpick | 🔵 Trivial — Consider using #[derive(Clone)] for NodeContext.
All fields (Pubkey, u64, Arc<FeatureSet>, u64) implement Clone, so the manual implementation can be replaced with a derive.
♻️ Suggested simplification
```diff
 /// Holds the core configuration and runtime parameters that define the node's operational context.
-#[derive(Default)]
+#[derive(Default, Clone)]
 pub struct NodeContext {
     /// The public key of the validator node.
     pub identity: Pubkey,
     /// Base fee charged for transaction execution per signature.
     pub base_fee: u64,
     /// Runtime features activated for this node (used to compute fees)
     pub featureset: Arc<FeatureSet>,
     /// Block production time in milliseconds
     pub blocktime: u64,
 }
-
-impl Clone for NodeContext {
-    fn clone(&self) -> Self {
-        Self {
-            identity: self.identity,
-            base_fee: self.base_fee,
-            featureset: self.featureset.clone(),
-            blocktime: self.blocktime,
-        }
-    }
-}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-aperture/src/state/mod.rs` around lines 54 - 75: Replace the manual Clone impl for NodeContext with a derived Clone: remove the explicit impl Clone block and add #[derive(Clone)] to the NodeContext struct definition (keeping the existing #[derive(Default)] or combining them as #[derive(Default, Clone)]), since all fields (identity: Pubkey, base_fee: u64, featureset: Arc<FeatureSet>, blocktime: u64) already implement Clone.
magicblock-replicator/src/service/mod.rs (1)
144-145: ⚠️ Potential issue | 🟠 Major — .expect() in production code requires proper error handling or justification.
Per coding guidelines, .expect() should not be used in production Rust code without explicit justification. While runtime creation failure is typically unrecoverable, consider returning a Result from spawn or documenting the invariant.
🔧 Option 1: Return Result from spawn
```diff
-    pub fn spawn(self) -> JoinHandle<Result<()>> {
+    pub fn spawn(self) -> std::result::Result<JoinHandle<Result<()>>, std::io::Error> {
         std::thread::spawn(move || {
             let runtime = Builder::new_current_thread()
                 .enable_all()
                 .thread_name("replication-service")
-                .build()
-                .expect("Failed to build replication service runtime");
+                .build()?;
-            runtime.block_on(tokio::task::unconstrained(self.run()))
+            Ok(runtime.block_on(tokio::task::unconstrained(self.run())))
         })
+        // Note: This changes the signature; adjust callers accordingly
     }
```
🔧 Option 2: Add invariant documentation
```diff
+    // INVARIANT: Runtime creation only fails under severe system resource exhaustion
+    // (e.g., thread/memory limits). At this point, the process cannot recover meaningfully.
     let runtime = Builder::new_current_thread()
         .enable_all()
         .thread_name("replication-service")
         .build()
         .expect("Failed to build replication service runtime");
```
As per coding guidelines: "Treat any usage of .unwrap() or .expect() in production Rust code as a MAJOR issue."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-replicator/src/service/mod.rs` around lines 144 - 145, The call that constructs the runtime uses .expect("Failed to build replication service runtime") inside spawn, which violates the no-.expect() policy; modify spawn to return a Result (e.g., Result<JoinHandle<...>, Error> or Result<(), Error>) and propagate the runtime build error instead of panicking, or if you can prove the build can never fail, add a clear doc comment on spawn explaining the invariant and why panic is acceptable; locate the runtime build call (the .build() on the replication service runtime) and replace the .expect() with a ?-style propagation into spawn's returned Result or add the invariant documentation on the spawn function signature.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@Cargo.toml`:
- Around line 103-105: The workspace-level dependency for magicblock-chainlink
currently enables the "dev-context" feature which pulls test-only code into all
dependent production crates (replicator, accounts, aperture, account-cloner,
validator); remove "dev-context" from the workspace dependency declaration in
Cargo.toml and instead either add magicblock-chainlink with feature =
["dev-context"] to [dev-dependencies] where tests need it, enable the feature
only in the specific crate-level dependencies that require dev utilities, or use
Cargo feature/profile overrides to ensure "dev-context" is disabled for release
builds—update the Cargo.toml entry for magicblock-chainlink and any crate-level
dependency entries accordingly.
In `@magicblock-accounts-db/src/lib.rs`:
- Around line 411-412: The public method database_directory() currently returns
&self.snapshot_manager.snapshots_dir (which is db_path.parent()) but should
return the actual storage/index directory; update database_directory() to return
the real database path by delegating to the existing database_path()
implementation (or otherwise returning the same Path used by database_path())
instead of snapshots_dir so callers continue receiving the accounts DB storage
location.
In `@magicblock-accounts-db/src/snapshot.rs`:
- Around line 144-150: The temp-file rename is not crash-safe because after
fs::rename the directory entry isn't durable until the parent directory is
fsynced; fix by adding a helper (e.g., sync_parent_dir_after_rename or
persist_renamed_file) that, after performing fs::rename(temp_path, final_path),
opens the parent directory
(std::fs::OpenOptions::new().read(true).open(parent_dir)) and calls sync_all()
on that directory handle, and call this helper from the current snapshot
creation path and from insert_external_snapshot so both paths fsync the
containing directory after the rename.
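The suggested helper can be sketched as follows; `persist_renamed_file` is an illustrative name from the comment above, not an existing API, and the directory-fsync trick is Unix-specific (opening a directory read-only and syncing it does not work on Windows):

```rust
use std::fs::{self, File, OpenOptions};
use std::io;
use std::path::Path;

/// Rename the temp file into place, then fsync the parent directory so
/// the new directory entry survives a crash. The rename itself is
/// atomic, but the entry is only durable once the parent directory's
/// metadata reaches disk.
fn persist_renamed_file(temp_path: &Path, final_path: &Path) -> io::Result<()> {
    fs::rename(temp_path, final_path)?;
    let parent = final_path.parent().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "final path has no parent")
    })?;
    OpenOptions::new().read(true).open(parent)?.sync_all()
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("persist_rename_demo");
    fs::create_dir_all(&dir)?;
    let tmp = dir.join("snapshot.tmp");
    let fin = dir.join("snapshot.tar.gz");
    File::create(&tmp)?.sync_all()?;
    persist_renamed_file(&tmp, &fin)?;
    assert!(fin.exists() && !tmp.exists());
    println!("durable rename ok");
    Ok(())
}
```

Calling this from both the snapshot-creation path and insert_external_snapshot, as the prompt suggests, makes both paths crash-safe with one helper.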
In `@magicblock-aperture/src/lib.rs`:
- Around line 53-57: Replace the unchecked increment of the HTTP port with a
safe checked_add to avoid u16 overflow: where ws_addr is derived from http_addr
using config.listen.0.port() + 1, call checked_add(1) on config.listen.0.port()
and handle the None case (overflow) by returning an error or falling back to a
validated behavior; you can alternatively enforce this in ApertureConfig
validation to reject port 65535. Update the logic that sets
ws_addr.set_port(...) to use the checked value and handle the overflow outcome
(e.g., propagate a descriptive error from the function that constructs the
listener).
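A minimal sketch of the guard described above; `ws_port` is an illustrative helper name, not the real aperture API:

```rust
/// Derive the WS port from the HTTP port with checked_add so that an
/// HTTP port of 65535 yields an error instead of silently wrapping to 0.
fn ws_port(http_port: u16) -> Result<u16, String> {
    http_port
        .checked_add(1)
        .ok_or_else(|| format!("HTTP port {http_port} leaves no room for the WS port"))
}

fn main() {
    assert_eq!(ws_port(8899), Ok(8900));
    assert!(ws_port(u16::MAX).is_err()); // 65535 is rejected, not wrapped
    println!("ok");
}
```

Equivalently, the validation can be hoisted into ApertureConfig so that port 65535 is rejected before the listener is ever constructed.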
In `@magicblock-aperture/src/requests/http/request_airdrop.rs`:
- Around line 11-42: The function in request_airdrop.rs currently returns
Err(RpcError::invalid_request("free airdrop faucet is disabled"))
unconditionally, but the file still contains a commented-out airdrop transfer
flow and an inaccurate rustdoc; remove the dead/commented airdrop implementation
and update or remove the function's rustdoc/comments so they reflect the
always-fail behavior (referencing the HandlerResult return,
RpcError::invalid_request call, and any TODO about reintroducing airdrops) to
avoid confusion and make auditing clear.
In `@magicblock-api/src/magic_validator.rs`:
- Around line 608-610: The current check in magic_validator.rs that skips ledger
replay when accountsdb_slot >= ledger_slot is incorrect because accountsdb
fast-forward (see magicblock-accounts-db::fast_forward behavior around
lib.rs:364-388) leaves self.ledger.latest_block() stale when accountsdb_slot >
ledger_slot; change the logic to only treat equality (accountsdb_slot ==
ledger_slot) as safe to skip replay and handle the strictly-ahead case by either
invoking a ledger fast-forward routine (so self.ledger.latest_block() is
advanced) or returning an explicit startup error; locate the conditional using
accountsdb_slot and ledger_slot and update it to require equality, adding the
call to the ledger fast-forward function or raise an error when accountsdb_slot
> ledger_slot.
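The three-way slot comparison described above can be sketched like this; `ledger_replay_needed` and the plain-string error are illustrative stand-ins for the real magic_validator types:

```rust
use std::cmp::Ordering;

/// Only exact slot equality skips replay; a strictly-ahead accountsdb is
/// surfaced as an error (or would trigger a ledger fast-forward) instead
/// of being treated as safe.
fn ledger_replay_needed(accountsdb_slot: u64, ledger_slot: u64) -> Result<bool, String> {
    match accountsdb_slot.cmp(&ledger_slot) {
        // Ledger is ahead: replay the missing blocks into accountsdb.
        Ordering::Less => Ok(true),
        // In sync: safe to skip replay.
        Ordering::Equal => Ok(false),
        // Accountsdb fast-forwarded past the ledger: latest_block() would
        // be stale, so fail loudly here.
        Ordering::Greater => Err(format!(
            "accountsdb slot {accountsdb_slot} is ahead of ledger slot {ledger_slot}"
        )),
    }
}

fn main() {
    assert_eq!(ledger_replay_needed(10, 12), Ok(true));
    assert_eq!(ledger_replay_needed(12, 12), Ok(false));
    assert!(ledger_replay_needed(13, 12).is_err());
    println!("ok");
}
```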
In `@magicblock-chainlink/src/chainlink/mod.rs`:
- Around line 203-208: The struct initialization currently sets both
delegated_not_undelegating and kept_delegated to the same value
(delegated_only), causing redundant logging; remove the redundant field by
deleting the kept_delegated = delegated_only entry (or, if a different value was
intended, replace kept_delegated with the correct variable) in the block that
constructs the logging/reporting object where total_removed,
delegated_not_undelegating, delegated_and_undelegating, kept_blacklisted, and
kept_ephemeral are set.
In `@magicblock-processor/src/scheduler/mod.rs`:
- Line 374: The hasher update currently unconditionally includes
txn.transaction.signature() which causes Simulation (and possibly Replay)
signatures to affect blockhash; change the logic in the scheduler where
self.hasher.update(txn.transaction.signature().as_ref()) is called to only run
for canonical, replicated transactions (use the execution-mode check — e.g.,
gate the update with an is_execution / txn.is_execution() or equivalent check on
txn.mode). If Replay should also be included, make that explicit by checking
txn.mode == Execution or txn.mode == Replay as appropriate; otherwise restrict
to Execution so only replicated transactions contribute to blockhash.
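The gating described above can be sketched as follows. `TxnMode`, `maybe_hash_signature`, and the use of the std hasher are illustrative stand-ins for the scheduler's real types; this variant includes Replay explicitly, which is one of the two options the comment leaves open:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Illustrative stand-in for the scheduler's execution modes.
enum TxnMode {
    Execution,
    Replay,
    Simulation,
}

/// Only canonical transactions contribute their signature to the running
/// blockhash; simulations are excluded. Returns whether the signature
/// was hashed, mirroring the suggested gate around hasher.update(...).
fn maybe_hash_signature(hasher: &mut DefaultHasher, mode: &TxnMode, sig: &[u8]) -> bool {
    let canonical = matches!(mode, TxnMode::Execution | TxnMode::Replay);
    if canonical {
        hasher.write(sig);
    }
    canonical
}

fn main() {
    let mut hasher = DefaultHasher::new();
    assert!(maybe_hash_signature(&mut hasher, &TxnMode::Execution, b"sig1"));
    assert!(maybe_hash_signature(&mut hasher, &TxnMode::Replay, b"sig2"));
    assert!(!maybe_hash_signature(&mut hasher, &TxnMode::Simulation, b"sig3"));
    println!("ok");
}
```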
In `@magicblock-replicator/src/nats/mod.rs`:
- Around line 42-44: The comment above the size constants is stale and
contradicts the values; update the comment to match the actual constants
STREAM_BYTES and SNAPSHOT_BYTES by replacing "256 GB stream, 512 GB snapshots"
with the correct sizes corresponding to the defined values (2 GiB stream, 5 GiB
snapshots) so that the comment accurately reflects STREAM_BYTES and
SNAPSHOT_BYTES.
In `@magicblock-replicator/src/service/context.rs`:
- Around line 42-45: Update the temporary TODO comment next to the
ReplicationContext field `chainlink: StubbedChainlink<AccountsDb>` in context.rs
to reference a tracked issue (e.g., include an issue ID like GH-####) and the
author tag (bmuddha) so the technical debt is discoverable; if the issue doesn't
exist, open one to track removal of the chainlink dependency once accounts
management moves to accountsdb, then replace the inline TODO with a TODO that
includes the created issue number and a short removal criterion so future
readers can find and close the debt when refactoring is done.
- Line 112: The 200ms hard sleep
(tokio::time::sleep(Duration::from_millis(200)).await) is a "magic delay" that
needs explicit documentation or replacement; add a concise comment immediately
above that line in context.rs explaining why the delay is required (e.g., to
allow scheduler startup/warmup or to work around a specific race), reference the
exact race or bug ID if known, and document expected behavior and risks
(flakiness) and a TODO to replace it with a proper synchronization primitive
(e.g., await a Scheduler::idle_notifier, notify/wait, or join handle) — or if
you can, replace the sleep with a deterministic wait using the scheduler's
readiness/idle API (e.g., wait_for_scheduler_idle or Scheduler::notify/await)
and remove the magic number.
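The replacement the comment suggests — a deterministic wait instead of a 200 ms sleep — can be sketched with a plain condition variable. `IdleSignal` is illustrative, not the real Scheduler API; in the actual async context an equivalent primitive such as tokio::sync::Notify would be used:

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// The scheduler flips an idle flag under a mutex; waiters block on a
/// condition variable instead of sleeping for a magic number of millis.
struct IdleSignal {
    idle: Mutex<bool>,
    cv: Condvar,
}

impl IdleSignal {
    fn new() -> Self {
        Self { idle: Mutex::new(false), cv: Condvar::new() }
    }
    /// Called by the scheduler once it has drained its queue.
    fn notify_idle(&self) {
        *self.idle.lock().unwrap() = true;
        self.cv.notify_all();
    }
    /// Blocks until notify_idle has been called; no fixed delay.
    fn wait_for_idle(&self) {
        let mut idle = self.idle.lock().unwrap();
        while !*idle {
            idle = self.cv.wait(idle).unwrap();
        }
    }
}

fn main() {
    let signal = Arc::new(IdleSignal::new());
    let worker = Arc::clone(&signal);
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(20)); // simulated scheduler work
        worker.notify_idle();
    });
    signal.wait_for_idle(); // returns as soon as the scheduler is idle
    println!("scheduler idle");
}
```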
- Around line 185-186: Before calling self.chainlink.reset_accounts_bank()
during promotion, acquire the scheduler idle guard by awaiting wait_for_idle()
to prevent concurrent scheduler transactions from mutating accounts; insert let
_guard = self.scheduler.wait_for_idle().await; immediately before
self.chainlink.reset_accounts_bank()? and keep existing error propagation.
Update the surrounding async context if needed so wait_for_idle().await is
legal; ensure this mirrors the synchronization used by write_block and
verify_checksum and does not rely on has_pending() alone.
In `@magicblock-replicator/src/watcher.rs`:
- Around line 59-67: The closure passed to notify::recommended_watcher currently
swallows notify backend errors by using `let Ok(event) = res else { return; }`;
update this to explicitly match the Err case and log the error (e.g. `Err(e) =>
{ error!("Watcher backend error: {}", e); return; }`) so overflow/backend
failures are visible; keep the existing call to Self::process_event(&event) and
the tx.blocking_send(path) path unchanged except for adding any context to its
error log if desired.
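The callback shape the comment asks for can be sketched like this; `Event` and the string error are simplified stand-ins for the notify crate's types, and the log vector stands in for the `error!` macro:

```rust
/// Simplified stand-in for notify's event type.
#[derive(Debug)]
struct Event {
    paths: Vec<String>,
}

/// Match both arms of the watcher result so backend errors (e.g. queue
/// overflow) are recorded instead of silently dropped.
fn handle_watch_result(
    res: Result<Event, String>,
    error_log: &mut Vec<String>,
) -> Option<Event> {
    match res {
        Ok(event) => Some(event),
        Err(e) => {
            // Previously `let Ok(event) = res else { return; }` swallowed this.
            error_log.push(format!("Watcher backend error: {e}"));
            None
        }
    }
}

fn main() {
    let mut log = Vec::new();
    let ok = handle_watch_result(
        Ok(Event { paths: vec!["snapshot-000000000011.tar.gz".into()] }),
        &mut log,
    );
    assert!(ok.is_some());
    assert!(handle_watch_result(Err("queue overflow".into()), &mut log).is_none());
    assert_eq!(log.len(), 1); // the failure is visible, not swallowed
    println!("ok");
}
```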
- Around line 129-138: open_with_retry currently returns None after exhausting
OPEN_RETRIES which silently drops the archive; change its signature from async
fn open_with_retry(path: &Path) -> Option<File> to async fn
open_with_retry(path: &Path) -> Result<File, std::io::Error>, capture and return
the last io::Error after the retry loop (using OPEN_RETRIES and
OPEN_RETRY_DELAY), and update callers (e.g., wherever open_with_retry is
awaited) to handle the Err case by logging the error and requeuing or
propagating the snapshot for later retry instead of dropping it.
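The signature change described above can be sketched as follows. The real function is async; this blocking version (with illustrative retry constants) keeps the example self-contained:

```rust
use std::fs::File;
use std::io;
use std::path::Path;
use std::thread;
use std::time::Duration;

const OPEN_RETRIES: u32 = 3;
const OPEN_RETRY_DELAY: Duration = Duration::from_millis(10);

/// Return the last io::Error after exhausting retries so the caller can
/// log and requeue the snapshot instead of silently dropping it.
fn open_with_retry(path: &Path) -> Result<File, io::Error> {
    let mut last_err = None;
    for attempt in 0..OPEN_RETRIES {
        match File::open(path) {
            Ok(file) => return Ok(file),
            Err(e) => {
                last_err = Some(e);
                if attempt + 1 < OPEN_RETRIES {
                    thread::sleep(OPEN_RETRY_DELAY);
                }
            }
        }
    }
    // All attempts failed: surface the last error instead of None.
    Err(last_err.expect("OPEN_RETRIES > 0 guarantees at least one attempt"))
}

fn main() {
    let err = open_with_retry(Path::new("/nonexistent/snapshot.tar.gz"))
        .expect_err("a missing file should produce an error, not a silent drop");
    assert_eq!(err.kind(), io::ErrorKind::NotFound);
    println!("ok");
}
```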
---
Outside diff comments:
In `@magicblock-aperture/src/state/mod.rs`:
- Around line 54-75: Replace the manual Clone impl for NodeContext with a
derived Clone: remove the explicit impl Clone block and add #[derive(Clone)] to
the NodeContext struct definition (keeping the existing #[derive(Default)] or
combining them as #[derive(Default, Clone)]), since all fields (identity:
Pubkey, base_fee: u64, featureset: Arc<FeatureSet>, blocktime: u64) already
implement Clone.
In `@magicblock-processor/src/scheduler/state.rs`:
- Around line 109-114: The runtime_v2 registration currently uses .expect(),
which will panic on failure; change it to match the runtime_v1 pattern by
handling errors gracefully (use .unwrap_or_else or matching on the Result) when
calling runtime.register_function("sol_matmul_i8", SyscallMatmulI8::vm) so that
a failed registration returns the same default/fallback loader or logs and
continues instead of panicking; reference the runtime_v2 block and make its
error path consistent with the runtime_v1 unwrap_or_else handling around
register_function.
In `@magicblock-replicator/src/service/mod.rs`:
- Around line 144-145: The call that constructs the runtime uses .expect("Failed
to build replication service runtime") inside spawn, which violates the
no-.expect() policy; modify spawn to return a Result (e.g.,
Result<JoinHandle<...>, Error> or Result<(), Error>) and propagate the runtime
build error instead of panicking, or if you can prove the build can never fail,
add a clear doc comment on spawn explaining the invariant and why panic is
acceptable; locate the runtime build call (the .build() on the replication
service runtime) and replace the .expect() with a ?-style propagation into
spawn's returned Result or add the invariant documentation on the spawn function
signature.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0491724e-e3f4-4814-8ffc-21f5519c7d42
⛔ Files ignored due to path filters (2)
Cargo.lock is excluded by !**/*.lock
test-integration/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (45)
- .gitignore
- .vscode/settings.json
- Cargo.toml
- magicblock-accounts-db/src/lib.rs
- magicblock-accounts-db/src/snapshot.rs
- magicblock-accounts-db/src/traits.rs
- magicblock-aperture/src/lib.rs
- magicblock-aperture/src/requests/http/request_airdrop.rs
- magicblock-aperture/src/state/mod.rs
- magicblock-aperture/src/tests.rs
- magicblock-aperture/tests/mocked.rs
- magicblock-aperture/tests/setup.rs
- magicblock-aperture/tests/transactions.rs
- magicblock-api/src/fund_account.rs
- magicblock-api/src/genesis_utils.rs
- magicblock-api/src/ledger.rs
- magicblock-api/src/magic_validator.rs
- magicblock-chainlink/src/accounts_bank.rs
- magicblock-chainlink/src/chainlink/blacklisted_accounts.rs
- magicblock-chainlink/src/chainlink/fetch_cloner/mod.rs
- magicblock-chainlink/src/chainlink/fetch_cloner/tests.rs
- magicblock-chainlink/src/chainlink/mod.rs
- magicblock-chainlink/tests/utils/test_context.rs
- magicblock-ledger/src/blockstore_processor/mod.rs
- magicblock-ledger/src/lib.rs
- magicblock-ledger/src/store/api.rs
- magicblock-ledger/tests/get_block.rs
- magicblock-ledger/tests/test_ledger_truncator.rs
- magicblock-magic-program-api/Cargo.toml
- magicblock-processor/src/scheduler/mod.rs
- magicblock-processor/src/scheduler/state.rs
- magicblock-processor/tests/scheduling.rs
- magicblock-replicator/Cargo.toml
- magicblock-replicator/src/error.rs
- magicblock-replicator/src/nats/broker.rs
- magicblock-replicator/src/nats/mod.rs
- magicblock-replicator/src/service/context.rs
- magicblock-replicator/src/service/mod.rs
- magicblock-replicator/src/service/primary.rs
- magicblock-replicator/src/service/standby.rs
- magicblock-replicator/src/tests.rs
- magicblock-replicator/src/watcher.rs
- test-integration/test-chainlink/src/ixtest_context.rs
- test-integration/test-chainlink/src/test_context.rs
- test-kit/src/lib.rs
💤 Files with no reviewable changes (9)
- .vscode/settings.json
- magicblock-chainlink/tests/utils/test_context.rs
- magicblock-processor/tests/scheduling.rs
- test-integration/test-chainlink/src/ixtest_context.rs
- magicblock-api/src/fund_account.rs
- magicblock-api/src/ledger.rs
- test-integration/test-chainlink/src/test_context.rs
- magicblock-chainlink/src/chainlink/fetch_cloner/tests.rs
- magicblock-api/src/genesis_utils.rs
Force-pushed from 8bbb177 to 5aeab24
Force-pushed from b961ff9 to 709b6d0
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
magicblock-ledger/src/lib.rs (1)
38-49: 🧹 Nitpick | 🔵 Trivial — Add documentation clarifying the slot vs clock.slot distinction in LatestBlockInner::new().
The slot + 1 semantics are intentional and correct: slot represents the just-finalized block, while clock.slot represents the slot currently being processed. The scheduler explicitly uses this pattern (advancing self.slot = block.clock.slot after finalization), confirming the distinction aligns with standard slot progression. However, the constructor lacks documentation explaining this relationship, which can confuse future maintainers. Consider adding a comment above or within the constructor explaining that clock.slot is intentionally set one ahead of the block's slot to represent the next processing slot.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-ledger/src/lib.rs` around lines 38 - 49: Add a clarifying doc comment to LatestBlockInner::new explaining the slot vs clock.slot semantics: state that the function accepts the finalized block's slot (stored in self.slot) while it intentionally sets clock.slot = slot + 1 to represent the next slot being processed by the scheduler; mention that this mirrors the scheduler behavior (advancing self.slot = block.clock.slot after finalization) so readers understand why clock.slot is one ahead. Ensure the comment is placed above the pub fn new(slot: u64, blockhash: Hash, timestamp: i64) -> Self { ... } constructor and references the fields slot and clock.slot for clarity.
magicblock-chainlink/src/chainlink/mod.rs (1)
158-200: ⚠️ Potential issue | 🔴 Critical — Fix the ownership check to use the correct DLP program ID constant.
The code checks feature::ID (Solana's feature program) instead of the DLP program. The error message "Removing non-delegated, non-DLP-owned account" confirms the intent is to preserve DLP-owned accounts, but solana_sdk_ids::feature::ID is the Solana Feature program (used for runtime feature activations), not the DLP program. This logic error causes incorrect account filtering. Replace with dlp_api::id() to match the intended behavior and codebase patterns.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-chainlink/src/chainlink/mod.rs` around lines 158 - 200, In reset_accounts_bank, the ownership check currently compares *account.owner() to feature::ID (Solana feature program) instead of the DLP program ID; update the comparison to use dlp_api::id() so DLP-owned accounts are preserved (locate the closure passed to self.accounts_bank.remove_where in reset_accounts_bank and replace feature::ID with dlp_api::id()).
♻️ Duplicate comments (1)
magicblock-chainlink/src/chainlink/mod.rs (1)
202-210: 🧹 Nitpick | 🔵 Trivial — Minor: the log statement contains a redundant field.
The log statement includes both delegated_not_undelegating = delegated_only and kept_delegated = delegated_only with identical values. Consider removing the duplicate.
🔧 Remove redundant logging field
```diff
 info!(
     total_removed = removed,
     delegated_not_undelegating = delegated_only,
     delegated_and_undelegating = undelegating,
-    kept_delegated = delegated_only,
     kept_blacklisted = blacklisted,
     kept_ephemeral,
+    remaining,
     "Removed accounts from bank"
 );
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-chainlink/src/chainlink/mod.rs` around lines 202 - 210, The info! log in mod.rs is emitting the same value twice by mapping both delegated_not_undelegating = delegated_only and kept_delegated = delegated_only; remove the redundant field by deleting the kept_delegated = delegated_only entry (or, if intended, map kept_delegated to a different variable), i.e., update the info! macro invocation so each structured field (e.g., delegated_not_undelegating, kept_delegated) is unique and not duplicated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.gitignore:
- Line 23: The ignore pattern "magicblock-test-storage*" is too broad; update
the .gitignore entry to scope it to directories only (e.g., replace
"magicblock-test-storage*" with a directory-scoped pattern such as
"magicblock-test-storage/" or "magicblock-test-storage/**") or, if you only mean
ledger storage directories, use a more specific name like
"magicblock-test-storage-ledger/" so non-ledger files with the same prefix
aren’t accidentally ignored.
In `@magicblock-processor/src/scheduler/mod.rs`:
- Around line 473-479: Move the accountsdb slot advancement to occur only after
a successful ledger write when running as primary: in finalize_block, keep the
unconditional path for non-primary nodes as-is, but inside the
coordinator.is_primary() branch call self.ledger.write_block(block) first and
only on success invoke self.accountsdb.set_slot(block.slot); on write failure
log the error (using the existing inspect_err/error! pattern) and do not advance
accountsdb.slot to avoid creating a mismatched accountsdb vs ledger state that
could skip replay on restart; reference finalize_block, self.ledger.write_block,
self.accountsdb.set_slot, and coordinator.is_primary.
In `@magicblock-replicator/src/nats/mod.rs`:
- Around line 42-44: The README's statement claiming "AccountsDb archives (512
GB max)" is now out of sync with the code constant SNAPSHOT_BYTES (set to 1024 *
1024 * 1024 = 1 GiB); update the README text to reflect the new snapshot size
limit (e.g., change the copy to "AccountsDb archives (1 GiB max)" or similar),
and verify any accompanying explanation or unit notation matches the
SNAPSHOT_BYTES value and uses GiB consistently with STREAM_BYTES/SNAPSHOT_BYTES
in the code.
In `@magicblock-replicator/src/service/mod.rs`:
- Line 51: The LEADER_TIMEOUT constant was lowered from 10s to 5s which
increases takeover sensitivity; restore safe behavior by either reverting
LEADER_TIMEOUT back to Duration::from_secs(10) or, better, make it configurable:
replace the hardcoded pub(crate) const LEADER_TIMEOUT with a configurable field
(e.g., Config::leader_timeout) or read from an environment/flag (e.g.,
LEADER_TIMEOUT_SECS) and parse into Duration::from_secs, update all uses that
reference LEADER_TIMEOUT to use the config value, and add a fallback default of
10s plus a comment explaining the choice so timeouts align with heartbeat
intervals.
In `@magicblock-replicator/src/service/standby.rs`:
- Around line 147-154: Fix the typo in the inline comment above the NATS ack
logic: change "boudaries" to "boundaries" in the comment that explains batching
by slot (the comment referencing slot boundaries near the if current_slot <
self.ctx.slot check and the msg.ack().await handling in standby.rs), leaving the
rest of the comment and logic unchanged.
In `@magicblock-replicator/src/tests.rs`:
- Around line 77-85: The test currently only asserts a timeout when an older
snapshot (slot 9) appears, which can pass spuriously; instead after creating the
older snapshot file (snapshot-000000000009.tar.gz) enqueue a newer snapshot
(snapshot-000000000011.tar.gz) and assert that watcher.recv() returns the slot
11 notification (use tokio::time::timeout to await watcher.recv(), unwrap the Ok
value and assert it corresponds to slot 11) to prove the watcher made forward
progress and actually discarded the older snapshot rather than stalling; locate
the code around older_path, the watcher.recv() call, and temp_dir to add the
creation of the slot 11 file and the recv() assertion.
---
Outside diff comments:
In `@magicblock-chainlink/src/chainlink/mod.rs`:
- Around line 158-200: In reset_accounts_bank, the ownership check currently
compares *account.owner() to feature::ID (Solana feature program) instead of the
DLP program ID; update the comparison to use dlp_api::id() so DLP-owned accounts
are preserved (locate the closure passed to self.accounts_bank.remove_where in
reset_accounts_bank and replace feature::ID with dlp_api::id()).
In `@magicblock-ledger/src/lib.rs`:
- Around line 38-49: Add a clarifying doc comment to LatestBlockInner::new
explaining the slot vs clock.slot semantics: state that the function accepts the
finalized block's slot (stored in self.slot) while it intentionally sets
clock.slot = slot + 1 to represent the next slot being processed by the
scheduler; mention that this mirrors the scheduler behavior (advancing self.slot
= block.clock.slot after finalization) so readers understand why clock.slot is
one ahead. Ensure the comment is placed above the pub fn new(slot: u64,
blockhash: Hash, timestamp: i64) -> Self { ... } constructor and references the
fields slot and clock.slot for clarity.
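The invariant the suggested doc comment describes can be sketched like this; the field and constructor shapes are illustrative stand-ins for the real LatestBlockInner in magicblock-ledger:

```rust
/// `slot` holds the just-finalized block, while the clock runs one slot
/// ahead, naming the slot the scheduler is currently processing.
struct LatestBlockInner {
    slot: u64,
    clock_slot: u64,
}

impl LatestBlockInner {
    /// clock_slot = slot + 1: the scheduler advances into the next slot
    /// as soon as the previous block is finalized, mirroring how it sets
    /// self.slot = block.clock.slot after finalization.
    fn new(slot: u64) -> Self {
        Self { slot, clock_slot: slot + 1 }
    }
}

fn main() {
    let latest = LatestBlockInner::new(41);
    assert_eq!(latest.slot, 41);
    assert_eq!(latest.clock_slot, 42);
    println!("ok");
}
```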
---
Duplicate comments:
In `@magicblock-chainlink/src/chainlink/mod.rs`:
- Around line 202-210: The info! log in mod.rs is emitting the same value twice
by mapping both delegated_not_undelegating = delegated_only and kept_delegated =
delegated_only; remove the redundant field by deleting the kept_delegated =
delegated_only entry (or, if intended, map kept_delegated to a different
variable), i.e., update the info! macro invocation so each structured field
(e.g., delegated_not_undelegating, kept_delegated) is unique and not duplicated.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: daee1669-b86b-412e-9cbb-72bc52dc4238
⛔ Files ignored due to path filters (2)
Cargo.lock is excluded by !**/*.lock
test-integration/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (44)
- .gitignore
- .vscode/settings.json
- Cargo.toml
- magicblock-accounts-db/src/lib.rs
- magicblock-accounts-db/src/snapshot.rs
- magicblock-accounts-db/src/traits.rs
- magicblock-aperture/src/lib.rs
- magicblock-aperture/src/requests/http/request_airdrop.rs
- magicblock-aperture/src/state/mod.rs
- magicblock-aperture/src/tests.rs
- magicblock-aperture/tests/mocked.rs
- magicblock-aperture/tests/setup.rs
- magicblock-aperture/tests/transactions.rs
- magicblock-api/src/fund_account.rs
- magicblock-api/src/genesis_utils.rs
- magicblock-api/src/ledger.rs
- magicblock-api/src/magic_validator.rs
- magicblock-chainlink/src/accounts_bank.rs
- magicblock-chainlink/src/chainlink/blacklisted_accounts.rs
- magicblock-chainlink/src/chainlink/fetch_cloner/mod.rs
- magicblock-chainlink/src/chainlink/fetch_cloner/tests.rs
- magicblock-chainlink/src/chainlink/mod.rs
- magicblock-chainlink/tests/utils/test_context.rs
- magicblock-ledger/src/blockstore_processor/mod.rs
- magicblock-ledger/src/lib.rs
- magicblock-ledger/src/store/api.rs
- magicblock-ledger/tests/get_block.rs
- magicblock-ledger/tests/test_ledger_truncator.rs
- magicblock-processor/src/scheduler/mod.rs
- magicblock-processor/src/scheduler/state.rs
- magicblock-processor/tests/scheduling.rs
- magicblock-replicator/Cargo.toml
- magicblock-replicator/src/error.rs
- magicblock-replicator/src/nats/broker.rs
- magicblock-replicator/src/nats/mod.rs
- magicblock-replicator/src/service/context.rs
- magicblock-replicator/src/service/mod.rs
- magicblock-replicator/src/service/primary.rs
- magicblock-replicator/src/service/standby.rs
- magicblock-replicator/src/tests.rs
- magicblock-replicator/src/watcher.rs
- test-integration/test-chainlink/src/ixtest_context.rs
- test-integration/test-chainlink/src/test_context.rs
- test-kit/src/lib.rs
💤 Files with no reviewable changes (9)
- .vscode/settings.json
- magicblock-chainlink/tests/utils/test_context.rs
- test-integration/test-chainlink/src/ixtest_context.rs
- magicblock-processor/tests/scheduling.rs
- magicblock-chainlink/src/chainlink/fetch_cloner/tests.rs
- magicblock-api/src/ledger.rs
- magicblock-api/src/fund_account.rs
- test-integration/test-chainlink/src/test_context.rs
- magicblock-api/src/genesis_utils.rs
Force-pushed from 5aeab24 to e0df3ad (Compare)
Force-pushed from 709b6d0 to b0e4aef (Compare)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
magicblock-chainlink/src/chainlink/blacklisted_accounts.rs (1)
13-31: ⚠️ Potential issue | 🟡 Minor
Make the validator-specific blacklist entry optional.
This helper now always blacklists its argument. Callers that only need the static sysvar/native-program set have to invent a pubkey, and that key gets filtered too. Accept `Option<&Pubkey>` or split the static set from the validator-specific addition.

Possible fix
```diff
-pub fn blacklisted_accounts(validator_id: &Pubkey) -> HashSet<Pubkey> {
+pub fn blacklisted_accounts(
+    validator_id: Option<&Pubkey>,
+) -> HashSet<Pubkey> {
     // This is buried in the accounts_db::native_mint module and we don't
     // want to take a dependency on that crate just for this ID which won't change
     const NATIVE_SOL_ID: Pubkey =
         Pubkey::from_str_const("So11111111111111111111111111111111111111112");
@@
-    blacklisted_accounts.insert(*validator_id);
+    if let Some(validator_id) = validator_id {
+        blacklisted_accounts.insert(*validator_id);
+    }
     blacklisted_accounts
 }
```

Callers with a real validator id can pass `Some(&validator_pubkey)`; utility callers can pass `None`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-chainlink/src/chainlink/blacklisted_accounts.rs` around lines 13 - 31, Change blacklisted_accounts to accept Option<&Pubkey> instead of &Pubkey and only insert the validator id when Some(...) is provided: keep the static set building (sysvar_accounts, native_program_accounts, stake::config::ID, NATIVE_SOL_ID, magic_program::ID, magic_program::MAGIC_CONTEXT_PUBKEY, magic_program::EPHEMERAL_VAULT_PUBKEY) unchanged, and wrap the insertion of the validator pubkey (currently inserting *validator_id) in an if let Some(id) { ... } so callers can pass Some(&validator_pubkey) to add it or None to get only the static set.

magicblock-ledger/src/lib.rs (1)
69-73: 🛠️ Refactor suggestion | 🟠 Major
Guard `LatestBlockInner` invariants before storing it.
`store()` now trusts a caller-built `LatestBlockInner`, but downstream code relies on `block.slot` and `block.clock.slot` having different fixed meanings. Please validate `block.slot.checked_add(1) == Some(block.clock.slot)` here, or hide the fields behind the constructor, so a malformed caller cannot silently desync latest-block state.

🧩 Minimal guard
```diff
 pub fn store(&self, block: LatestBlockInner) {
+    debug_assert_eq!(
+        block.slot.checked_add(1),
+        Some(block.clock.slot),
+        "LatestBlockInner invariant violated: clock.slot must be slot + 1",
+    );
     self.inner.store(block.clone().into());
     // Broadcast the update. It's okay if there are no active listeners.
     let _ = self.notifier.send(block);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-ledger/src/lib.rs` around lines 69 - 73, The store() method currently accepts a caller-constructed LatestBlockInner and stores/broadcasts it without validating invariants; add a guard inside store() to ensure block.slot.checked_add(1) == Some(block.clock.slot) before calling self.inner.store(...) and sending via self.notifier.send(...), and if the check fails either return an Err / early-return (or panic/log and skip storing) so malformed LatestBlockInner values cannot desynchronize state; alternatively make LatestBlockInner fields private and force construction through a validated constructor/new that enforces the same invariant and then call that validated constructor from callers instead of trusting external construction.

magicblock-accounts-db/src/lib.rs (1)
439-442: ⚠️ Potential issue | 🟠 Major
Propagate index scan failures from `remove_where()`.
Lines 439-442 currently route through `iter_all()`, and `iter_all()` turns `get_all_accounts()` failures into an empty iterator. That makes `remove_where()` return `Ok(0)` on index read failure, so callers like `Chainlink::reset_accounts_bank()` silently skip cleanup and keep stale state behind.

Suggested fix
```diff
     let to_remove = self
-        .iter_all()
-        .filter_map(|(pk, acc)| predicate(&pk, &acc).then_some(pk))
+        .index
+        .get_all_accounts()?
+        .into_iter()
+        .filter_map(|(offset, pk)| {
+            let acc = self.storage.read_account(offset);
+            predicate(&pk, &acc).then_some(pk)
+        })
         .collect::<Vec<_>>();
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-accounts-db/src/lib.rs` around lines 439 - 442, remove_where currently calls iter_all which swallows get_all_accounts errors and makes remove_where return Ok(0) on index scan failures; change remove_where to propagate index-scan errors by calling get_all_accounts (or updating iter_all to return a Result iterator) and returning Err when get_all_accounts fails instead of treating it as an empty iterator—ensure remove_where (and callers like Chainlink::reset_accounts_bank) receive and propagate the original error from get_all_accounts so failures aren’t masked.
♻️ Duplicate comments (8)
.gitignore (1)
23-23: ⚠️ Potential issue | 🟡 Minor
Scope the storage ignore pattern to directories only.
Line 23 (`magicblock-test-storage*`) is overly broad and can ignore unintended files/directories sharing the prefix. Restrict it to directory scope (for example `magicblock-test-storage*/` or a more specific directory name).

Suggested change
```diff
-magicblock-test-storage*
+magicblock-test-storage*/
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.gitignore at line 23, The .gitignore entry "magicblock-test-storage*" is too broad and may exclude files that only share the prefix; narrow it to directory scope by replacing that pattern with a directory-scoped pattern (e.g., "magicblock-test-storage*/" or the exact test storage directory name) so only directories named with that prefix are ignored; update the entry referencing "magicblock-test-storage*" accordingly.

magicblock-replicator/src/watcher.rs (2)
121-123: ⚠️ Potential issue | 🟠 Major
Retry exhaustion currently drops snapshots without any signal.
At Line 121, failure from `open_with_retry` just `continue`s; at Line 129, `open_with_retry` returns `None` silently after all retries. This can hide missed snapshots and is hard to operate/debug.

Suggested patch
```diff
 async fn open_with_retry(path: &Path) -> Option<File> {
+    let mut last_err = None;
     for _ in 0..OPEN_RETRIES {
-        if let Ok(file) = File::open(path).await {
-            return Some(file);
-        }
+        match File::open(path).await {
+            Ok(file) => return Some(file),
+            Err(err) => last_err = Some(err),
+        }
         tokio::time::sleep(OPEN_RETRY_DELAY).await;
     }
+    error!(
+        path = %path.display(),
+        error = ?last_err,
+        retries = OPEN_RETRIES,
+        "Failed to open snapshot after retries"
+    );
     None
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-replicator/src/watcher.rs` around lines 121 - 123, When open_with_retry(&path) returns None in watcher.rs (the call inside the loop) do not silently continue; instead surface the failure by logging an error with the path and retry details and either incrementing a missed-snapshot metric or sending a notification/error via the existing error channel so operators can see missed snapshots; update both places that handle the None return from open_with_retry to include a descriptive processLogger/error log (including path and retry info) and to propagate or record the failure (e.g., via the watcher’s error/metrics channel) rather than dropping the snapshot silently.
60-62:⚠️ Potential issue | 🟠 MajorDon’t silently swallow watcher backend errors.
At Line 60,
Errfromnotifyis dropped with no log. This hides overflow/backend faults and makes snapshot loss opaque.Suggested patch
- let Ok(event) = res else { - return; - }; + let event = match res { + Ok(event) => event, + Err(err) => { + error!(error = ?err, "Snapshot watcher backend error"); + return; + } + };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-replicator/src/watcher.rs` around lines 60 - 62, The code currently swallows errors from the notify watcher by using `let Ok(event) = res else { return; };` which returns silently on `Err`; change this to log the error (include the `res`/`Err` value) and continue the loop instead of returning so watcher backend errors (e.g., overflow) are visible; update the block in watcher.rs around the `res`/`event` handling to call the project's logger (or `log::error!`) with a descriptive message and the error detail, then continue to the next iteration rather than exiting the function.

magicblock-replicator/src/tests.rs (1)
83-85: ⚠️ Potential issue | 🟡 Minor
Strengthen this assertion to prove forward progress, not just timeout behavior.
At Line 83, a timeout after writing slot `9` can also happen if `watcher.recv()` is stalled. Add a subsequent slot `11` write and assert `recv()` returns `11` to verify older snapshots are ignored while newer ones still flow.

Suggested patch
```diff
     tokio::time::timeout(Duration::from_millis(300), watcher.recv())
         .await
         .expect_err("older snapshot should be ignored");
+
+    let newest_path = temp_dir.path().join("snapshot-000000000011.tar.gz");
+    std::fs::File::create(&newest_path)
+        .unwrap()
+        .write_all(b"newest")
+        .unwrap();
+
+    let (_, slot) =
+        tokio::time::timeout(Duration::from_secs(2), watcher.recv())
+            .await
+            .expect("Timeout waiting for newer snapshot")
+            .expect("Channel closed");
+    assert_eq!(slot, 11);
```
Verify each finding against the current code and only fix it if needed. In `@magicblock-replicator/src/tests.rs` around lines 83 - 85, The current test only asserts a timeout on tokio::time::timeout(Duration::from_millis(300), watcher.recv()) which could be caused by a stalled watcher; after triggering the older snapshot (slot 9) you should write a newer snapshot (slot 11) and then call watcher.recv() again and assert it yields slot 11 to prove forward progress and that older snapshots are ignored; locate the test block containing watcher.recv() and the slot 9 write, perform the write for slot 11 using the same helper used earlier for writing slots, then replace the simple timeout-based expect_err with a sequence that times out for the older snapshot and then asserts the next recv() returns the newer slot value (11).magicblock-accounts-db/src/snapshot.rs (1)
153-160: ⚠️ Potential issue | 🟠 Major
`rename()` still is not crash-safe without syncing the parent directory.
Lines 154 and 209 sync the file, then rename it into place, but the new directory entry is still vulnerable until the containing directory is fsynced. A crash in that window can lose the archive entirely.
Run this to verify the two rename paths still lack a post-rename directory sync:
```bash
#!/bin/bash
set -euo pipefail
echo "=== Snapshot rename paths ==="
rg -n -C3 'archive_and_register|insert_external_snapshot|fs::rename|sync_all\(' magicblock-accounts-db/src/snapshot.rs
echo
echo "=== Parent-directory sync helpers in magicblock-accounts-db ==="
rg -n -C2 'sync_parent|parent.*sync_all|OpenOptions::new\(\)\.read\(\)\.open\(' magicblock-accounts-db --type rust
```
magicblock-accounts-db/src/snapshot.rs, but there is no helper or follow-up parent-directorysync_all()after those renames.Also applies to: 209-215
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-accounts-db/src/snapshot.rs` around lines 153 - 160, The rename operation using fs::rename(&tmp_path, &archive_path) is not crash-safe because the parent directory isn't fsynced after creating the new directory entry; update both call sites (the snapshot creation path in archive_and_register and the external path in insert_external_snapshot where tmp_path/archive_path are used) to open the parent directory (e.g., OpenOptions::new().read(true).open(parent_dir)) and call sync_all() on that directory after the fs::rename succeeds (and before returning), ensuring errors are propagated via the existing log_err/? flow so the rename + parent dir sync make the operation durable.magicblock-aperture/src/lib.rs (1)
53-58: ⚠️ Potential issue | 🟠 Major
Guard the WS port derivation against `u16` overflow.
Line 57 does unchecked `u16` arithmetic. If the HTTP port is `65535`, debug builds panic and release builds wrap the WS port to `0`.

Possible fix
65535:#!/bin/bash set -euo pipefail echo "=== WebSocket port derivation ===" rg -n -C2 'set_port\(|port\(\)\s*\+\s*1|checked_add\(' magicblock-aperture/src/lib.rs echo echo "=== Aperture listen-port validation ===" rg -n -C3 'ApertureConfig|fn validate|65535|checked_add' magicblock-aperture magicblock-config --type rustExpected result:
magicblock-aperture/src/lib.rsstill showsport() + 1, and the config search does not show a validation path that rejects65535or otherwise guards this addition.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-aperture/src/lib.rs` around lines 53 - 58, The current WS port derivation does unchecked u16 arithmetic (ws_addr.set_port(config.listen.0.port() + 1)), which can overflow for port 65535; change this to perform a checked add on config.listen.0.port() (use port().checked_add(1)) and handle the None case explicitly (e.g. return an Err or log/fail with a clear message) instead of letting it wrap/panic; update the ws_addr.set_port call to use the checked result and reference the existing variables/ws fields (ws_addr, http_addr, config.listen) so the overflow path is guarded.magicblock-api/src/magic_validator.rs (1)
612-614: ⚠️ Potential issue | 🟠 Major
`accountsdb_slot > ledger_slot` is not safe to skip.
Equality is fine, but a strictly ahead `AccountsDb` leaves `ledger.latest_block()` stale. This file already snapshots it on Line 176 and seeds `build_svm_env` from it on Line 339, so returning here can start execution and RPC with the wrong slot/blockhash. Only skip on equality; if `AccountsDb` is ahead, fast-forward the ledger too or fail startup explicitly.
473-478: ⚠️ Potential issue | 🟠 Major
Persist the block before advancing `accountsdb.slot()`.
Line 474 runs even when `write_block()` fails. Because the replicator now seeds restart state from `accountsdb.slot()`, this can leave recovery one slot ahead of the durable ledger and skip replay of the failed block.

Suggested fix
Verify each finding against the current code and only fix it if needed. In `@magicblock-processor/src/scheduler/mod.rs` around lines 473 - 478, finalize_block advances accountsdb.slot() before ensuring the block was durably written, which can leave recovery ahead of the ledger if write_block fails; change the flow in finalize_block so that when coordinator.is_primary() you first attempt ledger.write_block(block) and only on successful completion set accountsdb.set_slot(block.slot), logging errors on failure (do not advance the slot on Err). Use the existing symbols ledger.write_block, accountsdb.set_slot, and coordinator.is_primary to locate the code and move the set_slot call to the write_block success path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@magicblock-chainlink/src/chainlink/mod.rs`:
- Around line 190-195: The error! log inside the cleanup path (when
should_remove is true) is emitting a full formatted account payload via
format!("{account:#?}"), which will allocate and flood logs; change this to a
lower-severity log and avoid the expensive full-formatting: replace the error!
call with debug! (or info! if you need visibility) and log the account with a
cheap representation (e.g. account = ?account or omit the expanded debug print)
and keep pubkey = %pubkey so the removal is recorded without allocating the full
account payload in the error path.
In `@magicblock-processor/src/scheduler/mod.rs`:
- Around line 439-445: The current timestamp construction uses
SystemTime::now().duration_since(UNIX_EPOCH).unwrap(), which will panic on clock
skew; change it to handle the Result and provide a safe fallback instead: call
SystemTime::now().duration_since(UNIX_EPOCH) and match on Ok(dur) to use
dur.as_secs() as i64, and on Err(_) log or silently handle the error and use a
deterministic fallback timestamp (e.g., 0 or previous timestamp) before
constructing LatestBlockInner::new(self.slot, blockhash, timestamp) so the
scheduler doesn't panic on backwards system clocks.
In `@magicblock-replicator/src/service/standby.rs`:
- Around line 147-154: The current batch-ack-on-slot-boundary logic can hit
JetStream's max_ack_pending limit; update the logic in standby.rs (around the
block using current_slot, self.ctx.slot and msg.ack()) to implement a
per-message ACK fallback: keep a counter (e.g., self.pending_ack_count) that
increments for every intra-slot message you do not immediately ack, and if it
reaches or exceeds the consumer's max_ack_pending threshold (512) call
msg.ack().await for that message (and reset/decrement the counter as
appropriate); otherwise continue to defer ack until the slot boundary and still
log any ack errors via warn!(%error, ...). Ensure the counter is part of the
same struct (e.g., StandbyService or self) and is updated
atomically/consistently with slot changes so you still ack the remaining batch
on slot boundary.
---
Outside diff comments:
In `@magicblock-accounts-db/src/lib.rs`:
- Around line 439-442: remove_where currently calls iter_all which swallows
get_all_accounts errors and makes remove_where return Ok(0) on index scan
failures; change remove_where to propagate index-scan errors by calling
get_all_accounts (or updating iter_all to return a Result iterator) and
returning Err when get_all_accounts fails instead of treating it as an empty
iterator—ensure remove_where (and callers like Chainlink::reset_accounts_bank)
receive and propagate the original error from get_all_accounts so failures
aren’t masked.
In `@magicblock-chainlink/src/chainlink/blacklisted_accounts.rs`:
- Around line 13-31: Change blacklisted_accounts to accept Option<&Pubkey>
instead of &Pubkey and only insert the validator id when Some(...) is provided:
keep the static set building (sysvar_accounts, native_program_accounts,
stake::config::ID, NATIVE_SOL_ID, magic_program::ID,
magic_program::MAGIC_CONTEXT_PUBKEY, magic_program::EPHEMERAL_VAULT_PUBKEY)
unchanged, and wrap the insertion of the validator pubkey (currently inserting
*validator_id) in an if let Some(id) { ... } so callers can pass
Some(&validator_pubkey) to add it or None to get only the static set.
In `@magicblock-ledger/src/lib.rs`:
- Around line 69-73: The store() method currently accepts a caller-constructed
LatestBlockInner and stores/broadcasts it without validating invariants; add a
guard inside store() to ensure block.slot.checked_add(1) ==
Some(block.clock.slot) before calling self.inner.store(...) and sending via
self.notifier.send(...), and if the check fails either return an Err /
early-return (or panic/log and skip storing) so malformed LatestBlockInner
values cannot desynchronize state; alternatively make LatestBlockInner fields
private and force construction through a validated constructor/new that enforces
the same invariant and then call that validated constructor from callers instead
of trusting external construction.
---
Duplicate comments:
In @.gitignore:
- Line 23: The .gitignore entry "magicblock-test-storage*" is too broad and may
exclude files that only share the prefix; narrow it to directories only by
replacing that pattern with a directory-scoped pattern (e.g.,
"magicblock-test-storage*/" or the exact test storage directory name) so only
directories named with that prefix are ignored; update the entry referencing
"magicblock-test-storage*" accordingly.
In `@magicblock-accounts-db/src/snapshot.rs`:
- Around line 153-160: The rename operation using fs::rename(&tmp_path,
&archive_path) is not crash-safe because the parent directory isn't fsynced
after creating the new directory entry; update both call sites (the snapshot
creation path in archive_and_register and the external path in
insert_external_snapshot where tmp_path/archive_path are used) to open the
parent directory (e.g., OpenOptions::new().read(true).open(parent_dir)) and call
sync_all() on that directory after the fs::rename succeeds (and before
returning), ensuring errors are propagated via the existing log_err/? flow so
the rename + parent dir sync make the operation durable.
In `@magicblock-aperture/src/lib.rs`:
- Around line 53-58: The current WS port derivation does unchecked u16
arithmetic (ws_addr.set_port(config.listen.0.port() + 1)), which can overflow
for port 65535; change this to perform a checked add on config.listen.0.port()
(use port().checked_add(1)) and handle the None case explicitly (e.g. return an
Err or log/fail with a clear message) instead of letting it wrap/panic; update
the ws_addr.set_port call to use the checked result and reference the existing
variables/ws fields (ws_addr, http_addr, config.listen) so the overflow path is
guarded.
In `@magicblock-processor/src/scheduler/mod.rs`:
- Around line 473-478: finalize_block advances accountsdb.slot() before ensuring
the block was durably written, which can leave recovery ahead of the ledger if
write_block fails; change the flow in finalize_block so that when
coordinator.is_primary() you first attempt ledger.write_block(block) and only on
successful completion set accountsdb.set_slot(block.slot), logging errors on
failure (do not advance the slot on Err). Use the existing symbols
ledger.write_block, accountsdb.set_slot, and coordinator.is_primary to locate
the code and move the set_slot call to the write_block success path.
In `@magicblock-replicator/src/tests.rs`:
- Around line 83-85: The current test only asserts a timeout on
tokio::time::timeout(Duration::from_millis(300), watcher.recv()) which could be
caused by a stalled watcher; after triggering the older snapshot (slot 9) you
should write a newer snapshot (slot 11) and then call watcher.recv() again and
assert it yields slot 11 to prove forward progress and that older snapshots are
ignored; locate the test block containing watcher.recv() and the slot 9 write,
perform the write for slot 11 using the same helper used earlier for writing
slots, then replace the simple timeout-based expect_err with a sequence that
times out for the older snapshot and then asserts the next recv() returns the
newer slot value (11).
In `@magicblock-replicator/src/watcher.rs`:
- Around line 121-123: When open_with_retry(&path) returns None in watcher.rs
(the call inside the loop) do not silently continue; instead surface the failure
by logging an error with the path and retry details and either incrementing a
missed-snapshot metric or sending a notification/error via the existing error
channel so operators can see missed snapshots; update both places that handle
the None return from open_with_retry to include a descriptive
processLogger/error log (including path and retry info) and to propagate or
record the failure (e.g., via the watcher’s error/metrics channel) rather than
dropping the snapshot silently.
- Around line 60-62: The code currently swallows errors from the notify watcher
by using `let Ok(event) = res else { return; };` which returns silently on
`Err`; change this to log the error (include the `res`/`Err` value) and continue
the loop instead of returning so watcher backend errors (e.g., overflow) are
visible; update the block in watcher.rs around the `res`/`event` handling to
call the project's logger (or `log::error!`) with a descriptive message and the
error detail, then continue to the next iteration rather than exiting the
function.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3fd761f6-fee6-4462-b3d5-d0ad40a2c5ea
⛔ Files ignored due to path filters (2)
`Cargo.lock` is excluded by `!**/*.lock`
`test-integration/Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (44)
`.gitignore`, `.vscode/settings.json`, `Cargo.toml`, `magicblock-accounts-db/src/lib.rs`, `magicblock-accounts-db/src/snapshot.rs`, `magicblock-accounts-db/src/traits.rs`, `magicblock-aperture/src/lib.rs`, `magicblock-aperture/src/requests/http/request_airdrop.rs`, `magicblock-aperture/src/state/mod.rs`, `magicblock-aperture/src/tests.rs`, `magicblock-aperture/tests/mocked.rs`, `magicblock-aperture/tests/setup.rs`, `magicblock-aperture/tests/transactions.rs`, `magicblock-api/src/fund_account.rs`, `magicblock-api/src/genesis_utils.rs`, `magicblock-api/src/ledger.rs`, `magicblock-api/src/magic_validator.rs`, `magicblock-chainlink/src/accounts_bank.rs`, `magicblock-chainlink/src/chainlink/blacklisted_accounts.rs`, `magicblock-chainlink/src/chainlink/fetch_cloner/mod.rs`, `magicblock-chainlink/src/chainlink/fetch_cloner/tests.rs`, `magicblock-chainlink/src/chainlink/mod.rs`, `magicblock-chainlink/tests/utils/test_context.rs`, `magicblock-ledger/src/blockstore_processor/mod.rs`, `magicblock-ledger/src/lib.rs`, `magicblock-ledger/src/store/api.rs`, `magicblock-ledger/tests/get_block.rs`, `magicblock-ledger/tests/test_ledger_truncator.rs`, `magicblock-processor/src/scheduler/mod.rs`, `magicblock-processor/src/scheduler/state.rs`, `magicblock-processor/tests/scheduling.rs`, `magicblock-replicator/Cargo.toml`, `magicblock-replicator/src/error.rs`, `magicblock-replicator/src/nats/broker.rs`, `magicblock-replicator/src/nats/mod.rs`, `magicblock-replicator/src/service/context.rs`, `magicblock-replicator/src/service/mod.rs`, `magicblock-replicator/src/service/primary.rs`, `magicblock-replicator/src/service/standby.rs`, `magicblock-replicator/src/tests.rs`, `magicblock-replicator/src/watcher.rs`, `test-integration/test-chainlink/src/ixtest_context.rs`, `test-integration/test-chainlink/src/test_context.rs`, `test-kit/src/lib.rs`
💤 Files with no reviewable changes (9)
- .vscode/settings.json
- test-integration/test-chainlink/src/ixtest_context.rs
- test-integration/test-chainlink/src/test_context.rs
- magicblock-processor/tests/scheduling.rs
- magicblock-chainlink/src/chainlink/fetch_cloner/tests.rs
- magicblock-api/src/fund_account.rs
- magicblock-api/src/ledger.rs
- magicblock-api/src/genesis_utils.rs
- magicblock-chainlink/tests/utils/test_context.rs
Force-pushed from b0e4aef to 6553ce1 (Compare)
Force-pushed from e0df3ad to 0b84ef5 (Compare)
Force-pushed from 6553ce1 to ed5d218 (Compare)
Force-pushed from 0b84ef5 to 6836a20 (Compare)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
magicblock-replicator/src/service/mod.rs (1)
140-148: 🧹 Nitpick | 🔵 Trivial
Consider justifying the `.expect()` with an invariant comment.
The `.expect()` on runtime creation is acceptable since the service cannot function without a runtime; this is an unrecoverable startup failure. However, per coding guidelines, consider adding a brief comment explaining why panic is appropriate here.

📝 Suggested clarification
```diff
 pub fn spawn(self) -> JoinHandle<Result<()>> {
     std::thread::spawn(move || {
+        // Runtime creation failure is unrecoverable; the service requires an async runtime.
         let runtime = Builder::new_current_thread()
             .enable_all()
             .thread_name("replication-service")
             .build()
             .expect("Failed to build replication service runtime");
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-replicator/src/service/mod.rs` around lines 140 - 148, The call to Builder::new_current_thread().build().expect(...) in the thread spawned by std::thread::spawn can legitimately panic on failure but lacks an invariant comment; add a short justification comment immediately before the expect (near Builder::new_current_thread / runtime creation) stating that building the Tokio runtime is an unrecoverable startup failure for the replication service so a panic is intentional (the service cannot run without a runtime), leaving runtime.block_on(self.run()) unchanged.
♻️ Duplicate comments (5)
.gitignore (1)
23-23: ⚠️ Potential issue | 🟡 Minor
Narrow the storage ignore pattern to directory scope.
Line 23 uses `magicblock-test-storage*`, which is broad and can hide unrelated paths sharing that prefix. Prefer a directory-scoped rule.

Suggested change
```diff
-magicblock-test-storage*
+magicblock-test-storage*/
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.gitignore at line 23, The .gitignore entry using the pattern "magicblock-test-storage*" is too broad and may exclude unrelated files; update the rule to target the directory scope by replacing the glob with a directory-specific pattern such as "magicblock-test-storage/" (or "magicblock-test-storage/**" if you need recursive matching) so only the directory and its contents are ignored rather than any path that starts with that prefix.

magicblock-replicator/src/tests.rs (1)
77-85: 🧹 Nitpick | 🔵 Trivial
Test could be strengthened to prove forward progress.
The timeout-only assertion (expecting error when slot 9 isn't delivered) can pass spuriously if the watcher stalls entirely. Adding a subsequent higher slot (e.g., slot 11) and asserting it's received would confirm the watcher actively ignored slot 9 rather than stopping.
🔧 Suggested improvement
```diff
     tokio::time::timeout(Duration::from_millis(300), watcher.recv())
         .await
         .expect_err("older snapshot should be ignored");
+
+    // Prove forward progress by ensuring a newer slot is still delivered
+    let newest_path = temp_dir.path().join("snapshot-000000000011.tar.gz");
+    std::fs::File::create(&newest_path)
+        .unwrap()
+        .write_all(b"newest")
+        .unwrap();
+
+    let (_, slot) =
+        tokio::time::timeout(Duration::from_secs(2), watcher.recv())
+            .await
+            .expect("Timeout waiting for newer snapshot")
+            .expect("Channel closed");
+    assert_eq!(slot, 11);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-replicator/src/tests.rs` around lines 77 - 85, The test currently only asserts that a timeout occurs for the older snapshot (created at older_path and expecting tokio::time::timeout(..., watcher.recv()) to fail), which can pass if the watcher stalls; update the test to also create a later snapshot (e.g., "snapshot-000000000011.tar.gz"), write to it, and then call watcher.recv() and assert the later slot is successfully received to prove forward progress. Specifically: after the timeout/assert_err for slot 9, create and write the slot 11 file, then await watcher.recv() (or similar receive method used in the test) and assert the received event corresponds to slot 11, ensuring the watcher ignored slot 9 but continued processing.

magicblock-replicator/src/service/standby.rs (1)
147-154: ⚠️ Potential issue | 🟠 Major
Batch ACK strategy can stall delivery when slots exceed 512 messages.
With `AckPolicy::All` and `max_ack_pending = 512` in the JetStream consumer configuration (see `magicblock-replicator/src/nats/consumer.rs`), this ACK-on-slot-boundary strategy leaves all intra-slot messages unacknowledged. When a slot contains more than 512 messages, the pending count hits the limit and JetStream stops delivering new messages until the slot boundary triggers the batch ACK.

Consider adding a fallback that ACKs when approaching the pending limit:
Suggested approach
```diff
+    // Track pending un-acked messages
+    pending_acks: u32,
+}
+
+const ACK_PENDING_THRESHOLD: u32 = 500; // Leave headroom below MAX_ACK_PENDING (512)
```

Then in `handle_message`:

```diff
 self.ctx.update_position(slot, index);
+self.pending_acks += 1;
 // NOTE:
 // for performance reasons we batch messages from NATS and ack the
 // entire batch on slot boundaries, instead of on every message
-if current_slot < self.ctx.slot {
+if current_slot < self.ctx.slot || self.pending_acks >= ACK_PENDING_THRESHOLD {
     if let Err(error) = msg.ack().await {
         warn!(%error, "failed to ack nats message");
     }
+    self.pending_acks = 0;
 }
```
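The threshold fallback can be exercised in isolation. A minimal sketch, assuming a 500-message threshold below the 512 `max_ack_pending` limit; the `Acker` type is illustrative and not the service's real state:

```rust
/// Illustrative model of batch-acking with a pending-count fallback, so a
/// slot larger than `max_ack_pending` cannot stall delivery.
pub struct Acker {
    pub pending: u32, // messages received but not yet acked
    pub acked: u32,   // messages acknowledged so far
}

pub const ACK_PENDING_THRESHOLD: u32 = 500; // headroom below max_ack_pending = 512

impl Acker {
    pub fn new() -> Self {
        Acker { pending: 0, acked: 0 }
    }

    /// Called once per message; `slot_boundary` is true when the slot advanced.
    pub fn on_message(&mut self, slot_boundary: bool) {
        self.pending += 1;
        if slot_boundary || self.pending >= ACK_PENDING_THRESHOLD {
            // With AckPolicy::All, acking one message acks the whole batch.
            self.acked += self.pending;
            self.pending = 0;
        }
    }
}

fn main() {
    let mut a = Acker::new();
    // One oversized slot: 600 messages with no slot boundary.
    for _ in 0..600 {
        a.on_message(false);
    }
    // The fallback acked once mid-slot, keeping pending under the limit.
    assert_eq!(a.acked, 500);
    assert_eq!(a.pending, 100);
    println!("acked={} pending={}", a.acked, a.pending);
}
```

Without the fallback, the same 600-message slot would leave 600 un-acked messages against a 512-message cap, which is the stall described above.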
Verify each finding against the current code and only fix it if needed. In `@magicblock-replicator/src/service/standby.rs` around lines 147 - 154, The batch ACK-on-slot-boundary (in handle_message where current_slot < self.ctx.slot and msg.ack().await is called) leaves intra-slot messages unacknowledged and can stall JetStream when max_ack_pending (consumer config in consumer.rs) is hit (default 512); modify handle_message to track pending_acks count (or query consumer's pending status) and perform a fallback ack of the current message (call msg.ack().await) when pending_acks approaches the max_ack_pending threshold (e.g., >= 480) in addition to the existing slot-boundary ack logic, and ensure errors from msg.ack().await are logged as before.

magicblock-chainlink/src/chainlink/mod.rs (2)
190-195: ⚠️ Potential issue | 🟠 Major
Downgrade and slim this removal log.
Lines 191-194 can run once per removed account during normal bank reset. `error!` plus `format!("{account:#?}")` will flood logs, allocate the full payload, and the message is stale now that the guard is `owner != feature::ID`.
🔧 Suggested change
```diff
-        error!(
-            pubkey = %pubkey,
-            account=%format!("{account:#?}"),
-            "Removing non-delegated, non-DLP-owned account"
-        );
+        debug!(
+            pubkey = %pubkey,
+            owner = %account.owner(),
+            delegated = account.delegated(),
+            undelegating = account.undelegating(),
+            ephemeral = account.ephemeral(),
+            confined = account.confined(),
+            "Removing non-delegated account during bank reset"
+        )
```
Verify each finding against the current code and only fix it if needed. In `@magicblock-chainlink/src/chainlink/mod.rs` around lines 190 - 195, The removal log is too noisy and allocates a full pretty-printed account; change the error! call inside the should_remove branch to a lower log level (e.g., debug! or warn!) and avoid format!("{account:#?}") by logging only lightweight fields (e.g., pubkey and account.owner or account.pubkey). Update the message to reflect the current guard (owner != feature::ID) and remove the heavy formatting to prevent allocations when many accounts are removed.
203-208: 🧹 Nitpick | 🔵 Trivial
Drop the duplicated `kept_delegated` field.
`delegated_not_undelegating` and `kept_delegated` are both logging `delegated_only`, so the second field doesn't add any signal.
🔧 Suggested change
```diff
     info!(
         total_removed = removed,
         delegated_not_undelegating = delegated_only,
         delegated_and_undelegating = undelegating,
-        kept_delegated = delegated_only,
         kept_blacklisted = blacklisted,
         kept_ephemeral,
         "Removed accounts from bank"
     );
```
Verify each finding against the current code and only fix it if needed. In `@magicblock-chainlink/src/chainlink/mod.rs` around lines 203 - 208, The struct/log construction contains a duplicated field: both delegated_not_undelegating and kept_delegated are set to delegated_only; remove the redundant kept_delegated entry so only delegated_not_undelegating = delegated_only remains; update the surrounding comma/formatting if needed to keep the list valid (see the fields total_removed, delegated_not_undelegating, delegated_and_undelegating, kept_blacklisted, kept_ephemeral for context).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@magicblock-aperture/src/state/mod.rs`:
- Around line 66-75: The manual impl Clone for NodeContext (the clone method
copying identity, base_fee, featureset, blocktime) is unnecessary; remove the
impl Clone block and add #[derive(Clone)] to the NodeContext struct declaration
so the compiler-derived Clone is used for fields identity, base_fee, featureset,
and blocktime.
In `@magicblock-ledger/src/lib.rs`:
- Around line 38-41: LatestBlockInner::new sets Clock.slot to slot + 1 while
LatestBlockInner.slot remains slot, causing inconsistent slot values; update the
Clock initialization inside LatestBlockInner::new so Clock.slot matches
LatestBlockInner.slot (use the same slot value instead of slot + 1) to remove
the off-by-one discrepancy between Clock.slot and LatestBlockInner.slot.
In `@magicblock-replicator/src/nats/mod.rs`:
- Around line 42-44: Update the size-limit comment above the constants
STREAM_BYTES and SNAPSHOT_BYTES to use binary unit notation "GiB" instead of
"GB" (i.e., "256 GiB stream, 1 GiB snapshots") to match the 1024^3 calculations
in the constants; modify only the comment text so it accurately reflects the
units used by the constants.
---
Outside diff comments:
In `@magicblock-replicator/src/service/mod.rs`:
- Around line 140-148: The call to
Builder::new_current_thread().build().expect(...) in the thread spawned by
std::thread::spawn can legitimately panic on failure but lacks an invariant
comment; add a short justification comment immediately before the expect (near
Builder::new_current_thread / runtime creation) stating that building the Tokio
runtime is an unrecoverable startup failure for the replication service so a
panic is intentional (the service cannot run without a runtime), leaving
runtime.block_on(self.run()) unchanged.
---
Duplicate comments:
In @.gitignore:
- Line 23: The .gitignore entry using the pattern "magicblock-test-storage*" is
too broad and may exclude unrelated files; update the rule to target the
directory scope by replacing the glob with a directory-specific pattern such as
"magicblock-test-storage/" (or "magicblock-test-storage/**" if you need
recursive matching) so only the directory and its contents are ignored rather
than any path that starts with that prefix.
In `@magicblock-chainlink/src/chainlink/mod.rs`:
- Around line 190-195: The removal log is too noisy and allocates a full
pretty-printed account; change the error! call inside the should_remove branch
to a lower log level (e.g., debug! or warn!) and avoid format!("{account:#?}")
by logging only lightweight fields (e.g., pubkey and account.owner or
account.pubkey). Update the message to reflect the current guard (owner !=
feature::ID) and remove the heavy formatting to prevent allocations when many
accounts are removed.
- Around line 203-208: The struct/log construction contains a duplicated field:
both delegated_not_undelegating and kept_delegated are set to delegated_only;
remove the redundant kept_delegated entry so only delegated_not_undelegating =
delegated_only remains; update the surrounding comma/formatting if needed to
keep the list valid (see the fields total_removed, delegated_not_undelegating,
delegated_and_undelegating, kept_blacklisted, kept_ephemeral for context).
In `@magicblock-replicator/src/service/standby.rs`:
- Around line 147-154: The batch ACK-on-slot-boundary (in handle_message where
current_slot < self.ctx.slot and msg.ack().await is called) leaves intra-slot
messages unacknowledged and can stall JetStream when max_ack_pending (consumer
config in consumer.rs) is hit (default 512); modify handle_message to track
pending_acks count (or query consumer's pending status) and perform a fallback
ack of the current message (call msg.ack().await) when pending_acks approaches
the max_ack_pending threshold (e.g., >= 480) in addition to the existing
slot-boundary ack logic, and ensure errors from msg.ack().await are logged as
before.
In `@magicblock-replicator/src/tests.rs`:
- Around line 77-85: The test currently only asserts that a timeout occurs for
the older snapshot (created at older_path and expecting
tokio::time::timeout(..., watcher.recv()) to fail), which can pass if the
watcher stalls; update the test to also create a later snapshot (e.g.,
"snapshot-000000000011.tar.gz"), write to it, and then call watcher.recv() and
assert the later slot is successfully received to prove forward progress.
Specifically: after the timeout/assert_err for slot 9, create and write the slot
11 file, then await watcher.recv() (or similar receive method used in the test)
and assert the received event corresponds to slot 11, ensuring the watcher
ignored slot 9 but continued processing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 09b5a899-a990-46c0-93c7-70a916935733
⛔ Files ignored due to path filters (2)
- `Cargo.lock` is excluded by `!**/*.lock`
- `test-integration/Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (44)
- .gitignore
- .vscode/settings.json
- Cargo.toml
- magicblock-accounts-db/src/lib.rs
- magicblock-accounts-db/src/snapshot.rs
- magicblock-accounts-db/src/traits.rs
- magicblock-aperture/src/lib.rs
- magicblock-aperture/src/requests/http/request_airdrop.rs
- magicblock-aperture/src/state/mod.rs
- magicblock-aperture/src/tests.rs
- magicblock-aperture/tests/mocked.rs
- magicblock-aperture/tests/setup.rs
- magicblock-aperture/tests/transactions.rs
- magicblock-api/src/fund_account.rs
- magicblock-api/src/genesis_utils.rs
- magicblock-api/src/ledger.rs
- magicblock-api/src/magic_validator.rs
- magicblock-chainlink/src/accounts_bank.rs
- magicblock-chainlink/src/chainlink/blacklisted_accounts.rs
- magicblock-chainlink/src/chainlink/fetch_cloner/mod.rs
- magicblock-chainlink/src/chainlink/fetch_cloner/tests.rs
- magicblock-chainlink/src/chainlink/mod.rs
- magicblock-chainlink/tests/utils/test_context.rs
- magicblock-ledger/src/blockstore_processor/mod.rs
- magicblock-ledger/src/lib.rs
- magicblock-ledger/src/store/api.rs
- magicblock-ledger/tests/get_block.rs
- magicblock-ledger/tests/test_ledger_truncator.rs
- magicblock-processor/src/scheduler/mod.rs
- magicblock-processor/src/scheduler/state.rs
- magicblock-processor/tests/scheduling.rs
- magicblock-replicator/Cargo.toml
- magicblock-replicator/src/error.rs
- magicblock-replicator/src/nats/broker.rs
- magicblock-replicator/src/nats/mod.rs
- magicblock-replicator/src/service/context.rs
- magicblock-replicator/src/service/mod.rs
- magicblock-replicator/src/service/primary.rs
- magicblock-replicator/src/service/standby.rs
- magicblock-replicator/src/tests.rs
- magicblock-replicator/src/watcher.rs
- test-integration/test-chainlink/src/ixtest_context.rs
- test-integration/test-chainlink/src/test_context.rs
- test-kit/src/lib.rs
💤 Files with no reviewable changes (9)
- .vscode/settings.json
- magicblock-chainlink/tests/utils/test_context.rs
- test-integration/test-chainlink/src/test_context.rs
- test-integration/test-chainlink/src/ixtest_context.rs
- magicblock-api/src/fund_account.rs
- magicblock-processor/tests/scheduling.rs
- magicblock-api/src/ledger.rs
- magicblock-chainlink/src/chainlink/fetch_cloner/tests.rs
- magicblock-api/src/genesis_utils.rs
Actionable comments posted: 2
♻️ Duplicate comments (1)
magicblock-replicator/src/nats/mod.rs (1)
42-42: ⚠️ Potential issue | 🟡 Minor
Use `GiB` in the size-limit comment for binary units.
Line 42 says `GB`, but the constants are binary (1024^3). Please update wording to `GiB` to avoid ops ambiguity.
GB, but the constants are binary (1024^3). Please update wording toGiBto avoid ops ambiguity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@magicblock-replicator/src/nats/mod.rs` at line 42, Update the inline comment that reads "Size limits (256 GB stream, 1 GB snapshots)" to use binary units by replacing "GB" with "GiB" so it reads "Size limits (256 GiB stream, 1 GiB snapshots)"; locate the comment near the NATS size limit constants (e.g., STREAM_SIZE_LIMIT and SNAPSHOT_SIZE_LIMIT) in the nats module and ensure the wording matches the binary 1024^3-based constants.
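For reference, the binary-unit arithmetic behind the GiB wording can be spelled out. The constant names below are assumptions mirroring the review's description, not the crate's actual definitions:

```rust
// Illustrative constants (names assumed): binary GiB, not decimal GB.
const GIB: u64 = 1024 * 1024 * 1024; // 1 GiB in bytes
const STREAM_BYTES: u64 = 256 * GIB; // "256 GiB stream" limit
const SNAPSHOT_BYTES: u64 = GIB;     // "1 GiB snapshots" limit

fn main() {
    assert_eq!(GIB, 1_073_741_824);
    assert_eq!(STREAM_BYTES, 274_877_906_944);
    assert_eq!(SNAPSHOT_BYTES, 1_073_741_824);
    // The decimal "GB" reading would be roughly 7% smaller, which is the
    // ambiguity the comment fix avoids.
    assert!(STREAM_BYTES > 256 * 1_000_000_000);
    println!("stream={STREAM_BYTES} snapshot={SNAPSHOT_BYTES}");
}
```

This is why the comment should say GiB: 256 × 1024³ bytes is about 275 decimal gigabytes.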
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@magicblock-chainlink/src/chainlink/mod.rs`:
- Around line 190-194: The trace log message is misleading: when should_remove
is true we remove non-delegated accounts except when owner == feature::ID, so
update the trace in mod.rs (the trace call that logs pubkey and account) to
reflect that predicate (reference variables: should_remove, pubkey, account,
owner and feature::ID) — change the message to indicate "Removing non-delegated
account not owned by feature::ID" (or equivalent wording) so it matches the
actual condition.
- Around line 25-30: Wrap the test/dev-only imports (ChainRpcClientMock,
ChainPubsubClientMock, ClonerStub) and the StubbedChainlink type alias and
stub() function with #[cfg(any(test, feature = "dev-context"))] (or move them
into a feature-gated module) so they are not referenced in release builds
without the dev-context feature; in addition, change stub() so it does not
return a full production Chainlink with disabled behavior—either return a
distinct stub type (e.g., StubbedChainlink) behind the cfg gate or make stub()
private to the gated module to avoid exposing a silent no-op API path.
---
Duplicate comments:
In `@magicblock-replicator/src/nats/mod.rs`:
- Line 42: Update the inline comment that reads "Size limits (256 GB stream, 1
GB snapshots)" to use binary units by replacing "GB" with "GiB" so it reads
"Size limits (256 GiB stream, 1 GiB snapshots)"; locate the comment near the
NATS size limit constants (e.g., STREAM_SIZE_LIMIT and SNAPSHOT_SIZE_LIMIT) in
the nats module and ensure the wording matches the binary 1024^3-based
constants.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e16e4bcc-f052-4b8b-b696-141f41753adb
📒 Files selected for processing (4)
- magicblock-aperture/src/state/mod.rs
- magicblock-chainlink/src/chainlink/mod.rs
- magicblock-replicator/src/nats/consumer.rs
- magicblock-replicator/src/nats/mod.rs
💤 Files with no reviewable changes (1)
- magicblock-replicator/src/nats/consumer.rs
```rust
        chain_pubsub_client::mock::ChainPubsubClientMock,
        chain_updates_client::ChainUpdatesClient, ChainPubsubClient,
        ChainRpcClient, ChainRpcClientImpl, Endpoints, RemoteAccountProvider,
    },
    submux::SubMuxClient,
    testing::{cloner_stub::ClonerStub, rpc_client_mock::ChainRpcClientMock},
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== StubbedChainlink alias and stub() ==="
rg -n -C3 '\bpub\s+type\s+StubbedChainlink\b|\bpub\s+fn\s+stub\s*\(' magicblock-chainlink/src/chainlink/mod.rs
echo
echo "=== Methods that become silent no-ops when fetch_cloner is None ==="
rg -n -C2 'let Some\(fetch_cloner\) = self\.fetch_cloner\(\) else' magicblock-chainlink/src/chainlink/mod.rs
echo
echo "=== Concrete mock/stub type definitions and nearby cfg/module gates ==="
fd -t f -i 'rpc_client_mock.rs|cloner_stub.rs|chain_pubsub_client.rs|mod.rs|lib.rs' magicblock-chainlink/src \
  -x rg -n -C3 '\bChainPubsubClientMock\b|\bChainRpcClientMock\b|\bClonerStub\b|#\[(cfg|cfg_attr)\([^]]*(test|feature)[^]]*\)\]|\b(pub\s+)?mod\s+testing\b|\bmod\s+testing\b|\b(pub\s+)?mod\s+mock\b|\bmod\s+mock\b' {}

Repository: magicblock-labs/magicblock-validator
Length of output: 19483
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Full context around imports and stub() with cfg guards ==="
sed -n '10,35p' magicblock-chainlink/src/chainlink/mod.rs
echo
echo "=== Full context around stub() method with cfg guards ==="
sed -n '500,520p' magicblock-chainlink/src/chainlink/mod.rs
echo
echo "=== Check if stub() or containing impl block has cfg guards ==="
rg -n -B5 'pub fn stub' magicblock-chainlink/src/chainlink/mod.rs | head -20
echo
echo "=== Check pub use statements from testing module ==="
rg -n 'pub use.*testing|pub use.*mock' magicblock-chainlink/src/chainlink/mod.rs

Repository: magicblock-labs/magicblock-validator
Length of output: 2216
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Check if the imports from testing module are conditional ==="
sed -n '1,50p' magicblock-chainlink/src/chainlink/mod.rs | head -60
echo
echo "=== Verify compilation: check Cargo.toml for dev-context feature ==="
rg -n 'dev-context|test' magicblock-chainlink/Cargo.toml | head -20
echo
echo "=== Check if there's a cfg guard around the StubbedChainlink type alias and stub() ==="
rg -n -B10 'pub type StubbedChainlink' magicblock-chainlink/src/chainlink/mod.rs | head -30
echo
echo "=== Check if testing module re-exports or defines these types conditionally ==="
rg -n 'mod testing|pub mod testing' magicblock-chainlink/src/chainlink/mod.rs
rg -n 'pub use.*testing|pub use.*cloner_stub|pub use.*rpc_client_mock' magicblock-chainlink/src/chainlink/mod.rs

Repository: magicblock-labs/magicblock-validator
Length of output: 2356
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Check if testing module exists and is gated ==="
rg -n '#\[cfg' magicblock-chainlink/src/testing/mod.rs | head -5
echo
echo "=== Verify: is the testing module or its parents conditionally compiled? ==="
rg -n -B2 'pub mod testing\|mod testing' magicblock-chainlink/src/lib.rs magicblock-chainlink/src/mod.rs 2>/dev/null || echo "lib.rs/mod.rs not found"
echo
echo "=== Check the actual structure in lib.rs ==="
head -50 magicblock-chainlink/src/lib.rs 2>/dev/null || echo "lib.rs not found"
echo
echo "=== Locate where testing module is declared ==="
fd -t f 'lib.rs|mod.rs' magicblock-chainlink/src -x rg -l 'pub mod testing|mod testing' {}
echo
echo "=== What does the chainlink/mod.rs file actually depend on? List first 200 lines to find module declarations ==="
head -100 magicblock-chainlink/src/chainlink/mod.rs | tail -50

Repository: magicblock-labs/magicblock-validator
Length of output: 882
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Critical: Verify the chainlink/mod.rs imports would fail if testing module is not compiled ==="
echo "The testing module is gated with #[cfg(any(test, feature = \"dev-context\"))]"
echo "But chainlink/mod.rs imports from it unconditionally at lines 25-30 and 30:"
rg -n 'testing::|chain_pubsub_client::mock::' magicblock-chainlink/src/chainlink/mod.rs
echo
echo "=== Check if these imports are wrapped in cfg gates ==="
sed -n '20,35p' magicblock-chainlink/src/chainlink/mod.rs
echo
echo "=== Verify: Can this code compile without the dev-context feature? ==="
echo "If the feature is optional (not always-on), this would be a compilation blocker."
echo "Checking Cargo.toml: is dev-context feature required or optional?"
grep -A5 '\[features\]' magicblock-chainlink/Cargo.toml || echo "No features section"
echo
echo "=== Check default features ==="
grep 'default.*=' magicblock-chainlink/Cargo.tomlRepository: magicblock-labs/magicblock-validator
Length of output: 1392
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== CRITICAL: Check if the entire chainlink/mod.rs or its impl is cfg-gated ==="
head -5 magicblock-chainlink/src/chainlink/mod.rs
echo
echo "=== Check if chainlink module itself is declared with cfg ==="
rg -n 'pub mod chainlink' magicblock-chainlink/src/lib.rs
echo
echo "=== Attempt to understand the problem: does the crate fail to compile without dev-context? ==="
echo "Testing imports that reference testing module:"
sed -n '20,32p' magicblock-chainlink/src/chainlink/mod.rs
echo
echo "=== Verify the actual dependency graph: can chainlink/mod.rs be imported without testing? ==="
echo "Lines 25-30 reference:"
echo " - chain_pubsub_client::mock::ChainPubsubClientMock"
echo " - testing::{cloner_stub::ClonerStub, rpc_client_mock::ChainRpcClientMock}"
echo
echo "These are used in:"
rg -n 'StubbedChainlink|ChainRpcClientMock|ChainPubsubClientMock|ClonerStub' magicblock-chainlink/src/chainlink/mod.rs | head -10

Repository: magicblock-labs/magicblock-validator
Length of output: 1589
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== FINAL VERIFICATION: Is the entire chainlink/mod.rs file or its imports cfg-guarded? ==="
echo "Checking if the use block containing testing imports is cfg-guarded..."
sed -n '1,35p' magicblock-chainlink/src/chainlink/mod.rs | grep -A15 '^use crate'
echo
echo "=== Does the pub type StubbedChainlink and pub fn stub have cfg guards? ==="
sed -n '38,55p' magicblock-chainlink/src/chainlink/mod.rs
echo
echo "=== Check if chainlink module itself is conditionally compiled ==="
grep -B2 'pub mod chainlink' magicblock-chainlink/src/lib.rs
echo
echo "=== Simulate release build: would these imports fail? ==="
echo "The testing module is #[cfg(any(test, feature = \"dev-context\"))]"
echo "dev-context is NOT in default features"
echo "Therefore: release builds cannot import from testing module"
echo
echo "Impact analysis:"
echo "1. chainlink/mod.rs imports testing unconditionally"
echo "2. testing module is feature-gated"
echo "3. feature is not default"
echo "4. Result: COMPILATION ERROR in release build"
echo
echo "=== Can we build without dev-context feature? ==="
grep -E 'features|dev-context' magicblock-chainlink/Cargo.toml

Repository: magicblock-labs/magicblock-validator
Length of output: 2217
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Check workspace Cargo.toml for default features ==="
find . -name "Cargo.toml" -type f -exec grep -l "workspace.*true" {} \; | head -1 | xargs head -30
echo
echo "=== Check if magicblock-chainlink is used elsewhere without dev-context feature ==="
rg -l 'magicblock-chainlink' --type toml | xargs grep -A2 'magicblock-chainlink' | head -30
echo
echo "=== Verify the compilation issue: these types MUST be available at compile time for chainlink/mod.rs ==="
echo "If not, production builds would fail. Let's see if there's a workaround or if this is truly broken:"
grep -n 'pub mod testing' magicblock-chainlink/src/lib.rs
grep -n 'testing::\|rpc_client_mock\|cloner_stub' magicblock-chainlink/src/chainlink/mod.rs | wc -l

Repository: magicblock-labs/magicblock-validator
Length of output: 4392
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== CRITICAL: Check if magicblock-account-cloner builds successfully ==="
echo "magicblock-account-cloner depends on magicblock-chainlink WITHOUT dev-context feature"
echo "But chainlink/mod.rs imports from testing module which requires dev-context feature"
echo
echo "This would be a compilation failure. Let's verify if there's a workaround:"
echo
echo "=== Check: Is chainlink module itself conditionally compiled based on a feature? ==="
sed -n '1,20p' magicblock-chainlink/src/lib.rs
echo
echo "=== Final check: Can the imports actually be resolved without dev-context? ==="
echo "Listing all use statements that reference testing/mock modules:"
rg -n 'use.*testing|use.*mock' magicblock-chainlink/src/chainlink/mod.rs
echo
echo "=== If this is truly unconditional, it's a hard blocker. Let's trace dependencies ==="
echo "Does magicblock-account-cloner actually depend on chainlink without dev-context?"
grep -A5 'magicblock-chainlink' magicblock-account-cloner/Cargo.toml || echo "Not found or no features specified"

Repository: magicblock-labs/magicblock-validator
Length of output: 925
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== The rg search for use statements might have missed them. Let's check directly ==="
sed -n '20,32p' magicblock-chainlink/src/chainlink/mod.rs
echo
echo "=== These are the problematic imports. Now verify: does this file compile without dev-context? ==="
echo "The StubbedChainlink type alias uses these concrete types:"
echo " - ChainRpcClientMock"
echo " - ChainPubsubClientMock"
echo " - ClonerStub"
echo
echo "These are defined in testing/rpc_client_mock.rs and testing/cloner_stub.rs"
echo "Both gated with #[cfg(any(test, feature = \"dev-context\"))]"
echo
echo "=== Verify: are these imports conditionally compiled? ==="
sed -n '1,32p' magicblock-chainlink/src/chainlink/mod.rs | head -32
echo
echo "=== If this truly fails to compile without dev-context, the review comment is CORRECT ==="
echo "Let's check one more thing: are the StubbedChainlink alias and stub() method conditional?"
rg -n '#\[cfg' magicblock-chainlink/src/chainlink/mod.rs | head -5

Repository: magicblock-labs/magicblock-validator
Length of output: 2379
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== COMPILE TEST: Would this actually fail? ==="
echo "To definitively verify, let's check if there are any #[cfg] guards we missed:"
grep -n 'cfg' magicblock-chainlink/src/chainlink/mod.rs | head -20
echo
echo "=== Also check the remote_account_provider mod for chain_pubsub_client::mock ==="
grep -B5 'chain_pubsub_client::mock::ChainPubsubClientMock' magicblock-chainlink/src/remote_account_provider/mod.rs | head -10
echo
echo "=== Final determination: Are the imports truly unconditional? ==="
echo "Looking at the full use block with context:"
sed -n '18,33p' magicblock-chainlink/src/chainlink/mod.rs

Repository: magicblock-labs/magicblock-validator
Length of output: 205
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Direct file read to confirm: are there ANY cfg guards in chainlink/mod.rs at all? ==="
cat magicblock-chainlink/src/chainlink/mod.rs | grep -i '#\[cfg' || echo "NO CFG GUARDS FOUND IN FILE"
echo
echo "=== Summary of findings ==="
echo "1. testing module in lib.rs: #[cfg(any(test, feature = \"dev-context\"))]"
echo "2. chainlink/mod.rs imports unconditionally from testing module"
echo "3. chainlink/mod.rs defines StubbedChainlink with gated concrete types"
echo "4. dev-context is optional, not default"
echo "5. magicblock-account-cloner depends on chainlink WITHOUT dev-context"
echo
echo "RESULT: This is a COMPILATION BLOCKER for release builds"
echo
echo "=== Verify mock module in remote_account_provider ==="
grep -B2 'pub mod mock' magicblock-chainlink/src/remote_account_provider/mod.rs || echo "Not found"
grep -B2 '#\[cfg.*test\].*pub mod mock' magicblock-chainlink/src/remote_account_provider/mod.rs || echo "Not found with cfg guard"

Repository: magicblock-labs/magicblock-validator
Length of output: 691
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== VERIFY: Check if chain_pubsub_client::mock is also cfg-gated ==="
grep -n 'pub mod mock' magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs | head -5
echo
echo "=== Show the context around that module declaration ==="
sed -n '20,35p' magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs
echo
echo "=== FINAL CONFIRMATION: The StubbedChainlink type at lines 42-43 is NOT wrapped in cfg ==="
sed -n '40,45p' magicblock-chainlink/src/chainlink/mod.rs
echo
echo "=== Does this mean the crate will fail to compile without dev-context? YES ==="
echo "chainlink/mod.rs unconditionally imports and uses:"
echo " 1. ChainRpcClientMock from testing::rpc_client_mock (cfg gated)"
echo " 2. ClonerStub from testing::cloner_stub (cfg gated)"
echo " 3. ChainPubsubClientMock from remote_account_provider::chain_pubsub_client::mock (cfg gated)"
echo
echo "And defines public type alias and method without cfg guards."
echo "This is a compilation blocker."Repository: magicblock-labs/magicblock-validator
Length of output: 1354
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== CHECK: Line 286 context in chain_pubsub_client.rs - is it cfg-gated? ==="
sed -n '280,295p' magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs
echo
echo "=== The key issue: are the mock/testing modules wrapped in cfg? ==="
echo "Check testing module:"
head -15 magicblock-chainlink/src/testing/mod.rs | grep -i cfg
echo
echo "Check remote_account_provider mock:"
sed -n '283,290p' magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs
echo
echo "=== IF chain_pubsub_client::mock is NOT cfg-gated, then it's always available ==="
echo "Let's verify once more if there's a cfg guard around that module:"
rg -B3 'pub mod mock' magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs

Repository: magicblock-labs/magicblock-validator
Length of output: 1461
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Confirm: How is the crate actually used in practice? ==="
echo "Does any production code depend on chainlink without dev-context?"
grep -r 'magicblock-chainlink' */Cargo.toml | grep -v 'features.*dev-context' | grep -v Cargo.lock | head -5

Repository: magicblock-labs/magicblock-validator
Length of output: 569
This code will not compile in release builds without the dev-context feature enabled.
Lines 25–30 unconditionally import ChainRpcClientMock, ChainPubsubClientMock, and ClonerStub from feature-gated modules (#[cfg(any(test, feature = "dev-context"))]). The StubbedChainlink type alias (lines 42–43) and stub() method (lines 509–517) have no conditional guards and reference these concrete types directly. Since dev-context is not in the default features and production crates (magicblock-account-cloner, magicblock-accounts, magicblock-aperture, magicblock-api) depend on chainlink without explicitly enabling this feature, the build will fail.
The imports and type alias must be wrapped with #[cfg(any(test, feature = "dev-context"))]. Alternatively, move StubbedChainlink and stub() into a separate feature-gated module or use conditional compilation throughout.
Additionally, the stub() method returns a full Chainlink with disabled behavior (fetch_cloner = None), creating a silent no-op API footgun where sync paths on the same public type silently do nothing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@magicblock-chainlink/src/chainlink/mod.rs` around lines 25 - 30, Wrap the
test/dev-only imports (ChainRpcClientMock, ChainPubsubClientMock, ClonerStub)
and the StubbedChainlink type alias and stub() function with #[cfg(any(test,
feature = "dev-context"))] (or move them into a feature-gated module) so they
are not referenced in release builds without the dev-context feature; in
addition, change stub() so it does not return a full production Chainlink with
disabled behavior—either return a distinct stub type (e.g., StubbedChainlink)
behind the cfg gate or make stub() private to the gated module to avoid exposing
a silent no-op API path.
 if should_remove {
     trace!(
         pubkey = %pubkey,
-        account = %account_fmt,
+        account=%format!("{account:#?}"),
         "Removing non-delegated, non-DLP-owned account"
Update the trace message to match the predicate.
Line 188 now removes non-delegated accounts unless owner == feature::ID, so "non-DLP-owned" is misleading here.
🪵 Suggested diff

-        "Removing non-delegated, non-DLP-owned account"
+        "Removing non-delegated account during bank reset"

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
 if should_remove {
     trace!(
         pubkey = %pubkey,
         account=%format!("{account:#?}"),
-        "Removing non-delegated, non-DLP-owned account"
+        "Removing non-delegated account during bank reset"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@magicblock-chainlink/src/chainlink/mod.rs` around lines 190 - 194, The trace
log message is misleading: when should_remove is true we remove non-delegated
accounts except when owner == feature::ID, so update the trace in mod.rs (the
trace call that logs pubkey and account) to reflect that predicate (reference
variables: should_remove, pubkey, account, owner and feature::ID) — change the
message to indicate "Removing non-delegated account not owned by feature::ID"
(or equivalent wording) so it matches the actual condition.
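A hypothetical reconstruction of the predicate this comment describes may make the mismatch concrete: on bank reset, a non-delegated account is removed unless it is owned by the feature program. `Pubkey`, `FEATURE_ID`, and the `should_remove` signature are illustrative stand-ins, not the crate's actual code:

```rust
// Toy model of the removal predicate described in the review comment.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct Pubkey([u8; 4]);

// Stand-in for solana's `feature::ID` program id.
const FEATURE_ID: Pubkey = Pubkey(*b"feat");

fn should_remove(delegated: bool, owner: Pubkey) -> bool {
    !delegated && owner != FEATURE_ID
}

fn main() {
    // Delegated accounts always survive the reset.
    assert!(!should_remove(true, Pubkey(*b"misc")));
    // Non-delegated but feature-owned accounts also survive...
    assert!(!should_remove(false, FEATURE_ID));
    // ...while every other non-delegated account is removed.
    assert!(should_remove(false, Pubkey(*b"misc")));
    println!("predicate ok");
}
```

Under this reading, ownership (DLP or otherwise) is no longer part of the condition, which is why the log line "non-DLP-owned" no longer matches what the code does.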

Summary
Fixed various replication state discrepancies caused by accountsdb modifications.
Summary by CodeRabbit
New Features
Bug Fixes
Chores
closes: #955