
Conversation


@mralj mralj commented Mar 3, 2025

📝 Summary

rbuilder can now work with any node via IPC.
This means that any node can provide the state for rbuilder, while revm is still used as the EVM.
The node is required to expose the following RPC calls:

Calls not available in the ETH JSON RPC spec:

  1. rbuilder_calculateStateRoot for state root calculation
  2. rbuilder_getCodeByHash given a bytecode hash, returns the bytecode

Calls optimised for rbuilder, but which have a counterpart in the ETH JSON RPC spec:

  1. rbuilder_getBlockHash gets the block hash for a given block number (similar to eth_getBlockByNumber, but returns only the block hash)
  2. rbuilder_getAccount gets account info (similar to eth_getProof, but without the proof data)

To use rbuilder with a node via IPC, config.toml must contain the following (example):

[ipc_provider]
ipc_path = "/root/execution-data/nethermind.ipc"
mempool_server_url = "ws://localhost:8546"
request_timeout_ms = 75

Implementation details

IPC

This implementation was initially intended to introduce a remote state provider. By remote, I mean that the idea was that state could be provided via HTTP/WS/IPC. Unfortunately, due to implementation issues and constraints, I decided to implement state provisioning via IPC only.
I don't think this has any practical downside, especially since the state provider must be fast. There is a non-trivial number of state-read calls (~300/s), so using this over the network while keeping near-disk-read latency would be unrealistic.

Code-wise, the issues and constraints above stem mainly from the fact that the traits for reading state are sync. Initially, I relied on tokio and alloy.rs to fetch this remote state, but that implementation had many issues.
Firstly, each call to fetch any data (e.g. fetching an account or bytecode) had to be wrapped in a function call like this:

    /// Runs a future in a sync context.
    // StateProvider(Factory) traits require a sync context, but calls to the remote provider are async.
    // What's more, rbuilder itself runs in an async context, so we have the situation
    // async -> sync -> async.
    // This helper function allows execution in such an environment.
    fn run<F, R>(&self, f: F) -> R
    where
        F: Future<Output = R>,
    {
        tokio::task::block_in_place(|| self.runtime_handle.block_on(f))
    }

This adds extra overhead on the Tokio runtime and doesn't play well with some parts of the codebase, specifically mutex locking. Without going too deep into the details in this PR description: we would end up in scenarios where the Tokio runtime's entire I/O was blocked, or (in some scenarios) parking_lot mutexes would deadlock.

The candidate solutions (a monitoring thread + async mutexes) seemed hacky and suboptimal.
This is why, in the end, I reached for a sync (but concurrent) IPC solution, which I implemented from scratch here. It's called REIPC (from request/response IPC).
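The core pattern of a sync-but-concurrent request/response client can be sketched roughly as follows (a minimal stdlib illustration, not the actual REIPC code; all names are hypothetical). Callers block on their own channel while a single reader thread routes responses back by request id:

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical sketch of a sync-but-concurrent RPC client.
struct SyncRpcClient {
    next_id: Mutex<u64>,
    // Map of in-flight request ids to the channel of the waiting caller.
    pending: Arc<Mutex<HashMap<u64, Sender<String>>>>,
    // Stand-in for the write half of the IPC socket.
    transport_tx: Sender<(u64, String)>,
}

impl SyncRpcClient {
    /// `resp_rx` stands in for the read half of the IPC socket.
    fn new(transport_tx: Sender<(u64, String)>, resp_rx: Receiver<(u64, String)>) -> Arc<Self> {
        let client = Arc::new(SyncRpcClient {
            next_id: Mutex::new(0),
            pending: Arc::new(Mutex::new(HashMap::new())),
            transport_tx,
        });
        // Single reader thread: demultiplex responses to the waiting callers.
        let pending = Arc::clone(&client.pending);
        thread::spawn(move || {
            for (id, resp) in resp_rx {
                if let Some(tx) = pending.lock().unwrap().remove(&id) {
                    let _ = tx.send(resp);
                }
            }
        });
        client
    }

    /// Blocking call: register interest under a fresh id, write the request,
    /// then wait for the reader thread to deliver the matching response.
    fn call(&self, request: &str) -> String {
        let id = {
            let mut n = self.next_id.lock().unwrap();
            *n += 1;
            *n
        };
        let (tx, rx) = mpsc::channel();
        self.pending.lock().unwrap().insert(id, tx);
        self.transport_tx.send((id, request.to_string())).unwrap();
        rx.recv().unwrap()
    }
}
```

The real REIPC of course talks to an actual IPC socket and handles JSON-RPC framing and error payloads; this sketch only shows the demultiplexing idea that makes blocking calls safe from many threads at once.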

This solution was tested against a Nethermind node. While Nethermind will make some IPC improvements to reduce latency and increase throughput, here are the initial request/response latencies:

[Screenshot: initial request/response latency measurements (CleanShot 2025-03-01)]

Dashmap & QuickCache in IPC provider

We need caches because otherwise the number of IPC calls would be very high (thousands per second).
I also reached for concurrent caches because both StateProviderFactory and StateProvider need to be Send + Sync.

QuickCache is used so that I don't have to implement concurrent cache invalidation by hand :)
Here is some info on QuickCache vs Moka (another popular caching crate); TL;DR: for our simple use case, QuickCache seems a better fit.
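To illustrate why the caches must be concurrent (Send + Sync shared across threads), here is a minimal stdlib stand-in; the real code uses quick_cache and DashMap, and the names below are hypothetical:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Minimal stand-in for a concurrent cache keyed, say, by a hash.
/// Arc + Mutex make it Send + Sync, so many threads can share it.
#[derive(Clone, Default)]
struct SharedCache {
    inner: Arc<Mutex<HashMap<u64, Vec<u8>>>>,
}

impl SharedCache {
    /// Return the cached value, or compute and store it on a miss.
    /// Holding the lock across `fetch` means each key is fetched once,
    /// saving repeated round trips over IPC.
    fn get_or_insert_with<F: FnOnce() -> Vec<u8>>(&self, key: u64, fetch: F) -> Vec<u8> {
        let mut map = self.inner.lock().unwrap();
        map.entry(key).or_insert_with(fetch).clone()
    }
}
```

Crates like quick_cache add what this sketch lacks: bounded capacity with eviction, and finer-grained locking so readers don't serialize on a single mutex.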

On enum StateProviderFactories

The reason cli.rs passes the StateProviderFactories enum to config.new_builder is that I wanted rbuilder's state provider to be chosen via config at runtime.
This is why, AFAIK, static dispatch is not an option. So I was left with the choice of refactoring to dynamic dispatch (akin to StateProviderBox) OR the enum solution.
I chose the enum solution for the following reasons:

  1. It seemed to me that the code diff would be smaller (a smaller change to implement)
  2. AFAIK it's faster. Enum matching compiles to a JMP vs a CALL (in the case of dynamic dispatch), which the compiler can optimize better (especially in this scenario), and given branch prediction I'd guess it'll be almost free (plus no vtable/pointer loading).
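As a rough illustration of the trade-off (hypothetical names, not the actual rbuilder types), the enum alternative to a Box&lt;dyn Trait&gt; looks like this:

```rust
/// Hypothetical sketch: one trait, two backends.
trait StateProvider {
    fn block_hash(&self, number: u64) -> u64;
}

struct RethProvider;
struct IpcProvider;

impl StateProvider for RethProvider {
    fn block_hash(&self, number: u64) -> u64 {
        number // placeholder logic for illustration
    }
}

impl StateProvider for IpcProvider {
    fn block_hash(&self, number: u64) -> u64 {
        number.wrapping_mul(2) // placeholder logic for illustration
    }
}

/// The enum alternative to `Box<dyn StateProvider>`: the variant is chosen
/// at runtime (e.g. from config), but every call site is a static match,
/// so calls compile to direct jumps rather than vtable lookups.
enum StateProviders {
    Reth(RethProvider),
    Ipc(IpcProvider),
}

impl StateProviders {
    fn block_hash(&self, number: u64) -> u64 {
        match self {
            StateProviders::Reth(p) => p.block_hash(number),
            StateProviders::Ipc(p) => p.block_hash(number),
        }
    }
}
```

The cost of the enum approach is boilerplate: every trait method must be forwarded by hand (or by a macro) for each variant, which is what the dyn-dispatch refactor would avoid.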

On MempoolSource enum

The reason I chose WebSockets for streaming transactions when rbuilder uses the IPC state provider is that REIPC currently doesn't support ETH subscriptions/streams.
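A minimal sketch of the idea (hypothetical names, not the actual rbuilder enum): the mempool source is configured separately from the state provider, so transaction streaming can use a subscription-capable transport even when state comes over IPC:

```rust
/// Hypothetical sketch of a mempool-source selector.
enum MempoolSource {
    /// Subscribe over WebSocket (REIPC has no subscription support).
    WebSocket(String),
    /// Use the node's default transport.
    Default,
}

impl MempoolSource {
    /// URL to connect to, if an explicit one was configured.
    fn url(&self) -> Option<&str> {
        match self {
            MempoolSource::WebSocket(url) => Some(url),
            MempoolSource::Default => None,
        }
    }
}
```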

✅ I have completed the following steps:

  • Run make lint
  • Run make test
  • Added tests (if applicable)

@mralj mralj marked this pull request as ready for review March 3, 2025 15:12
@mralj mralj self-assigned this Mar 3, 2025
@mralj mralj requested review from SozinM and julio4 March 3, 2025 15:13
let provider = ProviderBuilder::new().on_ipc(ipc).await?;
let mempool = config
.mempool_source
.ok_or_else(|| eyre::eyre!("No TX source configured"))?;

No txpool source configured


And if we fail on empty mempool - let's fail on config parsing stage if possible?

Collaborator Author

And if we fail on empty mempool - let's fail on config parsing stage if possible?

I have followed the logic of the already existing code. My understanding is that failing during config parsing would be undesirable because rbuilder can still receive orders (transactions) via the bundle API.

}

let key: U256 = storage_key.into();
let storage = self

Could it return None?


I see this in REIPC:

            ResponsePayload::Success(_) => {
                match resp.try_success_as::<T>() {
                    Some(Ok(r)) => Ok(r),
                    Some(Err(e)) => Err(e.into()),
                    // the response was received successfully, but it contains Err payload
                    // we shouldn't have  ended up here
                    None => Err(RpcError::JsonErrPayloadMisinterpretedAsSuccess),
                }
            }


So i assume you return an error on empty response

Collaborator Author

We discussed this during call, but I'll also leave the comment here.

I'm not handling the None case in all of the calls you pointed out because I wrote the API on the Nethermind client so that it will either succeed and return some value, or return an error.
Meaning there isn't any scenario in which the response from the server would be something like Ok(None).
But I agree with your comments; this was bad design, which has now been fixed :)


Regarding the comment about REIPC: the error is not returned on an empty response.
The None branch in the code above is hit when the response was marked as Success but deserialises as an Error. I'm not sure in which scenario this happens; I don't think it should ever happen.

@@ -51,3 +58,75 @@ pub trait RootHasher: std::fmt::Debug + Send + Sync {
/// State root for changes outcome on top of parent block.
fn state_root(&self, outcome: &ExecutionOutcome) -> Result<B256, RootHashError>;
}

/// All supported state provider factories

I think we could open an issue to use Box so we could simplify it later

Collaborator Author

Not sure I follow; how would Boxing help/simplify this one?

@mralj mralj requested a review from SozinM March 6, 2025 14:32
@mralj mralj force-pushed the mralj/ipc-state-provider-reipc branch from f3e9863 to d98c019 Compare March 11, 2025 10:30
Comment on lines +11 to +12
use dashmap::DashMap;
use quick_cache::sync::Cache;
Member

@julio4 julio4 Mar 11, 2025

Seems like both dashmap and quick_cache are used to handle caches; is there a reason to use different ones?
Edit: I see it's for concurrent-cache invalidation, which is only needed for state_provider_by_hash?

Collaborator Author

Correct, and it's only needed for that one, since it's the only cache which lives as long as StateProviderFactory.

Comment on lines 134 to 137
let span =
trace_span!("header", id = rand::random::<u64>(), block_hash = %block_hash.to_string());
let _guard = span.enter();
trace!("header: get");
Member

Consider using tracing instrument attribute feature to reduce boilerplate
See https://docs.rs/tracing/latest/tracing/attr.instrument.html

Collaborator Author

I've encapsulated this via rpc_call :)

Comment on lines 139 to 146
let header = self
.ipc_provider
.call::<_, Option<<alloy_network::Ethereum as alloy_network::Network>::BlockResponse>>(
"eth_getBlockByHash",
(block_hash, false),
)
.map_err(ipc_to_provider_error)?
.map(|b| b.header.inner);
Member

RPC calls seem to be generic over parameters and return type, with the same error-handling mechanism.
Consider extracting this into a function, so it's possible to call let res: U = self.rpc_call("eth_method", params)?; with a small wrapper like:

fn rpc_call<T, U>(&self, method: &str, params: T) -> ProviderResult<Option<U>> {
    self.ipc_provider
        .call::<T, Option<U>>(method, params)
        .map_err(ipc_to_provider_error)
}

Member

It could also encapsulate the tracing/span directly

Collaborator Author

Great catch, done!

@mralj
Collaborator Author

mralj commented Mar 11, 2025

closing in favour of flashbots#489

@mralj mralj closed this Mar 11, 2025