Implement PeerNetworkInterface #339
base: main
Conversation
I'd like to run this on my computer, if that's possible. Can you give some detailed instructions in the PR for how to do that, please?
Sorry for the delay, I missed this yesterday. I just attached setup instructions.
Thanks. And how will I know that it's working? That it prints certain log messages? You mention killing and reviving the other nodes. Can you briefly describe that sequence and what I should expect?
Yep! The main method I used to test failover was starting from the origin (by setting `sync-point = "origin"` in omnibus-local.toml), and then using `docker stop` and `docker start` on the three cardano nodes while it synced.
There are other points it can start from too:
I ran this before I started any cardano nodes. It probably shouldn't panic, but rather print something and exit nicely. |
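A hedged sketch of the graceful failure the comment asks for, in Rust: instead of panicking when no cardano node is reachable, log the problem and exit cleanly. The names here (`connect_to_peers`) are hypothetical, not the module's actual API.

```rust
// Illustrative only: log-and-exit instead of panic when no upstream
// peer is reachable. `connect_to_peers` is a hypothetical stand-in.

fn connect_to_peers(peers: &[&str]) -> Result<(), String> {
    if peers.is_empty() {
        return Err("no upstream peers configured".to_string());
    }
    // Real code would attempt TCP connections here; this sketch just
    // simulates the "no cardano nodes running" failure mode.
    Err(format!("could not reach any of {} peer(s)", peers.len()))
}

fn main() {
    // Simulate the reviewer's scenario: the nodes were never started.
    if let Err(e) = connect_to_peers(&["127.0.0.1:3001"]) {
        eprintln!("peer-network-interface: {e}; exiting cleanly instead of panicking");
    }
}
```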
Sorry, but I'm stuck on setting up the environment for this test. I got as far as trying to start up the cardano containers. Your gist hints at nice things, but doesn't explain how to use them. I realize this is sort of common knowledge for most cardano developers, so perhaps point to a document somewhere. Or make a nice gist that explains it all and then just reference that in the future for testing.
Okay, I figured out that I have to run restore.sh before startup.sh to create the configuration directories. But the db step fails because there is no aarch64 build for mithril. It seems to start up anyway, but I imagine the mithril chain fetch won't work.
I was able to start the cardano images, but when I run the omnibus process, I still get the same error. So I need some help knowing what creates the
Pull Request Overview
This PR implements a new peer-network-interface module that provides a more robust alternative to the existing upstream chain fetcher. The module uses the ChainSync and BlockFetch protocols to fetch blocks from multiple configured upstream peers, following one preferred chain while supporting graceful failover to other peers during network issues.
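The preferred-chain rule described above, follow the first configured peer and fail over to the next connected one, can be sketched as follows. This is an illustrative Rust fragment under assumed names (`PeerStatus`, `preferred_peer`), not the PR's actual API.

```rust
// Hypothetical sketch of the failover rule: follow the earliest-listed
// peer that is still connected; if it drops, fall back to the next one.

#[derive(Clone, Copy, PartialEq)]
enum PeerStatus {
    Connected,
    Disconnected,
}

/// Pick the peer whose chain to follow: the first connected one.
fn preferred_peer(statuses: &[PeerStatus]) -> Option<usize> {
    statuses.iter().position(|s| *s == PeerStatus::Connected)
}

fn main() {
    use PeerStatus::*;
    // All three peers up: follow peer 0.
    assert_eq!(preferred_peer(&[Connected, Connected, Connected]), Some(0));
    // Peer 0 killed: fail over to peer 1.
    assert_eq!(preferred_peer(&[Disconnected, Connected, Connected]), Some(1));
    // Everything down: no chain to follow until a reconnect succeeds.
    assert_eq!(preferred_peer(&[Disconnected, Disconnected, Disconnected]), None);
}
```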
Key changes:
- Introduces the `PeerNetworkInterface` module with an event-driven architecture supporting multiple upstream peers
- Refactors `UpstreamCache` into `common` for reuse across both upstream chain fetcher implementations
- Adds support for the preview network in the genesis bootstrapper
Reviewed Changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| modules/peer_network_interface/src/peer_network_interface.rs | Main module implementation handling initialization, cache management, and block publishing |
| modules/peer_network_interface/src/network.rs | NetworkManager coordinating multiple peer connections and chain state |
| modules/peer_network_interface/src/chain_state.rs | ChainState tracking block announcements, rollbacks, and publishing queue across multiple peers |
| modules/peer_network_interface/src/connection.rs | PeerConnection managing individual peer connections using ChainSync and BlockFetch protocols |
| modules/peer_network_interface/src/configuration.rs | Configuration loading and sync point options |
| modules/peer_network_interface/config.default.toml | Default configuration with mainnet backbone nodes |
| modules/peer_network_interface/Cargo.toml | Package definition and dependencies |
| modules/peer_network_interface/README.md | Module documentation and usage guide |
| modules/peer_network_interface/NOTES.md | Architecture diagram and design notes |
| common/src/upstream_cache.rs | Refactored cache implementation moved from upstream_chain_fetcher for reuse |
| common/src/lib.rs | Export upstream_cache module |
| common/src/genesis_values.rs | Added kebab-case serde attribute for configuration deserialization |
| modules/upstream_chain_fetcher/src/upstream_chain_fetcher.rs | Updated to use refactored UpstreamCache from common |
| modules/upstream_chain_fetcher/src/body_fetcher.rs | Updated imports for UpstreamCache |
| modules/genesis_bootstrapper/src/genesis_bootstrapper.rs | Added preview network genesis support |
| modules/genesis_bootstrapper/build.rs | Download preview network genesis files |
| processes/omnibus/src/main.rs | Register PeerNetworkInterface module |
| processes/omnibus/Cargo.toml | Add peer_network_interface dependency |
| processes/omnibus/omnibus-local.toml | Local configuration for testing with preview network |
| processes/omnibus/.gitignore | Ignore upstream-cache directory |
| Cargo.toml | Add peer_network_interface to workspace members |
| Cargo.lock | Lock file updates for new module |
It looked like the
Fixes #204
Implements a `peer-network-interface` module. This module runs the ChainSync and BlockFetch protocols against a small set of explicitly-configured peers. It follows the fork defined by the first peer in the list, but will switch to other forks if that peer disconnects.

Testing strategy was a combination of unit tests of the `ChainState` struct, and manual testing against three preview nodes on my laptop which I randomly killed and revived.

Includes a small architecture diagram: https://github.com/input-output-hk/acropolis/blob/sg/peer-network-interface/modules/peer_network_interface/NOTES.md
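The `ChainState` unit-testing approach might look roughly like the sketch below: track block announcements and, on a rollback, discard everything after the rollback point, as ChainSync requires. The struct and methods here are illustrative stand-ins, not the PR's actual `ChainState` API.

```rust
// Illustrative stand-in for ChainState: records (slot, block id)
// announcements and trims them on rollback.

#[derive(Default)]
struct ChainState {
    // (slot, block id) pairs in announcement order.
    blocks: Vec<(u64, String)>,
}

impl ChainState {
    fn announce(&mut self, slot: u64, id: &str) {
        self.blocks.push((slot, id.to_string()));
    }

    /// Roll back to `slot`: drop every block announced after it.
    fn rollback(&mut self, slot: u64) {
        self.blocks.retain(|(s, _)| *s <= slot);
    }

    fn tip(&self) -> Option<&(u64, String)> {
        self.blocks.last()
    }
}

fn main() {
    let mut state = ChainState::default();
    state.announce(10, "a");
    state.announce(20, "b");
    state.announce(30, "c");
    // The followed peer switched forks: blocks after slot 20 are gone.
    state.rollback(20);
    assert_eq!(state.tip().map(|(s, _)| *s), Some(20));
}
```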
Manual testing
To test it, you can run the omnibus process using the "local" configuration:
```shell
cd processes/omnibus
cargo run -- --config omnibus-local.toml
```

That configuration tries connecting to three Cardano nodes running against the `preview` environment, on ports 3001, 3002, and 3003. To create such a setup, you can use this gist: https://gist.github.com/SupernaviX/16627499dae71092abeac96434e96817

Hoisted comments...
The main method I used to test failover was starting from the origin (by setting `sync-point = "origin"` in omnibus-local.toml), and then using `docker stop` and `docker start` to stop and start the three cardano nodes while it synced. The module emits a log line for every 1000 messages it produces, which happens pretty rapidly when syncing from origin. It also emits log lines when a node is disconnected (and it retries the connection every 5 seconds). So what you see in logs with this setup is
There are other points it can start from too:
- #341)
- from the tip (pretty reliable, but blocks only come every ~20 seconds, so the logs don't show many signs of life)
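For reference, the `sync-point` setting discussed above might look like this in omnibus-local.toml. The table name below is an assumption; only the `sync-point` key and its `"origin"` value appear in the comments above.

```toml
# Illustrative fragment of omnibus-local.toml; the [peer-network-interface]
# table name is an assumption -- only sync-point appears in the discussion.
[peer-network-interface]
sync-point = "origin"   # sync the whole chain from genesis
# sync-point = "tip"    # start at the current tip instead
```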