-
Notifications
You must be signed in to change notification settings - Fork 236
feat!: concrete errors #3161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat!: concrete errors #3161
Conversation
@@ -37,6 +37,48 @@ pub(crate) mod streams; | |||
#[cfg(not(wasm_browser))] | |||
mod util; | |||
|
|||
/// Client related errors | |||
#[allow(missing_docs)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is much value in documenting these variants currently, which is why I am skipping them
Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3161/docs/iroh/ Last updated: 2025-05-21T16:11:43Z |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think fundamentally I'm not happy with these giant enum errors. I think at every point an error can be returned, each error variant of the type must be possible. The approach here just makes every source error a variant without any concern on whether this really helps anything.
96176a3
to
81f24ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having a little bit of trouble evaluating this PR. Not sure how to decide what kinds of errors are good / how to best group them etc.
I liked the change towards .expect
in some places, where it'd really just be a bug in our code if there were an error.
Anyhow - given we want to move forward with this, we should get this into main. I can see some git conflicts on the horizon for two of my WIP branches 🙃
I think this is something we should explore, there is no fixed rule that works for every project I have found so far. |
74d55e7
to
a4437f9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am struggling to figure out how to be constructive on this. I still don't particularly like this error style. I think it does not think enough about the human consumer. The error variants inside single enums are often of widely different "types".
OTOH I don't feel like particularly holding this up. Though I am a little worried if it becomes the model we use everywhere. If I were to properly handle this I'd create a counter-proposal as PR... but that takes time. Which I guess is part of the question, my approach would be a much more time-intensive I expect as it really asks for a deep understanding of each API.
Anyway, eh, feel free to merge?
@@ -9,8 +9,8 @@ use std::{ | |||
}; | |||
|
|||
use curve25519_dalek::edwards::CompressedEdwardsY; | |||
pub use ed25519_dalek::Signature; | |||
use ed25519_dalek::{SignatureError, SigningKey, VerifyingKey}; | |||
pub use ed25519_dalek::{Signature, SignatureError}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will there be any regrets about re-exporting these without wrapping?
iroh-relay/src/client.rs
Outdated
#[error("Invalid relay URL {0}")] | ||
InvalidRelayUrl(Url), | ||
#[error(transparent)] | ||
Websocket(#[from] tokio_tungstenite_wasm::Error), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this error is self-describing. Maybe an #[error("Websocket transport error: {0}")]
would be better? (If that is the correct interpretation)
Which immediately makes me wonder "but what if I'm not using websocket"? Why is there no generic IO or Transport error this falls under?
iroh-relay/src/client.rs
Outdated
InvalidTlsServername, | ||
#[error("No local address available")] | ||
NoLocalAddr, | ||
#[error("tls connection failed")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#[error("tls connection failed")] | |
#[error("TLS connection failed")] |
#[allow(missing_docs)] | ||
#[derive(Debug, thiserror::Error)] | ||
#[non_exhaustive] | ||
pub enum DialError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading just the error types I am rather confused what the difference between ConnectError
and DialError
is. Sounds like the same to me.
It seems DialError
is a variant of ConnectError
so maybe it's more of a ConnectionError
? But then also ConnectError
is returned from Client::connect
so it probably was meant as ConnectError
😵💫
iroh-relay/src/client/conn.rs
Outdated
#[error(transparent)] | ||
Websocket(#[from] tokio_tungstenite_wasm::Error), | ||
#[error("invalid protocol message encoding")] | ||
InvalidProtocolMessageEncoding, | ||
#[error("Unexpected frame received {0}")] | ||
UnexpectedFrame(crate::protos::relay::FrameType), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Websocket
seems like part of the Io
variant?
And InvalidProtocolMessageEncoding
and UnexpectedFrame
are part of the Protocol
variant?
I think the io/protocol
distinction is pretty decent on a high-level and that was what I was going for in the ConnSendError
.
Also it might be possible to have ConnSendError
and RecvError
be the same type?
Unfortunately I found a massive drawback to using The reasons for this is that tldr rust-lang/rust#99301 is not stable yet and error libraries can't propagate backtrace information properly. This works with anyhow because it uses a singular error type, which can poperly propagate backtraces. Until I can find a solution for this, using error enums is for now unacceptable. If you ask, Yes I probably looked at your favourite error handling library, they all fail to do this on STABLE rust, including
|
3c0bb19
to
67c75b0
Compare
## Description Use a new macro from nested-enum-utils to DRY the strum error enums. One example. ## Breaking Changes Adds a dependency. ## Notes & open questions Should we reexport common_fields from n0_snafu? ## Change checklist <!-- Remove any that are not relevant. --> - [ ] Self-review. - [ ] Documentation updates following the [style guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text), if relevant. - [ ] Tests if relevant. - [ ] All breaking changes documented. - [ ] List all breaking changes in the above "Breaking Changes" section. - [ ] Open an issue or PR on any number0 repos that are affected by this breaking change. Give guidance on how the updates should be handled or do the actual updates themselves. The major ones are: - [ ] [`quic-rpc`](https://github.com/n0-computer/quic-rpc) - [ ] [`iroh-gossip`](https://github.com/n0-computer/iroh-gossip) - [ ] [`iroh-blobs`](https://github.com/n0-computer/iroh-blobs) - [ ] [`dumbpipe`](https://github.com/n0-computer/dumbpipe) - [ ] [`sendme`](https://github.com/n0-computer/sendme)
2bdae04
to
42a8d73
Compare
## Description I used to add `snafu(visibility(pub(crate)))` for no reason apart from consistency. Not worth doing that!
## Description What it says on the tin.
marking as ready for review, as this needs mostly review and cleanup now |
…ing the expected `NodeId` out of the `ServerName` in verifiers (n0-computer#3290) ## Description This is code that will both update deps like n0-computer#3288 but also makes 0-RTT tests pass again. They were failing due to a [change in `rustls`](https://github.com/rustls/rustls/releases/tag/v%2F0.23.24): > Behavior change: Clients no longer offer resumption between different `ClientConfig`s that share a resumption store but do not share server certificate verification and client authentication credentials. If you share a resumption store between multiple `ClientConfig`s, please ensure their server certificate verification and client authentication credentials are also shared. Please read the new documentation on the `ClientConfig::resumption` item for details. > > Additionally, if you share a resumption store or ticketer between multiple `ServerConfig`s, please see the new documentation on `ServerConfig` about this. Essentially what they're doing is [checking that the certificate verifiers are `Arc::prt_eq` and depending on that allow or disallow 0-RTT](https://github.com/rustls/rustls/pull/2361/files#diff-9d58cb90e1df25604b88544aca7ea999ccd23bc5657384e196e77f407c0f4acfR256). When running the tests with rustls logging enabled, I confirmed that this is indeed what's happening: > [TRACE rustls::msgs::persist] resumption not allowed between different ServerCertVerifiers --- The fix isn't straight-forward. The reason we're creating new `ServerVerifier`s every time is that we're passing in the expected remote node ID every time. There used to be no other way of grabbing the other node's ID in the verifier, until, coincidentally n0-computer#3146 landed, which changed the server name that's set when connecting from always being `localhost` to `<base32 NodeId>.iroh.invalid`. I'm now pulling in that value from the `ServerName` that's passed to the `ServerCertificateVerifier` in `verify_server_cert`. As far as I understand that value *should* be a client-set value not a value we get over the network. The downside of doing this is that we'll not be compatible with iroh versions that don't use the `<base32 NodeId>.iroh.invalid` server names. --- There's an unrelated change in here that's changing `make_client_config` and `make_server_config` infallible, as was done in n0-computer#3161, which will definitely also have merge conflicts with this PR. ## Breaking Changes - `Router::spawn` is now a plain function instead of an `async fn` - `Router::spawn` is now infallible, instead of returning `anyhow::Result<()>` ## Change checklist <!-- Remove any that are not relevant. --> - [x] Self-review. - [x] Documentation updates following the [style guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text), if relevant. - [x] All breaking changes documented. - [x] List all breaking changes in the above "Breaking Changes" section. --------- Co-authored-by: dignifiedquire <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Posting some of my review comments so far (quite a big PR).
[patch.crates-io] | ||
n0-snafu = { git = "https://github.com/n0-computer/n0-snafu", branch = "expand-test-error" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to remember to replace this with a published version
pkarr = { version = "3.7", features = ["relays", "dht"], default-features = false } | ||
rcgen = "0.13" | ||
redb = "2.0.0" | ||
redb = "=2.4.0" # 2.5.0 has MSRV 1.85 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
redb = "=2.4.0" # 2.5.0 has MSRV 1.85 | |
redb = ">= 2.0.0, < 2.5" # 2.5.0 has MSRV 1.85 |
use clap::{Parser, ValueEnum}; | ||
use iroh::{ | ||
discovery::dns::{N0_DNS_NODE_ORIGIN_PROD, N0_DNS_NODE_ORIGIN_STAGING}, | ||
dns::DnsResolver, | ||
NodeId, | ||
}; | ||
use n0_snafu::{TestResult as Result, TestResultExt}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we shouldn't call it TestResult
then 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I welcome better names
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about just Result
? :D
Maybe ReportResult
?
@@ -29,7 +28,7 @@ pub async fn run_with_config_until_ctrl_c(config: Config) -> Result<()> { | |||
store = store.with_mainline_fallback(bootstrap); | |||
}; | |||
let server = Server::spawn(config, store, metrics).await?; | |||
tokio::signal::ctrl_c().await?; | |||
tokio::signal::ctrl_c().await.context("ctrl-c")?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These .context
s aren't super fun. Should we change them to .e()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
Description
Fixes #2741
Breaking Changes
Notes & open questions
Change checklist