Skip to content

Conversation

@davidv1992
Copy link
Contributor

Primarily intended for discussion, the document may not quite be suitable for merging in its current form.

Copy link
Owner

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some decent ideas in this. I haven't looked at what you've implemented so far, but I think the best way to evaluate the design would be to start to iteratively implement it in this repository, with code review at each step.

}
}

trait Service {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is a need for a Service trait at this point in time. The authentication and connection service can just implement whatever API is needed, since we're definitely not in a phase where we're going to need arbitrary services. That would allow these services to implement an API befitting their semantic status rather than having to funnel through a shared API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep the service trait to be honest, not because there are that many implementations, but because it allows very easy testing of the transport layer on its own as it creates a clear separation boundary. The alternative is that the transport layer inverts on those function, but that pretty much seems to force you down a path where buffers would need to be excessively copied.

The interface for both services really is the same, as that is how services are designed to work in the RFC specification. From the perspective of the transport layer protocol a service really is just a black box handling packets and perhaps returning some.

Are there specific ways you were thinking might be useful to have bespoke apis for the two services for how they take their input?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep the service trait to be honest, not because there are that many implementations, but because it allows very easy testing of the transport layer on its own as it creates a clear separation boundary.

Types can create a clear separation boundary too, no need for a trait.

A trait only makes sense when there is code that needs to abstract over the shared interface. Do you have code that takes a &mut impl Service or similar in your fork right now? If not, it doesn't seem worth it.

///
/// Also handles the NEW_KEYS messages in both directions, and the identification string.
/// May also send connection closure messages on receiving garbage.
struct SshCryptoConnection {}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to sort of imply having I/O live with buffers/state, whereas I think they might be better off separately. Have you looked at #81 to see what my design looks like?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what you are getting at here, and honestly getting a bit lost in seeing what your goal state in #81 exactly is. Perhaps I'm just missing the forest for the trees there, but could you be a bit more explicit with examples?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The send_packet() and recv_packet() methods seem like they're abstracting over I/O, while poll_transmit() seems more sans-I/O oriented. In #81, I keep the TcpStream separate from ReadState which IMO results in a cleaner design.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SshCryptoConnection wouldn't contain a TcpStream. It receives packets through send_packet and encrypts them and then you can read the encrypted data out if it to send using poll_transmit and vice versa for receiving packets from the remote.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think putting those in different methods implies a bunch of buffering inside, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Buffering for the packet receive side is going to be necessary anyway. Buffering for the packet send side makes things easier I think. Especially with respect to encryption key changeovers. And it is going to be necssary for large packets that can't be sent with a single stream.send() call. aws-lc-rs doesn't support streaming for AEADs, so we need to buffer the encrypted data.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, good point. I think we'll still want to split up ReadState and WriteState.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That can still be done internally in this struct if desired. The thing is especially once we start doing strict-kex there might be some operations that we want to coordinate between the read and write encrypted channels.

}
```

Although within a Sans-IO context likely unavoidable, the choice to use identifiers for the individual channels leads to quite significant potential for misuse of the API. Examples of this are forgetting to properly close channels before dropping the identifier, and trying to use an identifier with a different connection from which it was originated. As seen below, these difficulties extend to the public interface.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you think this is unavoidable? What alternative design would do better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a usability perspective, something build on an async runtime could definitely do better by hiding most of the mess here from the user. See quinn vs quinn-proto.

And when implementing that directly in async code, there likely never is a point where two connections are available internally in the api so the bugs cannot be made there either in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for why it seems unavoidable in the sans-io context, the reason is that you want a usable interface where multiple channels can be interacted with simultaneously. That, combined with the fact that connection state will need to be mutated when stuff happens around channels makes it impossible to have any sort of long-lived back-reference to the connection in the channel object end user code holds. And rusts type system is not really suitable for tying a channel to a specific connection, leaving little other options as far as I can see.

@davidv1992 davidv1992 requested a review from djc January 14, 2026 15:54
@djc
Copy link
Owner

djc commented Jan 15, 2026

This document presents a Sans-IO inspired design for an SSH connection library. Guiding goals for design were, from most to least important

  • Not do any IO nor synchronisation within the library.
  • Match in broad strokes the design of the SSH standard to simplify implementation.
  • Ensure testability of individual components of the protocol.
  • Maximize opportunity for eliminating copying of data.

I think these are good goals, but ultimately discussing a high-level API in the abstract doesn't seem very valuable at this point. IMO it would be more useful to start taking off chunks from what you've implemented and actually port them to this repo in PRs of perhaps a few hundred LOCs (or smaller), to consider the trade-offs in more detail.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants