-
-
Notifications
You must be signed in to change notification settings - Fork 2
Design proposal for SSH connection implementation. #82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Design proposal for SSH connection implementation. #82
Conversation
djc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are some decent ideas in this. I haven't looked at what you've implemented so far, but I think the best way to evaluate the design would be to start to iteratively implement it in this repository, with code review at each step.
| } | ||
| } | ||
|
|
||
| trait Service { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is a need for a Service trait at this point in time. The authentication and connection service can just implement whatever API is needed, since we're definitely not in a phase where we're going to need arbitrary services. That would allow these services to implement an API befitting their semantic status rather than having to funnel through a shared API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep the service trait to be honest, not because there are that many implementations, but because it allows very easy testing of the transport layer on its own as it creates a clear separation boundary. The alternative is that the transport layer inverts on those function, but that pretty much seems to force you down a path where buffers would need to be excessively copied.
The interface for both services really is the same, as that is how services are designed to work in the RFC specification. From the perspective of the transport layer protocol a service really is just a black box handling packets and perhaps returning some.
Are there specific ways you were thinking might be useful to have bespoke apis for the two services for how they take their input?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep the service trait to be honest, not because there are that many implementations, but because it allows very easy testing of the transport layer on its own as it creates a clear separation boundary.
Types can create a clear separation boundary too, no need for a trait.
A trait only makes sense when there is code that needs to abstract over the shared interface. Do you have code that takes a &mut impl Service or similar in your fork right now? If not, it doesn't seem worth it.
| /// | ||
| /// Also handles the NEW_KEYS messages in both directions, and the identification string. | ||
| /// May also send connection closure messages on receiving garbage. | ||
| struct SshCryptoConnection {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to sort of imply having I/O live with buffers/state, whereas I think they might be better off separately. Have you looked at #81 to see what my design looks like?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what you are getting at here, and honestly getting a bit lost in seeing what your goal state in #81 exactly is. Perhaps I'm just missing the forest for the trees there, but could you be a bit more explicit with examples?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The send_packet() and recv_packet() methods seem like they're abstracting over I/O, while poll_transmit() seems more sans-I/O oriented. In #81, I keep the TcpStream separate from ReadState which IMO results in a cleaner design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SshCryptoConnection wouldn't contain a TcpStream. It receives packets through send_packet and encrypts them and then you can read the encrypted data out if it to send using poll_transmit and vice versa for receiving packets from the remote.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think putting those in different methods implies a bunch of buffering inside, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Buffering for the packet receive side is going to be necessary anyway. Buffering for the packet send side makes things easier I think. Especially with respect to encryption key changeovers. And it is going to be necssary for large packets that can't be sent with a single stream.send() call. aws-lc-rs doesn't support streaming for AEADs, so we need to buffer the encrypted data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, good point. I think we'll still want to split up ReadState and WriteState.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That can still be done internally in this struct if desired. The thing is especially once we start doing strict-kex there might be some operations that we want to coordinate between the read and write encrypted channels.
| } | ||
| ``` | ||
|
|
||
| Although within a Sans-IO context likely unavoidable, the choice to use identifiers for the individual channels leads to quite significant potential for misuse of the API. Examples of this are forgetting to properly close channels before dropping the identifier, and trying to use an identifier with a different connection from which it was originated. As seen below, these difficulties extend to the public interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you think this is unavoidable? What alternative design would do better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a usability perspective, something build on an async runtime could definitely do better by hiding most of the mess here from the user. See quinn vs quinn-proto.
And when implementing that directly in async code, there likely never is a point where two connections are available internally in the api so the bugs cannot be made there either in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for why it seems unavoidable in the sans-io context, the reason is that you want a usable interface where multiple channels can be interacted with simultaneously. That, combined with the fact that connection state will need to be mutated when stuff happens around channels makes it impossible to have any sort of long-lived back-reference to the connection in the channel object end user code holds. And rusts type system is not really suitable for tying a channel to a specific connection, leaving little other options as far as I can see.
I think these are good goals, but ultimately discussing a high-level API in the abstract doesn't seem very valuable at this point. IMO it would be more useful to start taking off chunks from what you've implemented and actually port them to this repo in PRs of perhaps a few hundred LOCs (or smaller), to consider the trade-offs in more detail. |
Primarily intended for discussion, the document may not quite be suitable for merging in its current form.