Collaborator

@Jakio815 Jakio815 commented Jan 15, 2025

lf-lang/lingua-franca#2455

Summary

This PR adds a network abstraction layer, net_abstraction, so that different network protocols and security mechanisms can be plugged in interchangeably.

The original code is tightly coupled to sockets, which makes it hard to add other network features.
However, this PR does not remove all socket-related functions from the main flow of the RTI and federate, for two reasons.

  1. The RTI and federate support user-specified port numbers, and the federate also supports user-specified IP addresses.
  2. MSG_TYPE_ADDRESS_ADVERTISEMENT and MSG_TYPE_ADDRESS_QUERY use the port and IP address in the protocol itself.

We could isolate all of this socket-related code behind #ifdef COMM_TYPE_TCP guards; however, this PR concentrates on supporting end-to-end pluggable and interchangeable network security on top of TCP. Thus, in this PR, I have not added the #ifdef guards and have not changed the protocol for MSG_TYPE_ADDRESS_ADVERTISEMENT and MSG_TYPE_ADDRESS_QUERY.

Features

Plugin API for network

This PR builds the network code as a separately compiled library instead of as part of the core runtime.
I followed the prior work on low_level_platform.h and platform.h.
All network-related source files are moved under network/impl/src, and all headers are under network/api. Each directory has its own CMakeLists.txt.
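
As a rough illustration of this API/implementation split, here is a minimal sketch. All names in it (net_handle_t, net_create(), net_write(), net_read(), net_free()) are hypothetical and are not the functions added by this PR, and an in-memory buffer stands in for the real TCP transport. The point is only that callers see an opaque handle plus a few functions, while the transport-specific state stays on the implementation side.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ---- API side (in the PR's layout this would be a header under network/api) ---- */
// Opaque handle: callers never see transport-specific fields. Hypothetical name.
typedef void* net_handle_t;

net_handle_t net_create(void);
int net_write(net_handle_t net, const unsigned char* buf, size_t len);
int net_read(net_handle_t net, unsigned char* buf, size_t len);
void net_free(net_handle_t net);

/* ---- Implementation side (would live under network/impl/src) ---- */
// Stand-in transport: an in-memory loopback buffer instead of a real socket.
typedef struct {
  unsigned char data[256];
  size_t len;
} loopback_priv_t;

net_handle_t net_create(void) { return calloc(1, sizeof(loopback_priv_t)); }

// Appends len bytes to the buffer. Returns 0 on success, -1 if they do not fit.
int net_write(net_handle_t net, const unsigned char* buf, size_t len) {
  loopback_priv_t* p = (loopback_priv_t*)net;
  assert(p != NULL && buf != NULL);
  if (len > sizeof(p->data) - p->len) {
    return -1;
  }
  memcpy(p->data + p->len, buf, len);
  p->len += len;
  return 0;
}

// Removes exactly len bytes from the front of the buffer.
// Returns 0 on success, -1 if fewer than len bytes are available.
int net_read(net_handle_t net, unsigned char* buf, size_t len) {
  loopback_priv_t* p = (loopback_priv_t*)net;
  assert(p != NULL && buf != NULL);
  if (len > p->len) {
    return -1;
  }
  memcpy(buf, p->data, len);
  memmove(p->data, p->data + len, p->len - len);
  p->len -= len;
  return 0;
}

void net_free(net_handle_t net) { free(net); }
```

Replacing the implementation half with a different transport would leave every caller of the four API functions unchanged, which is the property the separate library is meant to provide.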

Add COMM_TYPE target property.

The comm-type keyword is available in the C target as follows.

target C {
  comm-type: TCP
}

Currently only TCP is supported; SST support is planned for security.

Refactoring on clock-synchronization.

There is no reason to perform clock synchronization over anything other than UDP. So, I left all clock-sync functions using UDP sockets directly, and refactored these functions.

rti_remote.c

  • send_physical_clock() and handle_physical_clock_sync_message(): Take a boolean flag use_UDP that indicates whether the UDP socket is used.

clock-sync.c

  • handle_T4_clock_sync_message(): Takes a boolean flag use_UDP that indicates whether the UDP socket is used.
  • handle_T1_clock_sync_message(): Takes a void* socket_or_net_abstraction parameter to support both a socket and a network abstraction.
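
The void* plus flag pattern can be sketched as follows. This is not the PR's code: demo_net_t, demo_send_udp(), demo_send_net(), and demo_send_reply() are hypothetical stand-ins that only count bytes, and the sketch assumes that the pointer refers to an int socket descriptor when the flag is true and to the abstraction object otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical stand-in for the PR's network abstraction type.
typedef struct {
  int bytes_sent;
} demo_net_t;

// Stand-ins for the two send paths; each only records the byte count.
static int udp_bytes_sent = 0;

static void demo_send_udp(int socket_fd, size_t len) {
  (void)socket_fd; // a real implementation would write to this descriptor
  udp_bytes_sent += (int)len;
}

static void demo_send_net(demo_net_t* net, size_t len) { net->bytes_sent += (int)len; }

// Assumption: when use_udp is true, socket_or_net points to an int socket
// descriptor; otherwise it points to a demo_net_t.
void demo_send_reply(void* socket_or_net, size_t len, bool use_udp) {
  assert(socket_or_net != NULL);
  if (use_udp) {
    int socket_fd = *(int*)socket_or_net;
    demo_send_udp(socket_fd, len);
  } else {
    demo_send_net((demo_net_t*)socket_or_net, len);
  }
}
```

Because the compiler cannot check what a void* points to, the caller is responsible for keeping the flag and the pointer in agreement; passing the wrong combination is undefined behavior.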

handle_address_ad() and handle_address_query()

There are no changes in the protocol.

As explained in the summary, MSG_TYPE_ADDRESS_ADVERTISEMENT and MSG_TYPE_ADDRESS_QUERY use port numbers and IP addresses in the protocol itself. To explain further, for physical connections or decentralized coordination, federateB has to know the peer federateA's port number and IP address to connect to it directly. To make this possible, federateA sends its port to the RTI in a MSG_TYPE_ADDRESS_ADVERTISEMENT, and the RTI sends a MSG_TYPE_ADDRESS_QUERY_REPLY message containing that port number and federateA's IP address to the peer federateB. Thus, the port number and IP address cannot be encapsulated under the network abstraction layer, because they are part of the protocol.

Therefore, there are some getter and setter calls to the network interface.

  1. FedA -> RTI: MSG_TYPE_ADDRESS_ADVERTISEMENT: FedA calls get_my_port() to send its own port to the RTI.
  2. FedB -> RTI: MSG_TYPE_ADDRESS_QUERY: No changes.
  3. RTI -> FedB: MSG_TYPE_ADDRESS_QUERY_REPLY: The RTI calls get_server_port() and get_ip_addr() and encodes the results into the message sent to FedB.
  4. FedB: Stores the received port and IP address using set_server_port() and set_server_host_name(), then connects directly to FedA.
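
Steps 3 and 4 can be sketched as an encode/decode pair. The byte layout used here (one type byte, a 4-byte little-endian port, then a 4-byte IPv4 address) and the message-type value are assumptions made for illustration, not the documented wire format, and demo_encode_reply() and demo_decode_reply() are hypothetical helpers. In the PR, the encoded values would come from get_server_port() and get_ip_addr(), and the decoded ones would be passed to set_server_port() and set_server_host_name(); their signatures are not shown in this thread, so the sketch uses plain values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_REPLY_TYPE 0x01u // placeholder, not the real message-type value
#define DEMO_REPLY_LEN 9      // 1 type byte + 4 port bytes + 4 IPv4 bytes (assumed layout)

// RTI side: pack the port and IPv4 address of FedA into a reply for FedB.
void demo_encode_reply(uint8_t out[DEMO_REPLY_LEN], int32_t port, const uint8_t ip[4]) {
  assert(out != NULL && ip != NULL);
  out[0] = DEMO_REPLY_TYPE;
  for (int i = 0; i < 4; i++) {
    // Little-endian: least significant byte first.
    out[1 + i] = (uint8_t)(((uint32_t)port >> (8 * i)) & 0xFFu);
  }
  memcpy(out + 5, ip, 4);
}

// FedB side: unpack the reply. Returns 0 on success, -1 on an unexpected type byte.
int demo_decode_reply(const uint8_t in[DEMO_REPLY_LEN], int32_t* port, uint8_t ip[4]) {
  assert(in != NULL && port != NULL && ip != NULL);
  if (in[0] != DEMO_REPLY_TYPE) {
    return -1;
  }
  uint32_t p = 0;
  for (int i = 0; i < 4; i++) {
    p |= (uint32_t)in[1 + i] << (8 * i);
  }
  *port = (int32_t)p;
  memcpy(ip, in + 5, 4);
  return 0;
}
```

The sketch makes the constraint concrete: the port and address travel as message payload, so the abstraction has to expose them through getters and setters rather than hide them.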

Minor logic change: Move getpeername() logic to accept_net_abstraction()

The RTI needs to know the IP address of the connected federate (FedA) so that it can pass that address to the other federate (FedB), as explained above.

Before, this was done in rti_remote.c's receive_and_check_fed_id_message(). I moved this to lf_socket_support.c's accept_net_abstraction(), because it looks more appropriate to set the connected peer's information inside this function.
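
A minimal sketch of recording the peer address at accept time, using the POSIX accept() and getpeername() calls. The demo_conn_t type and demo_accept() function are hypothetical and only illustrate the idea; the sketch handles IPv4 only and is not the PR's accept_net_abstraction().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical per-connection record.
typedef struct {
  int socket_fd;
  struct in_addr peer_addr; // IPv4 address of the connected peer
} demo_conn_t;

// Accepts one connection on listen_fd and records the peer's IPv4 address.
// Returns 0 on success, -1 on failure.
int demo_accept(int listen_fd, demo_conn_t* conn) {
  assert(conn != NULL);
  int fd = accept(listen_fd, NULL, NULL);
  if (fd < 0) {
    return -1;
  }
  struct sockaddr_in peer;
  socklen_t len = sizeof(peer);
  if (getpeername(fd, (struct sockaddr*)&peer, &len) != 0 || peer.sin_family != AF_INET) {
    close(fd);
    return -1;
  }
  conn->socket_fd = fd;
  conn->peer_addr = peer.sin_addr;
  return 0;
}
```

Keeping the lookup next to accept() means every accepted connection carries its peer address from the start, at the cost of one extra system call for callers that never read it.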

One resulting inefficiency is that, in decentralized coordination, the server federate saves the connected peer federate's IP address even though it does not need it. However, I expect the performance impact to be negligible.

Add a default UDP port number of 15061.

Previously, the UDP port was usually set to the RTI's port + 1 in the create_server() call in rti_remote.c. However, I did not want to expose the port number in the create_server() interface, so I removed the port parameter from create_server().

Minor changes

  • The start_rti_server() function no longer takes the port as a parameter. The port is stored in rti_remote_t's user_specified_port, which falls back to a default value when the user does not set it.

@Jakio815 Jakio815 changed the base branch from main to shutdown January 15, 2025 17:32
Jakio815 and others added 24 commits January 15, 2025 11:58
…create_clock_server, because there are no plans using other network stacks rather than UDP.
…e. && Change all read() write(), and shutdown() to use netdrv
…work driver is initialized, and send default values when not initialized.
Comment on lines -1528 to -1542
if (errno == ECONNRESET) {
  lf_print_error("Socket connection to the RTI was closed by the RTI without"
                 " properly sending an EOF first. Considering this a soft error.");
  // NOTE: If this happens, possibly a new RTI must be elected.
  shutdown_socket(&_fed.socket_TCP_RTI, false);
  return NULL;
} else {
  lf_print_error("Socket connection to the RTI has been broken with error %d: %s."
                 " The RTI should close connections with an EOF first."
                 " Considering this a soft error.",
                 errno, strerror(errno));
  // NOTE: If this happens, possibly a new RTI must be elected.
  shutdown_socket(&_fed.socket_TCP_RTI, false);
  return NULL;
}
Collaborator Author

Inside read_from_netchan(), the errno will be printed when read_from_socket() fails.

@Jakio815 Jakio815 changed the title Draft: Create network channel. Add Network Abstraction layer. Nov 7, 2025
@Jakio815 Jakio815 marked this pull request as ready for review November 7, 2025 21:15
@Jakio815 Jakio815 requested review from edwardalee and hokeun November 7, 2025 21:15
Contributor

@edwardalee edwardalee left a comment

I haven't made it all the way through, but it's looking pretty good! One issue is that while I think "network abstraction" is a good term to use for the general architecture, I find the resulting names unnecessarily long. I suggest dropping the "abstraction" in most cases. E.g.:

fed->fed_net_abstraction -> fed->net
write_to_net_abstraction -> write_to_net
write_to_net_abstraction_fail_on_error -> write_to_net_fail_on_error
read_from_net_abstraction_fail_on_error -> read_from_net_fail_on_error

Collaborator Author

Jakio815 commented Nov 14, 2025

Hello Prof. Lee, thank you for your comment. I shortened most of the names.
The one name I have not changed is the type name, net_abstraction_t.
Other than that, most names are now shorter:

net_abstractions_for_inbound_p2p_connections -> net_for_inbound_p2p_connections
net_abstractions_for_outbound_p2p_connections -> net_for_outbound_p2p_connections
lf_outbound_net_abstraction_mutex -> lf_outbound_net_mutex
fed->fed_net_abstraction -> fed->net
net_abstraction_to_RTI -> net_to_RTI
RTI_net_abstraction_listener -> RTI_net_listener


#include "util.h"
#include "socket_common.h"
#include "util.h" // LF_MUTEX_UNLOCK(), logging.h
Member

Suggested change
#include "util.h" // LF_MUTEX_UNLOCK(), logging.h
#include <logging.h>
#include "util.h" // LF_MUTEX_UNLOCK()

}

void send_physical_clock(unsigned char message_type, federate_info_t* fed, socket_type_t socket_type) {
void send_physical_clock(unsigned char message_type, federate_info_t* fed, bool use_UDP) {
Member

I think socket_type == UDP is more readable than true. I recommend changing this back to using the socket_type_t enum unless there is a pressing reason to change this to bool.

}

int handle_T1_clock_sync_message(unsigned char* buffer, int socket, instant_t t2) {
int handle_T1_clock_sync_message(unsigned char* buffer, void* socket_or_net, instant_t t2, bool use_udp) {
Member

The argument name here is inconsistent with the one above: use_UDP vs. use_udp.

@hokeun hokeun changed the title Add Network Abstraction layer. Addition of network abstraction layer to separate socket implementation code from network communication logic Nov 27, 2025
4 participants