fastdds 2.14.0 The shared memory mode cannot communicate after restarting the process #5146
Replies: 11 comments
-
Hi @zhangzhen5729,
You would also need to clean the folder containing shared memory files because it's probably full.
-
@elianalf If the program has no console and runs in the background, how can resources be released after an unexpected crash? In that situation too, the topic subscription succeeds but no data is received.
-
Hello, I am also experiencing this problem.
-
There are signals for handling all kinds of situations, even if the program has no console.
Fast DDS already performs the cleanup and releases the resources when the application closes correctly. When the application does not close correctly and the error signal is not handled, the internal cleanup is not called and a manual cleanup is necessary.
-
If the program crashes unexpectedly, no signal is captured, and the Fast DDS resources are not released, so that no data can be received after a restart, what should be done? Should Fast DDS improve its fault tolerance for unexpected crashes of SHM communication participants?
-
@elianalf Hello, on Windows, if the program crashes unexpectedly, how can the process crash or exit signal be captured?
-
It is good to know that FastDDS stores files in
The function checks the directory for files. Files that are currently open in a program using FastDDS are skipped; all others are deleted. Because this runs before any FastDDS code, the problematic files are removed completely.
-
Hi @zhangzhen5729, @baynaaMN, @OgreTransporter.
The following code is a slightly modified snippet of the signal handling in the Fast DDS (master) hello world example (main.cpp). It applies to Linux, macOS, and Windows:

    #include <csignal>
    #include <functional>
    #include <iostream>
    #include <string>

    // Set in main(); invoked by signal_handler when a signal arrives
    std::function<void(int)> stop_app_handler;

    void signal_handler(
            int signum)
    {
        stop_app_handler(signum);
    }

    int main(
            int argc,
            char** argv)
    {
        // App initialization
        // ...

        // Implementation of your signal handler
        stop_app_handler = [&](int signum)
                {
                    std::cout << "\nSignal #" << std::to_string(signum) << " received, stopping application." << std::endl;
                    // Call application destruction methods here
                    // ...
                };

        // Examples of handled signals; some of them are not supported on Windows
        signal(SIGINT, signal_handler);
        signal(SIGTERM, signal_handler);
    #ifndef _WIN32
        signal(SIGQUIT, signal_handler);
        signal(SIGHUP, signal_handler);
    #endif // _WIN32

        // Application loop
        // ...

        return 0;
    }
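As a side note, a variant of the same idea keeps the handler itself minimal: it only sets a flag, and the application loop polls that flag and performs the cleanup in normal program flow. This is a sketch of a different technique from the snippet above, not code from the Fast DDS examples. The names `request_stop`, `install_stop_handlers`, and `stop_requested` are hypothetical. The motivation is that calling `std::function` or writing to `std::cout` from inside a signal handler is not guaranteed to be safe.

```cpp
#include <csignal>

namespace {

// Written by the signal handler, read by the application loop
volatile std::sig_atomic_t g_stop_requested = 0;

} // namespace

// Hypothetical handler: does nothing except record that a stop was requested
extern "C" void request_stop(
        int /*signum*/)
{
    g_stop_requested = 1;
}

// Hypothetical helper: registers the handler for the same signals as above
void install_stop_handlers()
{
    std::signal(SIGINT, request_stop);
    std::signal(SIGTERM, request_stop);
#ifndef _WIN32
    std::signal(SIGQUIT, request_stop);
    std::signal(SIGHUP, request_stop);
#endif // _WIN32
}

// Polled by the application loop, e.g. while (!stop_requested()) { ... }
bool stop_requested()
{
    return g_stop_requested != 0;
}
```

With this layout the application loop runs until `stop_requested()` returns true, and the DDS entities are then destroyed in `main` as in a normal shutdown. As with the snippet above, this only covers signals that can be caught. It does not help when the process is killed outright or crashes without delivering a catchable signal.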
There is no need if the application is correctly closed. The created files are associated with the identifiers (GUIDs) of the different DDS entities and their corresponding ports. Newly created entities are not allowed to overwrite the previous files, even when the identifiers and ports are the same. For that reason, those files should be removed once the entity is removed (which is done automatically if the application is correctly closed).
The application is responsible for recovering from unexpected crashes. In this recovery process, you should clean those unexpectedly closed SHM files (with
Therefore, I am moving this issue to the Support section according to the Fast DDS CONTRIBUTING guidelines.
-
The sample program does not help if the program crashes, e.g. due to a memory error, or is killed by the task manager. However, you can also place the code that I have included in the DLL at the beginning of the main function. The directory is then cleaned before new connections are established. I find it more practical in the DLL, because every program that uses FastDDS then gets this kind of error correction.
-
We are also seeing this issue with ROS Humble on Ubuntu 22.04: after running for a long time, the local node can no longer read any data from the topic through SHM, while a remote PC can still read it successfully through UDP.
-
Is there an already existing issue for this?
Expected behavior
After the process crashes and restarts, the data of the subscribed topic can be received correctly.
Current behavior
After the process crashes and restarts, the topic subscription is reported as successful, but no data is received.
Steps to reproduce
Using the shared memory transport, run one publisher and one subscriber. Click inside the subscriber's console window so that the subscriber hangs, then close the console and restart the subscriber. The topic subscription is reported as successful, but no data is received. This problem does not occur when using UDP.
Fast DDS version/commit
2.14.0, Windows binary installation package downloaded from the official website
Platform/Architecture
Windows 10, Visual Studio 2019
Transport layer
Shared Memory Transport (SHM)
Additional context
FASTDDS 2.14.0
XML configuration file
Relevant log output
Network traffic capture
No response