feat!: replace Pyro4/ZMQ transport with gRPC#569
Conversation
|
Wow. That's amazing. I'll give it a try and report back. |
mostly tested with mtda-kvm. I have not flashed on the Pi4 yet |
|
I just gave this a try on a rpi4 (but the results should apply to all targets):
|
bffd5b7 to
a6f64aa
Compare
Have not yet seen that one! |
Signed-off-by: Cedric Hombourger <cedric.hombourger@siemens.com>
Signed-off-by: Cedric Hombourger <cedric.hombourger@siemens.com>
Condition.wait_for() re-acquires the underlying lock before returning whether the predicate matched or the timeout expired. The previous code only called rx_lock.release() on the True (match) branch, so a timeout left rx_lock permanently held. Every subsequent console operation (run, wait, head, tail, clear, dump) would then block forever, and the process_rx() reader thread would also stall, making recovery impossible. Wrap the wait_for() call in a try/finally so rx_lock is always released. Signed-off-by: Cedric Hombourger <cedric.hombourger@siemens.com>
run() acquired rx_lock and then called rx_cond.wait_for() twice with no timeout argument, meaning an unresponsive DUT would hold rx_lock forever. Since process_rx() must also acquire rx_lock to deliver new data, no new notifications could ever arrive to satisfy the condition — a self-reinforcing deadlock with no escape. Add a timeout parameter (default 30 s) to both wait_for() calls and return None on expiry. Wrap the entire body in try/finally so rx_lock is released even if write() or any other call raises an exception. Signed-off-by: Cedric Hombourger <cedric.hombourger@siemens.com>
Use 'or' instead of 'and' to skip non-STORAGE-WRITING events. With 'and', events like 'FOO WRITING' or 'STORAGE FOO' would pass the guard and reach app.imgname. Add a None guard to safely ignore events when no write operation is in progress. Signed-off-by: Cedric Hombourger <cedric.hombourger@siemens.com>
imgname was set only inside storage_write/storage_update, so console_rx threads receiving STORAGE WRITING events in other contexts raised AttributeError. Initialize it to None in __init__ to guarantee the attribute always exists. Signed-off-by: Cedric Hombourger <cedric.hombourger@siemens.com>
endswith() fails when the received data contains extra characters after the prompt (e.g. trailing spaces or CR). Use 'in' to match the prompt anywhere in the received line. Signed-off-by: Cedric Hombourger <cedric.hombourger@siemens.com>
Here's a patch to cherry pick which solves this issue: From 1afe6e297f010ec06cad720c237c06a00897a7b3 Mon Sep 17 00:00:00 2001
From: Felix Moessbauer <felix.moessbauer@siemens.com>
Date: Fri, 20 Mar 2026 14:37:44 +0100
Subject: [PATCH 1/1] chore(mtda-isar): delete auto-generated pb files
These files must be re-generated for the target set of protobuf
dependencies (the ones in the image, not on the build system).
By deleting them prior to the build, we ensure that they are
regenerated.
Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
meta-isar/recipes-python/mtda/mtda_git.bb | 2 ++
1 file changed, 2 insertions(+)
diff --git a/meta-isar/recipes-python/mtda/mtda_git.bb b/meta-isar/recipes-python/mtda/mtda_git.bb
index dc0b4db2..31dd8644 100644
--- a/meta-isar/recipes-python/mtda/mtda_git.bb
+++ b/meta-isar/recipes-python/mtda/mtda_git.bb
@@ -62,6 +62,8 @@ do_gen_working_repo() {
for file in ${MTDA_FILES}; do
cp -a ${LAYERDIR_mtda}/../$file ${S}/
done
+ # delete auto-generated files as they need to be re-generated on building
+ find ${S} -name 'mtda_pb2*.py' -delete
rm -f ${S}/debian/source/format
}
do_gen_working_repo[cleandirs] += "${S}"
--
2.53.0update it still does not reliably solve the issue because the files are still copied in. Let me rework the patch. update 2 I updated the patch to make it more robust. Now it reliably works. |
input is now sent as binary data and I am now making sure that mtda-cli consumes all available data from stdin and transmits it as a single console_send RPC call to avoid fragmenting e.g. escape sequences |
I tested this again and can confirm that the escape sequences are no longer split (iow, it works as expected). Still, I need the patch from #569 (comment) as I run |
sure. could you |
usb_ids(), usb_add(), and command() called .splitlines() on the return value of _cmd() / cmd() without checking for None. _cmd() returns None when QEMU fails to start. Add the same guard that status() already applies. Signed-off-by: Cedric Hombourger <cedric.hombourger@siemens.com>
These files must be re-generated for the target set of protobuf dependencies (the ones in the image, not on the build system). By deleting them prior to the build, we ensure that they are regenerated. Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
|
@fmoessbauer CI passed with your patch added. thanks! kindly approve the PR so we can merge it |
I will just give it another try on my RPI4 and then approve. |
fmoessbauer
left a comment
There was a problem hiding this comment.
I had to update cip to 185b390af944e095d803c82d434bd51e74c999fd to be able to build the rpi4-ebg variant. But after doing so the new version worked flawlessly.
Many thanks for implementing this. Having an gRPC interface makes things so much easier when running in a larger lab.
Replace the three-port Pyro4+ZMQ architecture with a single-port gRPC service (default port 5556):
BREAKING CHANGE: the console (5557) and data (5558) ports no longer exist; all traffic flows over the single control port (default 5556).
Closes: #524