-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tacd: error propagation instead of unwraping part 1 #43
base: main
Are you sure you want to change the base?
tacd: error propagation instead of unwraping part 1 #43
Conversation
e4495ac
to
c0391af
Compare
Instead of unwrapping Results and potentially panicking we replace it with proper error propagation that can be acted upon in the future e.g. return different exit codes for systemd or display error messages. Signed-off-by: Stefan Kerkmann <[email protected]>
c0391af
to
a0cb336
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi,
thanks for taking on the task to make tacd
less panicky!
I've added some comments. Mostly around Result
propagation inside of async_std
tasks and std
threads.
This got me thinking about how tasks in the tacd
should be monitored better in general.
|
||
let (mut src, _) = topic.clone().subscribe_unbounded(); | ||
|
||
spawn(async move { | ||
while let Some(ev) = src.next().await { | ||
dst.set_value((ev ^ inverted) as _).unwrap(); | ||
dst.set_value((ev ^ inverted) as _)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An error from the spawned task propagates nowhere and thus silently ignored.
In my opinion that's worse than crashing the tacd
.
We could build a wrapped async_std::task::spawn()
that works something like BrokerBuilder
in that you pass it around everywhere you want to spawn a task during init and it adds the spawned task to an internal list.
The JoinHandle
s could then be watched for returned errors in main()
, so we can break out of it if something happens.
The usage would be something like:
src/digital_io.rs
…
fn handle_line_wo(
bb: &mut BrokerBuilder,
watched: &mut WatchedTaskBuilder,
…
) -> Result<Arc<Topic<bool>>> {
…
watched.spawn(async move {
…
});
}
…
src/main.rs
…
#[async_std::main]
async fn main() -> Result<()> {
let mut watched = WatchedTaskBuilder::new();
…
let dig_io = DigitalIo::new(&mut bb, &mut watched, led.out_0.clone(), led.out_1.clone())?;
…
watched.run().await
}
That would also give us the opportunity to define tasks that should complete before marking the initialization as successful (i.e. when to notify systemd
that we are ready, so that the RAUC slot can be marked good by rauc-mark-good.service
) by e.g. spawning them via a special WatchedTaskBuilder::spawn_init()
or the likes.
I quite like the idea. Let me mock something up real quick …
pwr_line: &LineHandle, | ||
discharge_line: &LineHandle, | ||
fail_state: &AtomicU8, | ||
) { | ||
pwr_line.set_value(1 - PWR_LINE_ASSERTED).unwrap(); | ||
discharge_line.set_value(DISCHARGE_LINE_ASSERTED).unwrap(); | ||
) -> Result<()> { | ||
pwr_line.set_value(1 - PWR_LINE_ASSERTED)?; | ||
discharge_line.set_value(DISCHARGE_LINE_ASSERTED)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If setting the outputs fails the tacd
should exit with an error code as fast as possible, so the fail safes can kick in and disable the output for us.
I am not saying that the error-propagating code can not ensure that. I just wanted to write this down as kind of a mental note to check that it is does.
thread_res_tx | ||
.try_send(Ok((tick, request.clone(), state.clone()))) | ||
.unwrap(); | ||
thread_res_tx.try_send(Ok((tick, request.clone(), state.clone())))?; | ||
|
||
(tick_weak, request, state) | ||
} | ||
Err(e) => { | ||
thread_res_tx.try_send(Err(e)).unwrap(); | ||
thread_res_tx.try_send(Err(e))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Result
returned from the spawned thread (this time an actual thread and not an async-std
task) does not go anywhere (which is part of the reason we have the thread_res_tx
solution in the first place).
In this case this would mean the tacd
would wait forever for a message on thread_res_rx
until the systemd watchdog would kill it.
The .unwrap()
based implementation would have at least printed an error message as a clue to what went wrong.
PS: Now that I've had a second look I think dropping thread_res_tx
would generate an event on thread_res_rx
because it makes the Stream end, so we would at least get a "didn't receive thread result" message.
If you really want to get rid of this .unwrap()
(for esthetic reasons - this .unwrap()
should never hit) you could use async_std::task::spawn_blocking()
instead of thread::spawn
to spawn a thread for which you can receive the return value in an async fashion and use e.g. futures_lite::future::race()
to check both the thread_res_rx
channel, as well as the threads JoinHandle
for an event. In this case we would only communicate a sucess via the channel, meaning it does no longer have to carry a Result
.
That could also simplify the error handling in the thread a bit as we could ?
in other places in the initialization and maybe also in the steady state.
@@ -442,7 +441,7 @@ impl DutPwrThread { | |||
&pwr_line, | |||
&discharge_line, | |||
&state, | |||
); | |||
)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once again the Results
from the thread do not go anywhere and the Results are just silently dropped.
If the WatchedTaskBuilder
(the name is still up for discussion) idea manifests we could just add the thread to the tasks to watch.
discharge_line | ||
.set_value(1 - DISCHARGE_LINE_ASSERTED) | ||
.unwrap(); | ||
pwr_line.set_value(PWR_LINE_ASSERTED).unwrap(); | ||
discharge_line.set_value(1 - DISCHARGE_LINE_ASSERTED)?; | ||
pwr_line.set_value(PWR_LINE_ASSERTED)?; | ||
state.store(OutputState::On as u8, Ordering::Relaxed); | ||
} | ||
OutputRequest::Off => { | ||
discharge_line.set_value(DISCHARGE_LINE_ASSERTED).unwrap(); | ||
pwr_line.set_value(1 - PWR_LINE_ASSERTED).unwrap(); | ||
discharge_line.set_value(DISCHARGE_LINE_ASSERTED)?; | ||
pwr_line.set_value(1 - PWR_LINE_ASSERTED)?; | ||
state.store(OutputState::Off as u8, Ordering::Relaxed); | ||
} | ||
OutputRequest::OffFloating => { | ||
discharge_line | ||
.set_value(1 - DISCHARGE_LINE_ASSERTED) | ||
.unwrap(); | ||
pwr_line.set_value(1 - PWR_LINE_ASSERTED).unwrap(); | ||
discharge_line.set_value(1 - DISCHARGE_LINE_ASSERTED)?; | ||
pwr_line.set_value(1 - PWR_LINE_ASSERTED)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
let max = led.max_brightness().unwrap(); | ||
let max = led.max_brightness()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once again an error that propagates nowhere.
}); | ||
} | ||
|
||
topic | ||
Ok(topic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
handle_color()
itself is still infallible. Errors from the spawned task do not propagate out.
No need for the -> Result<…>
and Ok(topic)
.
SystemTime::now().checked_sub(age).unwrap() | ||
SystemTime::now() | ||
.checked_sub(age) | ||
.with_context(|| "couldn't get system time") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting the system time is infallible, what's actually checkd here is if SystemTime::now() - age
can be expressed as a SystemTime
(SystemTime::checked_sub()
).
I don't quite know why I did not just use a SystemTime::now() - age
here as SystemTime
actually implements Sub
and this failing would mean something really strange is happening.
Maybe just do that to get rid of this .unwrap()
.
match { | ||
|| { | ||
let time = self.in_system_time()?; | ||
let timestamp = time.duration_since(SystemTime::UNIX_EPOCH)?; | ||
let js_timestamp = 1000.0 * timestamp.as_secs_f64(); | ||
anyhow::Ok(js_timestamp) | ||
} | ||
}() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well that took me a bit to figure out. We should find a solution my brain can parse in a matter of seconds, not minutes.
How about adding a fn as_js_timestamp(&self) -> Result<f64>
method to Timestamp
and using it here?.
let _js_timestamp = f64::deserialize(deserializer)?; | ||
// We need both Serialize and Deserialize for Topics, even when they | ||
// are never deserialized in practice like Timestamps. | ||
unimplemented!(); | ||
use serde::de::Error; | ||
Err(Error::custom("unused implementation")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reminds me that I always wanted to change the BrokerBuilder
so that you do not have to implement Serialize
for write-only (e.g. they can only be written from the HTTP API but never read) Topic
s and Deserialize
for read-only Topic
s, but did not get around to it yet.
Until then this is the nicer workaround.
Hi, I've hacked together a prototype that keeps track of running tasks and should propagate errors down the line. Did not do any testing yet though and as of now the cargo tests do not compile, but I think it's a good start. have a look at my |
I've now cleaned up my work in |
Instead of unwrapping Results and potentially panicking we replace it with proper error propagation that can be acted upon in the future e.g. return different exit codes for systemd or display error messages.