blockdev: Fix loopback device resource leak on signal interruption #1402

gursewak1997 · 2025-07-10T21:49:04Z

This commit implements issue #799 by creating a signal-safe cleanup helper for loopback devices to prevent resource leaks when bootc install --via-loopback is interrupted by signals like SIGINT (Ctrl-C).

The solution uses an 'out-of-process drop' helper that:

- Forks a cleanup helper process when creating a loopback device
- Uses PR_SET_PDEATHSIG to detect when the parent process dies
- Masks most signals to avoid being killed accidentally
- Automatically cleans up leaked loopback devices if the parent dies
- Gracefully terminates when the parent performs normal cleanup

This prevents the common issue where interrupting bootc install --via-loopback with Ctrl-C would leave /dev/loopN devices allocated on the system.
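The helper's two outcomes (detach the device if the parent died, exit quietly if the parent already cleaned up) can be sketched as a small decision function. This is only an illustration: the `HelperEvent` enum and `cleanup_command` function are hypothetical names, and detaching via `losetup -d` is an assumption, not something taken from this PR's diff.

```rust
use std::process::Command;

/// Why the helper woke up. Hypothetical names for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HelperEvent {
    /// The parent died without cleaning up (e.g. interrupted by Ctrl-C).
    ParentDied,
    /// The parent detached the device itself and asked the helper to stop.
    ParentCleanedUp,
}

/// Build the command the helper would run for a given event, if any.
/// Assumption: a leaked device is detached with `losetup -d <device>`.
fn cleanup_command(event: HelperEvent, device: &str) -> Option<Command> {
    match event {
        HelperEvent::ParentDied => {
            let mut cmd = Command::new("losetup");
            cmd.arg("-d").arg(device);
            Some(cmd)
        }
        // Nothing to do: the parent already released the device.
        HelperEvent::ParentCleanedUp => None,
    }
}
```

Keeping the decision separate from the signal handling makes it possible to check the cleanup path without actually allocating a loop device.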

Fixes: #799

gemini-code-assist

Code Review

This pull request introduces a signal-safe cleanup mechanism for loopback devices to prevent resource leaks on signal interruption. It uses an out-of-process helper to clean up leaked loopback devices. I've added comments to enhance error logging for better debugging.

gemini-code-assist · 2025-07-10T21:51:34Z

blockdev/src/blockdev.rs

+            if libc::pthread_sigmask(libc::SIG_SETMASK, &sigset, std::ptr::null_mut()) != 0 {
+                std::process::exit(1);
+            }


The pthread_sigmask function can fail, but the error is not being handled beyond exiting the process. It would be better to log the error message using tracing::error! before exiting, to aid in debugging.

    let rc = libc::pthread_sigmask(libc::SIG_SETMASK, &sigset, std::ptr::null_mut());
    if rc != 0 {
        // pthread_sigmask returns the error number directly instead of setting errno
        let err = std::io::Error::from_raw_os_error(rc);
        tracing::error!("pthread_sigmask failed: {}", err);
        std::process::exit(1);
    }

gemini-code-assist · 2025-07-10T21:51:34Z

blockdev/src/blockdev.rs

+        let result = unsafe { 
+            libc::sigwaitinfo(&sigset, &mut siginfo)
+        };


The sigwaitinfo function can fail. It would be better to log the error message using tracing::error! before exiting, to aid in debugging.

    let result = unsafe {
        let result = libc::sigwaitinfo(&sigset, &mut siginfo);
        if result == -1 {
            let err = std::io::Error::last_os_error();
            tracing::error!("sigwaitinfo failed: {}", err);
            std::process::exit(1);
        }
        result
    };

gemini-code-assist · 2025-07-10T21:51:34Z

blockdev/src/blockdev.rs

+            unsafe {
+                libc::kill(cleanup_handle.helper_pid as i32, libc::SIGTERM);
+            }


The kill function can fail. It would be better to log the error message using tracing::warn! if kill fails, to aid in debugging.

    unsafe {
        if libc::kill(cleanup_handle.helper_pid as i32, libc::SIGTERM) != 0 {
            let err = std::io::Error::last_os_error();
            tracing::warn!("kill failed: {}", err);
        }
    }

gemini-code-assist · 2025-07-10T21:51:34Z

blockdev/src/blockdev.rs

+            // Wait for it to exit (non-blocking)
+            unsafe {
+                let mut status = 0;
+                libc::waitpid(cleanup_handle.helper_pid as i32, &mut status, libc::WNOHANG);
+            }


The waitpid function can fail. It would be better to log the error message using tracing::warn! if waitpid fails, to aid in debugging.

    unsafe {
        let mut status = 0;
        if libc::waitpid(cleanup_handle.helper_pid as i32, &mut status, libc::WNOHANG) == -1 {
            let err = std::io::Error::last_os_error();
            tracing::warn!("waitpid failed: {}", err);
        }
    }
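For comparison, if the helper is held as a `std::process::Child`, both stopping and reaping it return `io::Result`, so failures can be reported without any `unsafe`. The sketch below is a hedged illustration with two deliberate differences from the diff: `Child::kill` sends SIGKILL on Unix rather than SIGTERM, and `wait` blocks where the diff uses `WNOHANG`. The `stop_helper` name is hypothetical, and `eprintln!` stands in for `tracing::warn!` to keep the example dependency-free.

```rust
use std::io;
use std::process::{Child, ExitStatus};

/// Stop a helper process and reap it, reporting failures instead of ignoring them.
/// Note: `Child::kill` sends SIGKILL on Unix, unlike the SIGTERM used in the diff.
fn stop_helper(helper: &mut Child) -> io::Result<ExitStatus> {
    if let Err(err) = helper.kill() {
        // Log and carry on to reaping; the helper may already have exited.
        eprintln!("failed to signal cleanup helper {}: {err}", helper.id());
    }
    // Blocking wait so no zombie is left behind.
    helper.wait().map_err(|err| {
        eprintln!("failed to reap cleanup helper: {err}");
        err
    })
}
```

SIGKILL is acceptable only on the normal-cleanup path, where the parent has already released the device and the helper has nothing left to do.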

cgwalters · 2025-07-11T13:04:06Z

Thanks for working on this! While it will be a bit more awkward, can you try doing it the way described in #799 (comment)? That should 100% avoid all the unsafe code.

Basically instead of a raw fork() (source of basically all the unsafe in general) we fork+exec our own binary /proc/self/exe. Look at e.g. reexec_with_guardenv.
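A minimal sketch of the re-exec idea, assuming the helper is exposed as a subcommand that takes the device path as its only argument. The function name `cleanup_helper_command` and the argument layout are assumptions for illustration; this is not `reexec_with_guardenv` and not the code from this PR.

```rust
use std::io;
use std::process::Command;

/// Build a command that re-executes the current binary as a cleanup helper.
/// The subcommand name and argument layout are assumptions for this sketch.
fn cleanup_helper_command(device: &str) -> io::Result<Command> {
    // Resolve the running executable; /proc/self/exe is Linux-specific.
    let self_exe = std::fs::read_link("/proc/self/exe")?;
    let mut cmd = Command::new(self_exe);
    cmd.arg("loopback-cleanup-helper").arg(device);
    Ok(cmd)
}
```

Because `Command::spawn` performs the fork and exec internally, the caller never runs arbitrary Rust code in a forked child, which is where most of the `unsafe` in the raw `fork()` approach comes from.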

Add fork+exec based cleanup helper to prevent loopback device leaks when bootc install --via-loopback is interrupted by signals like SIGINT.

- Add loopback-cleanup-helper CLI subcommand
- Implement run_loopback_cleanup_helper() with PR_SET_PDEATHSIG
- Update LoopbackDevice to spawn cleanup helper process
- Add tests for spawn mechanism

cgwalters

Thanks! Looking closer

cgwalters · 2025-07-11T18:01:53Z

blockdev/src/blockdev.rs

+        anyhow::bail!("This function should only be called as a cleanup helper");
+    }
+
+    // Close stdin, stdout, stderr and redirect to /dev/null


This is better done in the parent process setup above

cgwalters · 2025-07-11T18:02:29Z

blockdev/src/blockdev.rs

+            .context("Failed to read /proc/self/exe")?;
+
+        // Create the helper process using exec
+        let mut cmd = Command::new(self_exe);


Set up std{in,out,err} here via https://doc.rust-lang.org/std/process/struct.Stdio.html#method.null
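As a rough illustration of that suggestion, the parent can point all three standard streams at /dev/null before spawning, so the helper never needs to close or redirect them itself. The helper function name `with_null_stdio` is hypothetical.

```rust
use std::process::{Command, Stdio};

/// Point a command's stdin, stdout, and stderr at /dev/null so the child
/// cannot interleave output with the parent's.
fn with_null_stdio(cmd: &mut Command) -> &mut Command {
    cmd.stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
}
```

This would be applied to the `Command` right after it is constructed and before `spawn()` is called.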

cgwalters · 2025-07-11T18:03:17Z

blockdev/src/blockdev.rs

+    }
+
+    // Set up death signal notification - we want to be notified when parent dies
+    unsafe {


There's a safe version of this in https://docs.rs/rustix/latest/rustix/process/fn.set_parent_process_death_signal.html

cgwalters · 2025-07-11T18:04:57Z

blockdev/src/blockdev.rs

+
+    // Set up death signal notification - we want to be notified when parent dies
+    unsafe {
+        if libc::prctl(libc::PR_SET_PDEATHSIG, libc::SIGUSR1) != 0 {


It seems cleaner to me to use SIGTERM and react to that

cgwalters · 2025-07-11T18:05:26Z

blockdev/src/blockdev.rs

+        }
+    }
+
+    // Mask most signals to avoid being killed accidentally


So https://docs.rs/tokio/latest/tokio/signal/index.html is one way to handle this in a safe way (will require making the function async)

I think the tokio API will replace about 50 lines of unsafe code with 5 lines of safe code.

cgwalters · 2025-07-11T18:07:25Z

blockdev/src/blockdev.rs

+
+            match status {
+                Ok(exit_status) if exit_status.success() => {
+                    // Write to stderr since we closed stdout


Mmm my vote here is probably to not inherit stderr at all; if we write to stderr then we have the possibility to intermix the child process writes with the parent's.

One option is to explicitly log to the systemd journal.

I guess speaking of systemd...a whole possibility I hadn't considered until just now is that we fork off via systemd-run. That would have some nice advantages, but it will be trickier to get the lifecycle binding right, so let's leave that for the future.

(There's this whole giant topic in bootc overall defaulting to running via systemd in some cases, which would similarly have a lot of advantages but be a big nontrivial change)

cgwalters · 2025-07-11T18:08:25Z

blockdev/src/blockdev.rs

-            20961247
-        );
-        Ok(())
+        let data = fs::read_to_string("tests/fixtures/lsblk.json").unwrap();


This change looks unrelated? I mean it's probably fine to do but let's break it into a separate commit

gemini-code-assist bot reviewed Jul 10, 2025

gursewak1997 force-pushed the bootc-799 branch 6 times, most recently from 70c2b77 to a4ab303 Compare July 11, 2025 05:59

gursewak1997 force-pushed the bootc-799 branch from a4ab303 to d45a8f2 Compare July 11, 2025 16:46

cgwalters requested changes Jul 11, 2025
