Skip to content

7911: feat(net): add Unix socket ancillary data API for fd passing (SCM_RIGHTS)#63

Open
martin-augment wants to merge 1 commit intomasterfrom
pr-7911-2026-02-16-08-26-39
Open

7911: feat(net): add Unix socket ancillary data API for fd passing (SCM_RIGHTS)#63
martin-augment wants to merge 1 commit intomasterfrom
pr-7911-2026-02-16-08-26-39

Conversation

@martin-augment
Copy link
Owner

7911: To review by AI

…HTS)

Add send_vectored_with_ancillary and recv_vectored_with_ancillary methods
to UnixStream and UnixDatagram, enabling file descriptor passing over
Unix sockets via sendmsg/recvmsg with SCM_RIGHTS control messages.

New types: SocketAncillary, AncillaryData, ScmRights, Messages.
Platform support: MSG_CMSG_CLOEXEC on Linux/FreeBSD, fcntl fallback on macOS.

Closes tokio-rs#2975

fix: resolve CI warnings - add Debug impls, remove unused imports

style: apply rustfmt formatting

fix: clippy warnings
@gemini-code-assist
Copy link

Summary of Changes

Hello @martin-augment, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances Tokio's Unix socket capabilities by introducing a new API for sending and receiving ancillary data, primarily enabling the passing of file descriptors between processes. This feature allows for more robust and flexible inter-process communication by facilitating the transfer of open file handles, improving resource sharing and system design patterns.

Highlights

  • New Ancillary Data API: Introduced a comprehensive API for handling Unix socket ancillary data, specifically for passing file descriptors (SCM_RIGHTS) between processes. This includes SocketAncillary for managing the data buffer, AncillaryData for parsing received messages, and iterators for file descriptors and control messages.
  • Vectored I/O with Ancillary Data: Added new send_vectored_with_ancillary and recv_vectored_with_ancillary methods (both blocking and non-blocking variants) to UnixDatagram and UnixStream. These methods allow sending and receiving regular data alongside ancillary data, such as file descriptors.
  • Platform-Specific CLOEXEC Handling: Implemented logic to automatically set the FD_CLOEXEC flag on received file descriptors on platforms that do not support MSG_CMSG_CLOEXEC (e.g., macOS), ensuring proper resource management.
  • Internal cmsg Module: Created a new internal cmsg module to encapsulate low-level sendmsg and recvmsg calls, providing a cleaner interface for ancillary data operations.
  • Extensive Test Coverage: Added new test cases for both Unix datagram and stream sockets to verify the correct functionality of sending and receiving single and multiple file descriptors, including non-blocking scenarios.
Changelog
  • tokio/src/net/unix/ancillary.rs
    • Added new module defining SocketAncillary for managing ancillary data, AncillaryError, AncillaryData enum, ScmRights iterator for file descriptors, and Messages iterator for control messages.
  • tokio/src/net/unix/cmsg.rs
    • Added new module containing RecvMsgResult struct and helper functions sendmsg_vectored and recvmsg_vectored for handling vectored I/O with ancillary data, including platform-specific CLOEXEC handling.
  • tokio/src/net/unix/datagram/socket.rs
    • Added methods try_send_vectored_with_ancillary, send_vectored_with_ancillary, try_recv_vectored_with_ancillary, and recv_vectored_with_ancillary to UnixDatagram for sending and receiving ancillary data.
  • tokio/src/net/unix/mod.rs
    • Exported new public types AncillaryData, AncillaryError, Messages, ScmRights, SocketAncillary and made cmsg module public within the crate.
  • tokio/src/net/unix/stream.rs
    • Added methods try_send_vectored_with_ancillary, send_vectored_with_ancillary, try_recv_vectored_with_ancillary, and recv_vectored_with_ancillary to UnixStream for sending and receiving ancillary data.
  • tokio/tests/uds_datagram.rs
    • Added a test case send_recv_fd_over_datagram to verify file descriptor passing functionality over Unix datagram sockets.
  • tokio/tests/uds_stream.rs
    • Added test cases send_recv_fd_over_stream, send_recv_multiple_fds_over_stream, and try_send_recv_fd_over_stream to validate file descriptor passing over Unix stream sockets.
Activity
  • The pull request description indicates that it is 'To review by AI', suggesting no prior human review activity.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Copy link

coderabbitai bot commented Feb 16, 2026

Walkthrough

This change introduces Unix socket ancillary data support to Tokio, enabling file descriptor passing via SCM_RIGHTS. A new ancillary module provides SocketAncillary buffer management, AncillaryData parsing, ScmRights iteration, and error handling. An internal cmsg module offers low-level sendmsg_vectored and recvmsg_vectored operations with cross-platform CLOEXEC handling. Four new public methods are added to both UnixStream and UnixDatagram: try_send_vectored_with_ancillary, send_vectored_with_ancillary, try_recv_vectored_with_ancillary, and recv_vectored_with_ancillary. Tests validate file descriptor transfer over both stream and datagram sockets.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch pr-7911-2026-02-16-08-26-39

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude
Copy link

claude bot commented Feb 16, 2026

PR Review: Unix Socket Ancillary Data API for FD Passing (SCM_RIGHTS)

This PR adds support for passing file descriptors over Unix sockets using SCM_RIGHTS. The implementation is well-structured and includes comprehensive tests. Here's my detailed review:

✅ Strengths

  1. Good API Design: The SocketAncillary abstraction is clean and follows Rust conventions
  2. Comprehensive Testing: Excellent test coverage for both streams and datagrams, including multiple FDs
  3. Platform Awareness: Proper handling of MSG_CMSG_CLOEXEC availability across platforms
  4. Documentation: Good doc comments with examples

🔍 Code Quality Issues

Critical: Memory Safety Concerns

1. Potential undefined behavior in Messages::new() (ancillary.rs:107-119)

The msg structure is created with uninitialized fields and stored in the iterator:

let mut msg: libc::msghdr = unsafe { mem::zeroed() };
msg.msg_control = buffer.as_ptr() as *mut libc::c_void;
msg.msg_controllen = length as _;

The msg is stored and later used in Iterator::next(). However, the msghdr has other fields (msg_name, msg_namelen, msg_iov, msg_iovlen, msg_flags) that are zero-initialized but may need specific handling.

Recommendation: Add a comment explaining why mem::zeroed() is safe here, or verify that all platforms handle zeroed msghdr correctly for CMSG_* macros.

2. Unchecked pointer arithmetic in ScmRights::next() (ancillary.rs:74-83)

let mut fd_bytes = [0u8; mem::size_of::<RawFd>()];
fd_bytes.copy_from_slice(&self.data[self.offset..self.offset + fd_size]);
Some(RawFd::from_ne_bytes(fd_bytes))

The code assumes RawFd can be safely reconstructed from raw bytes using from_ne_bytes(). However, RawFd is a type alias for c_int and doesn't have a from_ne_bytes() method in stable Rust.

Issue: This code won't compile. Did you mean to use i32::from_ne_bytes(fd_bytes)?

3. Buffer size validation in add_fds() (ancillary.rs:227-283)

The complex logic for finding the next cmsg slot is error-prone. Consider these edge cases:

  • What if CMSG_SPACE returns a value larger than expected?
  • The walking logic assumes CMSG_NXTHDR behaves consistently across platforms

Recommendation: Add assertions or debug checks to verify buffer boundaries.

High Priority: Resource Leak Prevention

4. Missing RAII for received file descriptors (tests)

In tests like send_recv_multiple_fds_over_stream (uds_stream.rs:1079-1081):

for fd in &received_fds {
    unsafe { libc::close(*fd) };
}

Raw close() calls are error-prone and don't provide panic safety. If the test panics before this cleanup, FDs leak.

Recommendation: Wrap received FDs in OwnedFd immediately:

use std::os::unix::io::{FromRawFd, OwnedFd};
let owned_fds: Vec<OwnedFd> = received_fds.into_iter()
    .map(|fd| unsafe { OwnedFd::from_raw_fd(fd) })
    .collect();
// OwnedFd handles cleanup automatically

5. FD exhaustion in tests

Tests create temporary files but don't always ensure cleanup on panic. Consider using RAII wrappers consistently.

Medium Priority: Correctness Issues

6. Truncation flag handling (ancillary.rs:289-291)

pub fn is_truncated(&self) -> bool {
    self.truncated
}

When truncation occurs, received FDs that don't fit are lost forever. The API doesn't warn users about this strongly enough.

Recommendation:

  • Add a #[must_use] attribute to is_truncated()
  • Document that truncated FDs are closed by the kernel
  • Consider returning a Result from receive methods if truncation occurs

7. Recursive call in Messages::next() (ancillary.rs:144-146)

} else {
    // Skip unknown control message types, try next
    self.next()
}

This recursion could cause stack overflow if there are many unknown message types (unlikely but theoretically possible on malicious input).

Recommendation: Use a loop instead:

fn next(&mut self) -> Option<Self::Item> {
    loop {
        if self.current.is_null() {
            return None;
        }
        // ... parse and return if SCM_RIGHTS, otherwise continue
    }
}

Low Priority: API Improvements

8. Missing #[must_use] attributes

Methods like messages(), is_empty(), len(), is_truncated() should have #[must_use].

9. Integer truncation warnings

Multiple uses of as _ suppress type checking:

msg.msg_controllen = length as _;

Consider explicit casts with comments explaining why they're safe.

10. Platform-specific code duplication

The set_cloexec() conditional logic is duplicated in multiple places. Consider a helper macro or function.

🔒 Security Considerations

  1. FD_CLOEXEC handling: ✅ Properly implemented with platform-specific fallback
  2. Buffer overflow protection: ⚠️ Relies on libc::CMSG_* macros being correct
  3. Integer overflow: Consider checking num_fds * mem::size_of::<RawFd>() for overflow in buffer_size_for_rights()

🚀 Performance Considerations

  1. The implementation is efficient and uses vectored I/O properly
  2. No unnecessary allocations in hot paths ✅
  3. Consider documenting the cost of the fcntl fallback on macOS

📋 Test Coverage

Excellent coverage overall, but consider adding:

  • Test for buffer too small scenario
  • Test for truncation behavior
  • Test for receiving unknown control message types
  • Test for maximum number of FDs (platform limits)
  • Negative test: trying to receive on non-connected datagram

🎯 Recommendations Summary

Must fix before merge:

  1. Fix the RawFd::from_ne_bytes() compilation error
  2. Add safety comments for all unsafe blocks
  3. Fix recursive call in Messages::next()

Should fix:
4. Use OwnedFd in tests instead of raw close()
5. Add #[must_use] to appropriate methods
6. Consider stronger error handling for truncation

Nice to have:
7. Add more edge case tests
8. Document platform differences more clearly
9. Add integer overflow checks

Overall Assessment

This is a solid implementation of a complex low-level feature. The core logic is sound, but there are some safety and correctness issues that should be addressed. The test coverage is excellent. With the fixes above, this would be production-ready.

Recommendation: Request changes for the compilation error and safety documentation, then approve after fixes.


Great work on tackling this challenging feature! The ancillary data API is well-designed and will be very useful for Unix socket users. 🎉

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new API for sending and receiving file descriptors over Unix sockets using ancillary data (SCM_RIGHTS). The implementation is well-structured and includes comprehensive tests. However, the implementation contains critical memory safety vulnerabilities. Specifically, integer truncation in add_fds can lead to heap buffer overflows, and missing validation of cmsg_len in the message iterator can result in out-of-bounds reads or panics. Additionally, there's a potential stack overflow issue in the ancillary message iterator due to recursive calls for skipping unknown message types. These issues must be addressed before the PR is merged.

Comment on lines +223 to +272
let space = unsafe { libc::CMSG_SPACE(fd_bytes_len as libc::c_uint) as usize };

if self.length + space > self.buffer.len() {
return Err(AncillaryError);
}

unsafe {
// Build a temporary msghdr pointing at our buffer to use CMSG macros
let mut msg: libc::msghdr = mem::zeroed();
msg.msg_control = self.buffer.as_mut_ptr() as *mut libc::c_void;
msg.msg_controllen = (self.length + space) as _;

// Find the cmsg slot. If length is 0, use CMSG_FIRSTHDR.
// Otherwise walk to find the next available slot.
let cmsg = if self.length == 0 {
libc::CMSG_FIRSTHDR(&msg)
} else {
// Point msg_controllen at current length to walk existing headers
let mut walk_msg: libc::msghdr = mem::zeroed();
walk_msg.msg_control = self.buffer.as_mut_ptr() as *mut libc::c_void;
walk_msg.msg_controllen = self.length as _;

let mut cur = libc::CMSG_FIRSTHDR(&walk_msg);
while !cur.is_null() {
let next = libc::CMSG_NXTHDR(&walk_msg, cur);
if next.is_null() {
// cur is the last header; the next slot is after it
break;
}
cur = next;
}
// Recalculate with expanded controllen
msg.msg_controllen = (self.length + space) as _;
if cur.is_null() {
libc::CMSG_FIRSTHDR(&msg)
} else {
libc::CMSG_NXTHDR(&msg, cur)
}
};

if cmsg.is_null() {
return Err(AncillaryError);
}

(*cmsg).cmsg_level = libc::SOL_SOCKET;
(*cmsg).cmsg_type = libc::SCM_RIGHTS;
(*cmsg).cmsg_len = libc::CMSG_LEN(fd_bytes_len as libc::c_uint) as _;

let data_ptr = libc::CMSG_DATA(cmsg);
std::ptr::copy_nonoverlapping(fds.as_ptr() as *const u8, data_ptr, fd_bytes_len);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-critical critical

Critical: Heap Buffer Overflow in SocketAncillary::add_fds due to Integer Truncation

The add_fds method is vulnerable to a heap buffer overflow. The length of the file descriptors slice (fd_bytes_len) is a usize, but it is explicitly cast to libc::c_uint (typically 32-bit) when passed to libc::CMSG_SPACE and libc::CMSG_LEN.

If a caller provides a large number of file descriptors (e.g., such that fd_bytes_len exceeds u32::MAX), the truncated value will be used to calculate the required space and to set the cmsg_len field. This can result in the self.length + space > self.buffer.len() check passing even when the buffer is insufficient. Subsequently, std::ptr::copy_nonoverlapping is called with the original, non-truncated fd_bytes_len, leading to a massive out-of-bounds write into the heap.

This same issue also affects buffer_size_for_rights at line 215.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:useful; category:security; feedback: The Gemini AI reviewer is correct! A bad actor could send data with too big value for fd_bytes_len that could cause integer overflow calculation and undefined behavior of any following usage of the result.

Comment on lines +134 to +135
cmsg.cmsg_len as usize - (data_ptr as usize - cmsg as *const _ as usize);
let data = std::slice::from_raw_parts(data_ptr, data_len);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The Messages::next iterator contains a critical out-of-bounds read vulnerability. Missing validation of the cmsg_len field can lead to integer underflow if cmsg_len is smaller than the header size, resulting in an extremely large data_len. This can cause out-of-bounds reads or panics when std::slice::from_raw_parts is used, especially with truncated or malicious control messages. Additionally, the recursive call in next() to skip unknown control message types can lead to a stack overflow with many unknown messages, posing a denial-of-service risk. An iterative approach is recommended to prevent this.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:useful; category:security; feedback: The Gemini AI reviewer is correct! A bad actor could send data with intentionally smaller value for cmsg_len than the header offset and this will cause underflow integer calculation. Prevents a security vulnerability when processing unchecked input.

@augmentcode
Copy link

augmentcode bot commented Feb 16, 2026

🤖 Augment PR Summary

Summary: Adds a Unix-socket ancillary data API to Tokio to enable file descriptor passing via SCM_RIGHTS.

Changes:

  • Introduced tokio::net::unix::SocketAncillary plus parsing types (Messages, AncillaryData, ScmRights) for received control messages.
  • Added a new internal net::unix::cmsg module wrapping non-blocking sendmsg/recvmsg (including MSG_CMSG_CLOEXEC usage and a fcntl fallback).
  • Extended UnixStream and UnixDatagram with try_* and async *_vectored_with_ancillary send/recv methods.
  • Added tests covering FD passing over UDS streams (single/multiple FDs) and datagrams.

Technical Notes: Receive paths propagate ancillary truncation via SocketAncillary::is_truncated(), and set close-on-exec on received FDs where MSG_CMSG_CLOEXEC is unavailable.

🤖 Was this summary useful? React with 👍 or 👎

Copy link

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 2 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

return Err(AncillaryError);
}

(*cmsg).cmsg_level = libc::SOL_SOCKET;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dereferencing cmsg assumes the backing &mut [u8] is properly aligned for libc::cmsghdr, but a u8 buffer doesn’t guarantee that alignment, so this can be UB on some targets. Consider documenting/enforcing an aligned buffer requirement (the docs currently show let mut buf = [0u8; 128];).

Severity: high

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:good-to-have; category:bug; feedback: The Augment AI reviewer is correct! Using an attribute like #[repr(align(8))] would document the expected alignment of the msg_control struct and make the callers follow it or face a build issues.

if cmsg.cmsg_level == libc::SOL_SOCKET && cmsg.cmsg_type == libc::SCM_RIGHTS {
let data_ptr = libc::CMSG_DATA(cmsg as *const _ as *mut _);
#[allow(clippy::unnecessary_cast)]
let data_len =
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data_len is computed via a subtraction that can underflow if cmsg_len is smaller than the header/data offset, which would make from_raw_parts create an out-of-bounds slice. Even if the kernel typically validates control messages, it may be worth defensively ensuring this computation can’t underflow on malformed/truncated ancillary data.

Severity: high

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:useful; category:security; feedback: The Augment AI reviewer is correct! A bad actor could send data with intentionally smaller value for cmsg_len than the header offset and this will cause underflow integer calculation. Prevents a security vulnerability when processing unchecked input.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tokio/src/net/unix/cmsg.rs`:
- Around line 40-68: The ancillary_buf passed into recvmsg_vectored must be
aligned for struct cmsghdr but is currently typed as &mut [u8] without
documentation or helper for proper alignment; update the public API by adding a
clear doc comment on SocketAncillary::new() stating that the backing buffer must
be aligned to the alignment of libc::cmsghdr (e.g., 8 bytes on common ABIs) and
that using a plain [u8; N] may be misaligned, and add a recommended safe helper
type (e.g., a small #[repr(align(...))] wrapper) or factory function in the docs
so callers can construct properly-aligned ancillary buffers; reference
recvmsg_vectored, ancillary_buf, and SocketAncillary::new() in the change so
users and reviewers can find the impacted locations.
🧹 Nitpick comments (4)
tokio/tests/uds_datagram.rs (1)

427-487: Test covers the happy path well.

Minor note: unlike send_recv_fd_over_stream in uds_stream.rs (line 491), this test doesn't assert !recv_ancillary.is_truncated(). Consider adding that check for consistency and to catch buffer sizing issues early.

Optional: add truncation assertion
     assert_eq!(n, 5);
     assert_eq!(&recv_buf[..5], b"dgram");
+    assert!(!recv_ancillary.is_truncated());
 
     let mut received_fds: Vec<RawFd> = Vec::new();
tokio/src/net/unix/datagram/socket.rs (1)

695-742: Document partial-write behavior for ancillary data on stream sockets.

With SOCK_STREAM, sendmsg can return a short write (fewer bytes than the total iov length). The kernel attaches the ancillary data to the first byte sent. If a caller sees a partial write and retries send_vectored_with_ancillary with the remaining data without clearing the ancillary buffer, the FDs will be sent twice.

This is inherent to the sendmsg API, but it would be helpful to document in the method's doc comment that callers should call ancillary.clear() after a successful send before sending any remaining data. This applies to UnixStream as well (see tokio/src/net/unix/stream.rs lines 796-812).

tokio/src/net/unix/ancillary.rs (2)

116-143: Recursive self.next() for unknown message types could overflow the stack.

Line 139 calls self.next() recursively to skip unrecognized cmsg types. While the number of messages is bounded by the buffer size, a loop is safer and clearer.

♻️ Refactor to iterative approach
     fn next(&mut self) -> Option<Self::Item> {
-        if self.current.is_null() {
-            return None;
-        }
-
-        unsafe {
-            let cmsg = &*self.current;
-
-            // Advance to next header for the next call
-            self.current = libc::CMSG_NXTHDR(&self.msg, self.current);
-
-            if cmsg.cmsg_level == libc::SOL_SOCKET && cmsg.cmsg_type == libc::SCM_RIGHTS {
-                let data_ptr = libc::CMSG_DATA(cmsg as *const _ as *mut _);
-                #[allow(clippy::unnecessary_cast)]
-                let data_len =
-                    cmsg.cmsg_len as usize - (data_ptr as usize - cmsg as *const _ as usize);
-                let data = std::slice::from_raw_parts(data_ptr, data_len);
-                Some(AncillaryData::ScmRights(ScmRights::new(data)))
-            } else {
-                // Skip unknown control message types, try next
-                self.next()
-            }
-        }
+        loop {
+            if self.current.is_null() {
+                return None;
+            }
+
+            unsafe {
+                let cmsg = &*self.current;
+
+                // Advance to next header for the next call
+                self.current = libc::CMSG_NXTHDR(&self.msg, self.current);
+
+                if cmsg.cmsg_level == libc::SOL_SOCKET && cmsg.cmsg_type == libc::SCM_RIGHTS {
+                    let data_ptr = libc::CMSG_DATA(cmsg as *const _ as *mut _);
+                    #[allow(clippy::unnecessary_cast)]
+                    let data_len =
+                        cmsg.cmsg_len as usize - (data_ptr as usize - cmsg as *const _ as usize);
+                    let data = std::slice::from_raw_parts(data_ptr, data_len);
+                    return Some(AncillaryData::ScmRights(ScmRights::new(data)));
+                }
+                // Skip unknown control message types, continue loop
+            }
+        }
     }

287-290: messages() passes redundant length.

Messages::new receives both &self.buffer[..self.length] (a slice already truncated to self.length) and self.length as a separate parameter. Inside Messages::new, length is used to set msg_controllen, but buffer.len() would give the same value. Consider simplifying to only pass the slice, using buffer.len() internally.

Comment on lines +40 to +68
/// Perform a non-blocking `recvmsg` with vectored data and ancillary data.
pub(crate) fn recvmsg_vectored(
fd: RawFd,
iov: &mut [io::IoSliceMut<'_>],
ancillary_buf: &mut [u8],
) -> io::Result<RecvMsgResult> {
unsafe {
let mut msg: libc::msghdr = std::mem::zeroed();
msg.msg_iov = iov.as_mut_ptr() as *mut libc::iovec;
msg.msg_iovlen = iov.len() as _;
msg.msg_control = ancillary_buf.as_mut_ptr() as *mut libc::c_void;
msg.msg_controllen = ancillary_buf.len() as _;

let flags = recv_flags();

let ret = libc::recvmsg(fd, &mut msg, flags);
if ret < 0 {
Err(io::Error::last_os_error())
} else {
#[allow(clippy::unnecessary_cast)]
let ancillary_len = msg.msg_controllen as usize;
Ok(RecvMsgResult {
bytes_read: ret as usize,
ancillary_len,
truncated: (msg.msg_flags & libc::MSG_CTRUNC) != 0,
})
}
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, examine the full cmsg.rs file to understand the context
cd tokio && wc -l src/net/unix/cmsg.rs

Repository: martin-augment/tokio

Length of output: 88


🏁 Script executed:

# Read the cmsg.rs file
cat -n src/net/unix/cmsg.rs | head -150

Repository: martin-augment/tokio

Length of output: 116


🏁 Script executed:

# Search for where recvmsg_vectored is called
rg "recvmsg_vectored" --type rs -B 2 -A 2

Repository: martin-augment/tokio

Length of output: 90


🏁 Script executed:

# Look for SocketAncillary in tokio to see how ancillary buffers are used
rg "SocketAncillary" --type rs -B 2 -A 2

Repository: martin-augment/tokio

Length of output: 90


🏁 Script executed:

# Check if there's documentation about alignment in unix socket code
rg "align" src/net/unix/ --type rs -i

Repository: martin-augment/tokio

Length of output: 90


🏁 Script executed:

# Read the cmsg.rs file
cat -n tokio/src/net/unix/cmsg.rs

Repository: martin-augment/tokio

Length of output: 4401


🏁 Script executed:

# Search for where recvmsg_vectored is called
rg "recvmsg_vectored" --type rust -B 2 -A 2

Repository: martin-augment/tokio

Length of output: 1954


🏁 Script executed:

# Look for SocketAncillary in tokio to see how ancillary buffers are used
rg "SocketAncillary" --type rust -B 2 -A 2

Repository: martin-augment/tokio

Length of output: 14324


🏁 Script executed:

# Check if there's documentation about alignment in unix socket code
rg "align" tokio/src/net/unix/ --type rust -i

Repository: martin-augment/tokio

Length of output: 46


🏁 Script executed:

# Look at the SocketAncillary implementation
cat -n tokio/src/net/unix/ancillary.rs | head -200

Repository: martin-augment/tokio

Length of output: 7500


🏁 Script executed:

# Search for as_mut_buffer method
rg "as_mut_buffer" --type rust -B 2 -A 5

Repository: martin-augment/tokio

Length of output: 3007


🏁 Script executed:

# Check if there's any alignment documentation in ancillary.rs
rg "align" tokio/src/net/unix/ancillary.rs --type rust -i -B 2 -A 2

Repository: martin-augment/tokio

Length of output: 46


🏁 Script executed:

# Look for buffer_size_for_rights or similar buffer sizing functions
rg "buffer_size" --type rust -B 2 -A 5

Repository: martin-augment/tokio

Length of output: 12918


🏁 Script executed:

# Look for SocketAncillary implementation in Rust std lib documentation or mentions
rg "msg_control" --type rust -B 3 -A 3

Repository: martin-augment/tokio

Length of output: 4123


🏁 Script executed:

# Check the complete buffer_size_for_rights implementation
cat -n tokio/src/net/unix/ancillary.rs | grep -A 10 "buffer_size_for_rights"

Repository: martin-augment/tokio

Length of output: 1086


🏁 Script executed:

# Look for any safety comments or docs about alignment in ancillary.rs
cat -n tokio/src/net/unix/ancillary.rs | tail -150

Repository: martin-augment/tokio

Length of output: 6446


🏁 Script executed:

# Check if there are any comments about alignment in the codebase
grep -r "align\|CMSG" tokio/src/net/unix/ --include="*.rs"

Repository: martin-augment/tokio

Length of output: 1804


🌐 Web query:

POSIX msg_control alignment requirement cmsghdr

💡 Result:

POSIX does not mandate a specific byte alignment value (e.g., 4 vs 8). Instead it specifies the requirement in terms of “whatever alignment is needed” for control messages, and provides macros that must be used to get it right.

What must be aligned

  • msghdr.msg_control must point to storage that is suitably aligned to contain a struct cmsghdr (i.e., the first control header must be correctly aligned for cmsghdr). A plain char control[...] object is not guaranteed to have that alignment. [6]
  • Each ancillary object in the buffer may require padding so that:
    • the next cmsghdr is correctly aligned, and
    • the data returned by CMSG_DATA() is correctly aligned for direct access. This is why POSIX defines CMSG_SPACE() as including “any padding needed to satisfy alignment requirements”. [1][3]

The portable rule: use the CMSG_* macros

POSIX intends you to treat alignment as opaque and rely on:

  • CMSG_SPACE(payload_len) for buffer sizing (includes padding/alignment) [1]
  • CMSG_LEN(payload_len) for cmsg->cmsg_len (accounts for required alignment) [2][3]
  • CMSG_FIRSTHDR, CMSG_NXTHDR, CMSG_DATA to walk/access messages safely with correct alignment. [1][3]

Linux exposes CMSG_ALIGN() too, but it’s explicitly a Linux extension (not POSIX). [2]

Common implementation detail (don’t hardcode)

Many libcs implement the padding/alignment in terms of sizeof(size_t) (or equivalent). For example, glibc’s CMSG_SPACE/CMSG_LEN are defined using CMSG_ALIGN(...). [4]

Practical pattern to satisfy alignment

Use a union (or malloc, which is sufficiently aligned) so the buffer is aligned for cmsghdr:

union {
  struct cmsghdr align;
  unsigned char buf[CMSG_SPACE(payload_len)];
} control;

msg.msg_control = control.buf;
msg.msg_controllen = sizeof(control.buf);

This “union for alignment” approach is widely recommended because it guarantees msg_control is aligned for cmsghdr. [5]


Sources: Open Group <sys/socket.h> (POSIX) on CMSG_SPACE padding/alignment [1]; Linux cmsg(3) portability notes (and CMSG_ALIGN being Linux-only) [2]; OpenBSD CMSG_* man page on meeting alignment constraints [3]; glibc header showing macro structure [4]; UNP note that msg_control must be aligned for cmsghdr and union technique [5]; documentation explicitly stating msg_control “must be aligned for a cmsghdr structure” [6].

References: [1] (pubs.opengroup.org) [2] (sources.debian.org) [3] (man.openbsd.org) [4] (sbexr.rabexc.org) [5] (litux.nl) [6] (digiater.nl)

Citations:


Document alignment requirement for ancillary buffer.

The ancillary_buf parameter in recvmsg_vectored is typed as &mut [u8], but msg_control must be aligned for struct cmsghdr. A plain byte array on the stack (e.g., [0u8; 128]) has no alignment guarantee beyond 1 byte and may violate POSIX alignment requirements, making it technically undefined behavior if misaligned.

While this works in practice (most stack frames are sufficiently aligned) and matches the pattern in std::os::unix::net::SocketAncillary (nightly), add documentation to SocketAncillary::new() noting the alignment requirement. Consider providing a helper type with #[repr(align(8))] or similar to give users a safe way to create properly-aligned buffers.

🤖 Prompt for AI Agents
In `@tokio/src/net/unix/cmsg.rs` around lines 40 - 68, The ancillary_buf passed
into recvmsg_vectored must be aligned for struct cmsghdr but is currently typed
as &mut [u8] without documentation or helper for proper alignment; update the
public API by adding a clear doc comment on SocketAncillary::new() stating that
the backing buffer must be aligned to the alignment of libc::cmsghdr (e.g., 8
bytes on common ABIs) and that using a plain [u8; N] may be misaligned, and add
a recommended safe helper type (e.g., a small #[repr(align(...))] wrapper) or
factory function in the docs so callers can construct properly-aligned ancillary
buffers; reference recvmsg_vectored, ancillary_buf, and SocketAncillary::new()
in the change so users and reviewers can find the impacted locations.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:good-to-have; category:bug; feedback: The CodeRabbit AI reviewer is correct! Using an attribute like #[repr(align(8))] would document the expected alignment of the msg_control struct and make the callers follow it or face a build issues.

@martin-augment
Copy link
Owner Author

287-290: messages() passes redundant length.

Messages::new receives both &self.buffer[..self.length] (a slice already truncated to self.length) and self.length as a separate parameter. Inside Messages::new, length is used to set msg_controllen, but buffer.len() would give the same value. Consider simplifying to only pass the slice, using buffer.len() internally.

value:good-to-have; category:bug; feedback: The CodeRabbit AI reviewer is correct! There is no need to pass the length argument since it should be the same as buffer.len(). Having the second argument makes it possible to pass invalid values for the length which may cause errors like reading less or too much.

@martin-augment
Copy link
Owner Author

martin-augment commented Feb 16, 2026

116-143: Recursive self.next() for unknown message types could overflow the stack.

Line 139 calls self.next() recursively to skip unrecognized cmsg types. While the number of messages is bounded by the buffer size, a loop is safer and clearer.

value:useful; category:security; feedback: The CodeRabbit AI reviewer is correct! A bad actor could send data with many unknown message types and this may lead to stack overflow error. By using an iterative loop this attack could be neutralised.

@martin-augment
Copy link
Owner Author

martin-augment commented Feb 16, 2026

7. Recursive call in Messages::next() (ancillary.rs:144-146)

} else {
    // Skip unknown control message types, try next
    self.next()
}

This recursion could cause stack overflow if there are many unknown message types (unlikely but theoretically possible on malicious input).

value:useful; category:security; feedback: The Claude AI reviewer is correct! A bad actor could send data with many unknown message types and this may lead to stack overflow error. By using an iterative loop this attack could be neutralised.

@martin-augment
Copy link
Owner Author

427-487: Test covers the happy path well.

Minor note: unlike send_recv_fd_over_stream in uds_stream.rs (line 491), this test doesn't assert !recv_ancillary.is_truncated(). Consider adding that check for consistency and to catch buffer sizing issues early.

value:good-to-have; category:bug; feedback: The Claude AI reviewer is correct! It would be useful to check that the received data is not truncated also for the UDP test case, not just for the Unix socket one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants