7911: feat(net): add Unix socket ancillary data API for fd passing (SCM_RIGHTS)#63
7911: feat(net): add Unix socket ancillary data API for fd passing (SCM_RIGHTS)#63martin-augment wants to merge 1 commit intomasterfrom
Conversation
…HTS) Add send_vectored_with_ancillary and recv_vectored_with_ancillary methods to UnixStream and UnixDatagram, enabling file descriptor passing over Unix sockets via sendmsg/recvmsg with SCM_RIGHTS control messages. New types: SocketAncillary, AncillaryData, ScmRights, Messages. Platform support: MSG_CMSG_CLOEXEC on Linux/FreeBSD, fcntl fallback on macOS. Closes tokio-rs#2975 fix: resolve CI warnings - add Debug impls, remove unused imports style: apply rustfmt formatting fix: clippy warnings
Summary of ChangesHello @martin-augment, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances Tokio's Unix socket capabilities by introducing a new API for sending and receiving ancillary data, primarily enabling the passing of file descriptors between processes. This feature allows for more robust and flexible inter-process communication by facilitating the transfer of open file handles, improving resource sharing and system design patterns. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
WalkthroughThis change introduces Unix socket ancillary data support to Tokio, enabling file descriptor passing via SCM_RIGHTS. A new ancillary module provides SocketAncillary buffer management, AncillaryData parsing, ScmRights iteration, and error handling. An internal cmsg module offers low-level sendmsg_vectored and recvmsg_vectored operations with cross-platform CLOEXEC handling. Four new public methods are added to both UnixStream and UnixDatagram: try_send_vectored_with_ancillary, send_vectored_with_ancillary, try_recv_vectored_with_ancillary, and recv_vectored_with_ancillary. Tests validate file descriptor transfer over both stream and datagram sockets. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
PR Review: Unix Socket Ancillary Data API for FD Passing (SCM_RIGHTS)This PR adds support for passing file descriptors over Unix sockets using ✅ Strengths
🔍 Code Quality IssuesCritical: Memory Safety Concerns1. Potential undefined behavior in The let mut msg: libc::msghdr = unsafe { mem::zeroed() };
msg.msg_control = buffer.as_ptr() as *mut libc::c_void;
msg.msg_controllen = length as _;The Recommendation: Add a comment explaining why 2. Unchecked pointer arithmetic in let mut fd_bytes = [0u8; mem::size_of::<RawFd>()];
fd_bytes.copy_from_slice(&self.data[self.offset..self.offset + fd_size]);
Some(RawFd::from_ne_bytes(fd_bytes))The code assumes Issue: This code won't compile. Did you mean to use 3. Buffer size validation in The complex logic for finding the next
Recommendation: Add assertions or debug checks to verify buffer boundaries. High Priority: Resource Leak Prevention4. Missing RAII for received file descriptors (tests) In tests like for fd in &received_fds {
unsafe { libc::close(*fd) };
}Raw Recommendation: Wrap received FDs in use std::os::unix::io::{FromRawFd, OwnedFd};
let owned_fds: Vec<OwnedFd> = received_fds.into_iter()
.map(|fd| unsafe { OwnedFd::from_raw_fd(fd) })
.collect();
// OwnedFd handles cleanup automatically5. FD exhaustion in tests Tests create temporary files but don't always ensure cleanup on panic. Consider using RAII wrappers consistently. Medium Priority: Correctness Issues6. Truncation flag handling (ancillary.rs:289-291) pub fn is_truncated(&self) -> bool {
self.truncated
}When truncation occurs, received FDs that don't fit are lost forever. The API doesn't warn users about this strongly enough. Recommendation:
7. Recursive call in } else {
// Skip unknown control message types, try next
self.next()
}This recursion could cause stack overflow if there are many unknown message types (unlikely but theoretically possible on malicious input). Recommendation: Use a loop instead: fn next(&mut self) -> Option<Self::Item> {
loop {
if self.current.is_null() {
return None;
}
// ... parse and return if SCM_RIGHTS, otherwise continue
}
}Low Priority: API Improvements8. Missing Methods like 9. Integer truncation warnings Multiple uses of msg.msg_controllen = length as _;Consider explicit casts with comments explaining why they're safe. 10. Platform-specific code duplication The 🔒 Security Considerations
🚀 Performance Considerations
📋 Test CoverageExcellent coverage overall, but consider adding:
🎯 Recommendations SummaryMust fix before merge:
Should fix: Nice to have: Overall AssessmentThis is a solid implementation of a complex low-level feature. The core logic is sound, but there are some safety and correctness issues that should be addressed. The test coverage is excellent. With the fixes above, this would be production-ready. Recommendation: Request changes for the compilation error and safety documentation, then approve after fixes. Great work on tackling this challenging feature! The ancillary data API is well-designed and will be very useful for Unix socket users. 🎉 |
There was a problem hiding this comment.
Code Review
This pull request introduces a new API for sending and receiving file descriptors over Unix sockets using ancillary data (SCM_RIGHTS). The implementation is well-structured and includes comprehensive tests. However, the implementation contains critical memory safety vulnerabilities. Specifically, integer truncation in add_fds can lead to heap buffer overflows, and missing validation of cmsg_len in the message iterator can result in out-of-bounds reads or panics. Additionally, there's a potential stack overflow issue in the ancillary message iterator due to recursive calls for skipping unknown message types. These issues must be addressed before the PR is merged.
| let space = unsafe { libc::CMSG_SPACE(fd_bytes_len as libc::c_uint) as usize }; | ||
|
|
||
| if self.length + space > self.buffer.len() { | ||
| return Err(AncillaryError); | ||
| } | ||
|
|
||
| unsafe { | ||
| // Build a temporary msghdr pointing at our buffer to use CMSG macros | ||
| let mut msg: libc::msghdr = mem::zeroed(); | ||
| msg.msg_control = self.buffer.as_mut_ptr() as *mut libc::c_void; | ||
| msg.msg_controllen = (self.length + space) as _; | ||
|
|
||
| // Find the cmsg slot. If length is 0, use CMSG_FIRSTHDR. | ||
| // Otherwise walk to find the next available slot. | ||
| let cmsg = if self.length == 0 { | ||
| libc::CMSG_FIRSTHDR(&msg) | ||
| } else { | ||
| // Point msg_controllen at current length to walk existing headers | ||
| let mut walk_msg: libc::msghdr = mem::zeroed(); | ||
| walk_msg.msg_control = self.buffer.as_mut_ptr() as *mut libc::c_void; | ||
| walk_msg.msg_controllen = self.length as _; | ||
|
|
||
| let mut cur = libc::CMSG_FIRSTHDR(&walk_msg); | ||
| while !cur.is_null() { | ||
| let next = libc::CMSG_NXTHDR(&walk_msg, cur); | ||
| if next.is_null() { | ||
| // cur is the last header; the next slot is after it | ||
| break; | ||
| } | ||
| cur = next; | ||
| } | ||
| // Recalculate with expanded controllen | ||
| msg.msg_controllen = (self.length + space) as _; | ||
| if cur.is_null() { | ||
| libc::CMSG_FIRSTHDR(&msg) | ||
| } else { | ||
| libc::CMSG_NXTHDR(&msg, cur) | ||
| } | ||
| }; | ||
|
|
||
| if cmsg.is_null() { | ||
| return Err(AncillaryError); | ||
| } | ||
|
|
||
| (*cmsg).cmsg_level = libc::SOL_SOCKET; | ||
| (*cmsg).cmsg_type = libc::SCM_RIGHTS; | ||
| (*cmsg).cmsg_len = libc::CMSG_LEN(fd_bytes_len as libc::c_uint) as _; | ||
|
|
||
| let data_ptr = libc::CMSG_DATA(cmsg); | ||
| std::ptr::copy_nonoverlapping(fds.as_ptr() as *const u8, data_ptr, fd_bytes_len); |
There was a problem hiding this comment.
Critical: Heap Buffer Overflow in SocketAncillary::add_fds due to Integer Truncation
The add_fds method is vulnerable to a heap buffer overflow. The length of the file descriptors slice (fd_bytes_len) is a usize, but it is explicitly cast to libc::c_uint (typically 32-bit) when passed to libc::CMSG_SPACE and libc::CMSG_LEN.
If a caller provides a large number of file descriptors (e.g., such that fd_bytes_len exceeds u32::MAX), the truncated value will be used to calculate the required space and to set the cmsg_len field. This can result in the self.length + space > self.buffer.len() check passing even when the buffer is insufficient. Subsequently, std::ptr::copy_nonoverlapping is called with the original, non-truncated fd_bytes_len, leading to a massive out-of-bounds write into the heap.
This same issue also affects buffer_size_for_rights at line 215.
There was a problem hiding this comment.
value:useful; category:security; feedback: The Gemini AI reviewer is correct! A bad actor could send data with too big value for fd_bytes_len that could cause integer overflow calculation and undefined behavior of any following usage of the result.
| cmsg.cmsg_len as usize - (data_ptr as usize - cmsg as *const _ as usize); | ||
| let data = std::slice::from_raw_parts(data_ptr, data_len); |
There was a problem hiding this comment.
The Messages::next iterator contains a critical out-of-bounds read vulnerability. Missing validation of the cmsg_len field can lead to integer underflow if cmsg_len is smaller than the header size, resulting in an extremely large data_len. This can cause out-of-bounds reads or panics when std::slice::from_raw_parts is used, especially with truncated or malicious control messages. Additionally, the recursive call in next() to skip unknown control message types can lead to a stack overflow with many unknown messages, posing a denial-of-service risk. An iterative approach is recommended to prevent this.
There was a problem hiding this comment.
value:useful; category:security; feedback: The Gemini AI reviewer is correct! A bad actor could send data with intentionally smaller value for cmsg_len than the header offset and this will cause underflow integer calculation. Prevents a security vulnerability when processing unchecked input.
🤖 Augment PR SummarySummary: Adds a Unix-socket ancillary data API to Tokio to enable file descriptor passing via Changes:
Technical Notes: Receive paths propagate ancillary truncation via 🤖 Was this summary useful? React with 👍 or 👎 |
| return Err(AncillaryError); | ||
| } | ||
|
|
||
| (*cmsg).cmsg_level = libc::SOL_SOCKET; |
There was a problem hiding this comment.
Dereferencing cmsg assumes the backing &mut [u8] is properly aligned for libc::cmsghdr, but a u8 buffer doesn’t guarantee that alignment, so this can be UB on some targets. Consider documenting/enforcing an aligned buffer requirement (the docs currently show let mut buf = [0u8; 128];).
Severity: high
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
value:good-to-have; category:bug; feedback: The Augment AI reviewer is correct! Using an attribute like #[repr(align(8))] would document the expected alignment of the msg_control struct and make the callers follow it or face a build issues.
| if cmsg.cmsg_level == libc::SOL_SOCKET && cmsg.cmsg_type == libc::SCM_RIGHTS { | ||
| let data_ptr = libc::CMSG_DATA(cmsg as *const _ as *mut _); | ||
| #[allow(clippy::unnecessary_cast)] | ||
| let data_len = |
There was a problem hiding this comment.
data_len is computed via a subtraction that can underflow if cmsg_len is smaller than the header/data offset, which would make from_raw_parts create an out-of-bounds slice. Even if the kernel typically validates control messages, it may be worth defensively ensuring this computation can’t underflow on malformed/truncated ancillary data.
Severity: high
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
value:useful; category:security; feedback: The Augment AI reviewer is correct! A bad actor could send data with intentionally smaller value for cmsg_len than the header offset and this will cause underflow integer calculation. Prevents a security vulnerability when processing unchecked input.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tokio/src/net/unix/cmsg.rs`:
- Around line 40-68: The ancillary_buf passed into recvmsg_vectored must be
aligned for struct cmsghdr but is currently typed as &mut [u8] without
documentation or helper for proper alignment; update the public API by adding a
clear doc comment on SocketAncillary::new() stating that the backing buffer must
be aligned to the alignment of libc::cmsghdr (e.g., 8 bytes on common ABIs) and
that using a plain [u8; N] may be misaligned, and add a recommended safe helper
type (e.g., a small #[repr(align(...))] wrapper) or factory function in the docs
so callers can construct properly-aligned ancillary buffers; reference
recvmsg_vectored, ancillary_buf, and SocketAncillary::new() in the change so
users and reviewers can find the impacted locations.
🧹 Nitpick comments (4)
tokio/tests/uds_datagram.rs (1)
427-487: Test covers the happy path well.Minor note: unlike
send_recv_fd_over_streaminuds_stream.rs(line 491), this test doesn't assert!recv_ancillary.is_truncated(). Consider adding that check for consistency and to catch buffer sizing issues early.Optional: add truncation assertion
assert_eq!(n, 5); assert_eq!(&recv_buf[..5], b"dgram"); + assert!(!recv_ancillary.is_truncated()); let mut received_fds: Vec<RawFd> = Vec::new();tokio/src/net/unix/datagram/socket.rs (1)
695-742: Document partial-write behavior for ancillary data on stream sockets.With
SOCK_STREAM,sendmsgcan return a short write (fewer bytes than the total iov length). The kernel attaches the ancillary data to the first byte sent. If a caller sees a partial write and retriessend_vectored_with_ancillarywith the remaining data without clearing the ancillary buffer, the FDs will be sent twice.This is inherent to the
sendmsgAPI, but it would be helpful to document in the method's doc comment that callers should callancillary.clear()after a successful send before sending any remaining data. This applies toUnixStreamas well (seetokio/src/net/unix/stream.rslines 796-812).tokio/src/net/unix/ancillary.rs (2)
116-143: Recursiveself.next()for unknown message types could overflow the stack.Line 139 calls
self.next()recursively to skip unrecognized cmsg types. While the number of messages is bounded by the buffer size, a loop is safer and clearer.♻️ Refactor to iterative approach
fn next(&mut self) -> Option<Self::Item> { - if self.current.is_null() { - return None; - } - - unsafe { - let cmsg = &*self.current; - - // Advance to next header for the next call - self.current = libc::CMSG_NXTHDR(&self.msg, self.current); - - if cmsg.cmsg_level == libc::SOL_SOCKET && cmsg.cmsg_type == libc::SCM_RIGHTS { - let data_ptr = libc::CMSG_DATA(cmsg as *const _ as *mut _); - #[allow(clippy::unnecessary_cast)] - let data_len = - cmsg.cmsg_len as usize - (data_ptr as usize - cmsg as *const _ as usize); - let data = std::slice::from_raw_parts(data_ptr, data_len); - Some(AncillaryData::ScmRights(ScmRights::new(data))) - } else { - // Skip unknown control message types, try next - self.next() - } - } + loop { + if self.current.is_null() { + return None; + } + + unsafe { + let cmsg = &*self.current; + + // Advance to next header for the next call + self.current = libc::CMSG_NXTHDR(&self.msg, self.current); + + if cmsg.cmsg_level == libc::SOL_SOCKET && cmsg.cmsg_type == libc::SCM_RIGHTS { + let data_ptr = libc::CMSG_DATA(cmsg as *const _ as *mut _); + #[allow(clippy::unnecessary_cast)] + let data_len = + cmsg.cmsg_len as usize - (data_ptr as usize - cmsg as *const _ as usize); + let data = std::slice::from_raw_parts(data_ptr, data_len); + return Some(AncillaryData::ScmRights(ScmRights::new(data))); + } + // Skip unknown control message types, continue loop + } + } }
287-290:messages()passes redundant length.
Messages::newreceives both&self.buffer[..self.length](a slice already truncated toself.length) andself.lengthas a separate parameter. InsideMessages::new,lengthis used to setmsg_controllen, butbuffer.len()would give the same value. Consider simplifying to only pass the slice, usingbuffer.len()internally.
| /// Perform a non-blocking `recvmsg` with vectored data and ancillary data. | ||
| pub(crate) fn recvmsg_vectored( | ||
| fd: RawFd, | ||
| iov: &mut [io::IoSliceMut<'_>], | ||
| ancillary_buf: &mut [u8], | ||
| ) -> io::Result<RecvMsgResult> { | ||
| unsafe { | ||
| let mut msg: libc::msghdr = std::mem::zeroed(); | ||
| msg.msg_iov = iov.as_mut_ptr() as *mut libc::iovec; | ||
| msg.msg_iovlen = iov.len() as _; | ||
| msg.msg_control = ancillary_buf.as_mut_ptr() as *mut libc::c_void; | ||
| msg.msg_controllen = ancillary_buf.len() as _; | ||
|
|
||
| let flags = recv_flags(); | ||
|
|
||
| let ret = libc::recvmsg(fd, &mut msg, flags); | ||
| if ret < 0 { | ||
| Err(io::Error::last_os_error()) | ||
| } else { | ||
| #[allow(clippy::unnecessary_cast)] | ||
| let ancillary_len = msg.msg_controllen as usize; | ||
| Ok(RecvMsgResult { | ||
| bytes_read: ret as usize, | ||
| ancillary_len, | ||
| truncated: (msg.msg_flags & libc::MSG_CTRUNC) != 0, | ||
| }) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, examine the full cmsg.rs file to understand the context
cd tokio && wc -l src/net/unix/cmsg.rsRepository: martin-augment/tokio
Length of output: 88
🏁 Script executed:
# Read the cmsg.rs file
cat -n src/net/unix/cmsg.rs | head -150Repository: martin-augment/tokio
Length of output: 116
🏁 Script executed:
# Search for where recvmsg_vectored is called
rg "recvmsg_vectored" --type rs -B 2 -A 2Repository: martin-augment/tokio
Length of output: 90
🏁 Script executed:
# Look for SocketAncillary in tokio to see how ancillary buffers are used
rg "SocketAncillary" --type rs -B 2 -A 2Repository: martin-augment/tokio
Length of output: 90
🏁 Script executed:
# Check if there's documentation about alignment in unix socket code
rg "align" src/net/unix/ --type rs -iRepository: martin-augment/tokio
Length of output: 90
🏁 Script executed:
# Read the cmsg.rs file
cat -n tokio/src/net/unix/cmsg.rsRepository: martin-augment/tokio
Length of output: 4401
🏁 Script executed:
# Search for where recvmsg_vectored is called
rg "recvmsg_vectored" --type rust -B 2 -A 2Repository: martin-augment/tokio
Length of output: 1954
🏁 Script executed:
# Look for SocketAncillary in tokio to see how ancillary buffers are used
rg "SocketAncillary" --type rust -B 2 -A 2Repository: martin-augment/tokio
Length of output: 14324
🏁 Script executed:
# Check if there's documentation about alignment in unix socket code
rg "align" tokio/src/net/unix/ --type rust -iRepository: martin-augment/tokio
Length of output: 46
🏁 Script executed:
# Look at the SocketAncillary implementation
cat -n tokio/src/net/unix/ancillary.rs | head -200Repository: martin-augment/tokio
Length of output: 7500
🏁 Script executed:
# Search for as_mut_buffer method
rg "as_mut_buffer" --type rust -B 2 -A 5Repository: martin-augment/tokio
Length of output: 3007
🏁 Script executed:
# Check if there's any alignment documentation in ancillary.rs
rg "align" tokio/src/net/unix/ancillary.rs --type rust -i -B 2 -A 2Repository: martin-augment/tokio
Length of output: 46
🏁 Script executed:
# Look for buffer_size_for_rights or similar buffer sizing functions
rg "buffer_size" --type rust -B 2 -A 5Repository: martin-augment/tokio
Length of output: 12918
🏁 Script executed:
# Look for SocketAncillary implementation in Rust std lib documentation or mentions
rg "msg_control" --type rust -B 3 -A 3Repository: martin-augment/tokio
Length of output: 4123
🏁 Script executed:
# Check the complete buffer_size_for_rights implementation
cat -n tokio/src/net/unix/ancillary.rs | grep -A 10 "buffer_size_for_rights"Repository: martin-augment/tokio
Length of output: 1086
🏁 Script executed:
# Look for any safety comments or docs about alignment in ancillary.rs
cat -n tokio/src/net/unix/ancillary.rs | tail -150Repository: martin-augment/tokio
Length of output: 6446
🏁 Script executed:
# Check if there are any comments about alignment in the codebase
grep -r "align\|CMSG" tokio/src/net/unix/ --include="*.rs"Repository: martin-augment/tokio
Length of output: 1804
🌐 Web query:
POSIX msg_control alignment requirement cmsghdr
💡 Result:
POSIX does not mandate a specific byte alignment value (e.g., 4 vs 8). Instead it specifies the requirement in terms of “whatever alignment is needed” for control messages, and provides macros that must be used to get it right.
What must be aligned
msghdr.msg_controlmust point to storage that is suitably aligned to contain astruct cmsghdr(i.e., the first control header must be correctly aligned forcmsghdr). A plainchar control[...]object is not guaranteed to have that alignment. [6]- Each ancillary object in the buffer may require padding so that:
- the next
cmsghdris correctly aligned, and - the data returned by
CMSG_DATA()is correctly aligned for direct access. This is why POSIX definesCMSG_SPACE()as including “any padding needed to satisfy alignment requirements”. [1][3]
- the next
The portable rule: use the CMSG_* macros
POSIX intends you to treat alignment as opaque and rely on:
CMSG_SPACE(payload_len)for buffer sizing (includes padding/alignment) [1]CMSG_LEN(payload_len)forcmsg->cmsg_len(accounts for required alignment) [2][3]CMSG_FIRSTHDR,CMSG_NXTHDR,CMSG_DATAto walk/access messages safely with correct alignment. [1][3]
Linux exposes CMSG_ALIGN() too, but it’s explicitly a Linux extension (not POSIX). [2]
Common implementation detail (don’t hardcode)
Many libcs implement the padding/alignment in terms of sizeof(size_t) (or equivalent). For example, glibc’s CMSG_SPACE/CMSG_LEN are defined using CMSG_ALIGN(...). [4]
Practical pattern to satisfy alignment
Use a union (or malloc, which is sufficiently aligned) so the buffer is aligned for cmsghdr:
union {
struct cmsghdr align;
unsigned char buf[CMSG_SPACE(payload_len)];
} control;
msg.msg_control = control.buf;
msg.msg_controllen = sizeof(control.buf);This “union for alignment” approach is widely recommended because it guarantees msg_control is aligned for cmsghdr. [5]
Sources: Open Group <sys/socket.h> (POSIX) on CMSG_SPACE padding/alignment [1]; Linux cmsg(3) portability notes (and CMSG_ALIGN being Linux-only) [2]; OpenBSD CMSG_* man page on meeting alignment constraints [3]; glibc header showing macro structure [4]; UNP note that msg_control must be aligned for cmsghdr and union technique [5]; documentation explicitly stating msg_control “must be aligned for a cmsghdr structure” [6].
References: [1] (pubs.opengroup.org) [2] (sources.debian.org) [3] (man.openbsd.org) [4] (sbexr.rabexc.org) [5] (litux.nl) [6] (digiater.nl)
Citations:
- 1: https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/basedefs/sys_socket.h.html?utm_source=openai
- 2: https://sources.debian.org/src/manpages/4.10-2~bpo8%2B1/man3/cmsg.3?utm_source=openai
- 3: https://man.openbsd.org/OpenBSD-6.0/CMSG_DATA.3?utm_source=openai
- 4: https://sbexr.rabexc.org/latest/sources/48/3f7aa72c865458.html?utm_source=openai
- 5: https://litux.nl/mirror/unixnetworkprogramming/0131411551_ch15lev1sec7.html?utm_source=openai
- 6: https://www.digiater.nl/openvms/doc/alpha-v8.3/83final/6529/6529pro_009.html?utm_source=openai
Document alignment requirement for ancillary buffer.
The ancillary_buf parameter in recvmsg_vectored is typed as &mut [u8], but msg_control must be aligned for struct cmsghdr. A plain byte array on the stack (e.g., [0u8; 128]) has no alignment guarantee beyond 1 byte and may violate POSIX alignment requirements, making it technically undefined behavior if misaligned.
While this works in practice (most stack frames are sufficiently aligned) and matches the pattern in std::os::unix::net::SocketAncillary (nightly), add documentation to SocketAncillary::new() noting the alignment requirement. Consider providing a helper type with #[repr(align(8))] or similar to give users a safe way to create properly-aligned buffers.
🤖 Prompt for AI Agents
In `@tokio/src/net/unix/cmsg.rs` around lines 40 - 68, The ancillary_buf passed
into recvmsg_vectored must be aligned for struct cmsghdr but is currently typed
as &mut [u8] without documentation or helper for proper alignment; update the
public API by adding a clear doc comment on SocketAncillary::new() stating that
the backing buffer must be aligned to the alignment of libc::cmsghdr (e.g., 8
bytes on common ABIs) and that using a plain [u8; N] may be misaligned, and add
a recommended safe helper type (e.g., a small #[repr(align(...))] wrapper) or
factory function in the docs so callers can construct properly-aligned ancillary
buffers; reference recvmsg_vectored, ancillary_buf, and SocketAncillary::new()
in the change so users and reviewers can find the impacted locations.
There was a problem hiding this comment.
value:good-to-have; category:bug; feedback: The CodeRabbit AI reviewer is correct! Using an attribute like #[repr(align(8))] would document the expected alignment of the msg_control struct and make the callers follow it or face a build issues.
value:good-to-have; category:bug; feedback: The CodeRabbit AI reviewer is correct! There is no need to pass the |
value:useful; category:security; feedback: The CodeRabbit AI reviewer is correct! A bad actor could send data with many unknown message types and this may lead to stack overflow error. By using an iterative loop this attack could be neutralised. |
value:useful; category:security; feedback: The Claude AI reviewer is correct! A bad actor could send data with many unknown message types and this may lead to stack overflow error. By using an iterative loop this attack could be neutralised. |
value:good-to-have; category:bug; feedback: The Claude AI reviewer is correct! It would be useful to check that the received data is not truncated also for the UDP test case, not just for the Unix socket one. |
7911: To review by AI