7793: implement io-uring for read_to_string.rs #40

Open
martin-augment wants to merge 1 commit into master from pr-7793-2025-12-29-10-14-25

@martin-augment
Owner

7793: To review by AI

@coderabbitai
coderabbitai bot commented Dec 29, 2025

Walkthrough

The change modifies the read_to_string implementation in tokio/src/fs/read_to_string.rs to introduce a conditional fast path for io-uring on Linux when specific tokio features are enabled. This path calls read_uring with a borrowed Path reference instead of always allocating a PathBuf upfront. For non-io-uring paths, a new private helper function, read_to_string_spawn_blocking, converts the borrowed path to an owned PathBuf and delegates to an async wrapper around std::fs::read_to_string. The path binding now borrows rather than takes ownership, so the io-uring path can run without an immediate PathBuf allocation.

@claude
claude bot commented Dec 29, 2025

Pull Request Review: Implement io-uring for read_to_string.rs

I've reviewed PR #40 and have the following feedback:

Positives

  1. Consistent with existing patterns: The implementation follows the same pattern used in fs::read() (tokio/src/fs/read.rs:56-77), which is excellent for maintainability.

  2. Proper feature gating: The io-uring code is correctly gated behind the appropriate compile-time flags (tokio_unstable, io-uring, rt, fs, and target_os = "linux").

  3. Graceful fallback: The code properly falls back to spawn_blocking when io-uring is unavailable, maintaining compatibility.

  4. Error handling: UTF-8 validation with proper error type (InvalidData) and message matches standard library behavior.

🔍 Issues and Concerns

1. Critical Bug: UTF-8 error details are discarded (tokio/src/fs/read_to_string.rs:44)

String::from_utf8(vec).map_err(|_| {
    io::Error::new(
        io::ErrorKind::InvalidData,
        "stream did not contain valid UTF-8",
    )
})

Problem: The original FromUtf8Error is discarded with |_|, losing valuable debugging information about where the invalid UTF-8 was found.

Recommendation: Preserve the error details:

String::from_utf8(vec).map_err(|e| {
    io::Error::new(
        io::ErrorKind::InvalidData,
        e.utf8_error(),
    )
})

This matches the pattern in tokio-util/src/codec/mod.rs:203-208 and provides better debugging information.
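
To make the difference concrete, here is a minimal std-only sketch of the "preserve the error" variant. The helper names `bytes_to_string` and `valid_up_to` are hypothetical and not part of the PR; the sketch only assumes that `io::Error::new` accepts a `Utf8Error` as its payload and that the payload can later be recovered with `get_ref` and `downcast_ref`.

```rust
use std::io;
use std::str::Utf8Error;

/// Hypothetical helper (not from the PR): converts bytes to a `String`,
/// keeping the underlying `Utf8Error` as the payload of the `io::Error`.
fn bytes_to_string(bytes: Vec<u8>) -> io::Result<String> {
    String::from_utf8(bytes)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e.utf8_error()))
}

/// Hypothetical helper: if the error carries a `Utf8Error`, returns the
/// byte offset up to which the input was valid UTF-8.
fn valid_up_to(err: &io::Error) -> Option<usize> {
    err.get_ref()
        .and_then(|inner| inner.downcast_ref::<Utf8Error>())
        .map(|utf8| utf8.valid_up_to())
}
```

With the `|_|` version, `get_ref` would only expose a static message, so a caller could not find out where the invalid bytes start.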

2. Ownership/Lifetime Issue (tokio/src/fs/read_to_string.rs:28, 54)

The current implementation has a subtle lifetime inconsistency:

  • Line 28: let path = path.as_ref(); (borrows as &Path)
  • Line 54: read_to_string_spawn_blocking(path).await (passes borrowed reference)
  • Line 58: let path = path.to_owned(); (converts to owned PathBuf)

Concern: While this works, it's slightly inefficient because:

  1. In the io-uring path, we pass &Path to read_uring(path) which immediately converts it to owned in OpenOptions::open(path)
  2. In the fallback path, we clone the path every time, even though the borrow is no longer needed after line 54

Recommendation: Consider cloning once at the start for consistency with the original implementation and fs::read():

pub async fn read_to_string(path: impl AsRef<Path>) -> io::Result<String> {
    let path = path.as_ref().to_owned();  // Clone once upfront
    
    #[cfg(all(...))]
    {
        use crate::fs::read_uring;
        
        let handle = crate::runtime::Handle::current();
        let driver_handle = handle.inner.driver().io();
        if driver_handle.check_and_init()? {
            return read_uring(&path).await.and_then(|vec| {
                String::from_utf8(vec).map_err(|e| {
                    io::Error::new(io::ErrorKind::InvalidData, e.utf8_error())
                })
            });
        }
    }
    
    asyncify(move || std::fs::read_to_string(path)).await
}

This eliminates the helper function and matches fs::read() more closely (see tokio/src/fs/read.rs:57).

Alternative: If you prefer the current borrowed approach for micro-optimization, document why this differs from fs::read().
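
For comparison, the borrowed approach in the PR can be illustrated with a small std-only sketch. It is not the tokio code: `try_fast_path` is a hypothetical stand-in for the io-uring branch, and `std::thread::spawn` stands in for `spawn_blocking`. The point is only that the `PathBuf` allocation can be deferred until a `'static` closure actually needs an owned path.

```rust
use std::io;
use std::path::Path;
use std::thread;

/// Hypothetical stand-in for a fast path that only needs a borrowed path.
/// Returns `None` when the fast path is unavailable.
fn try_fast_path(_path: &Path) -> Option<io::Result<String>> {
    None // assumption: the fast path is never available in this sketch
}

/// Reads a file to a string, allocating a `PathBuf` only on the fallback.
fn read_to_string_sketch(path: impl AsRef<Path>) -> io::Result<String> {
    let path = path.as_ref(); // borrow only; no allocation yet

    if let Some(result) = try_fast_path(path) {
        return result;
    }

    // A closure handed to another thread must be `'static`, so the path
    // has to be owned from this point on.
    let owned = path.to_owned();
    thread::spawn(move || std::fs::read_to_string(owned))
        .join()
        .expect("reader thread panicked")
}
```

Whether the saved allocation is measurable next to the cost of opening and reading a file is a separate question; the sketch only shows why the two orderings are behaviorally equivalent.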

3. Documentation needs update (tokio/src/fs/read_to_string.rs:11)

The documentation states:

"This operation is implemented by running the equivalent blocking operation on a separate thread pool using spawn_blocking."

This is now inaccurate when io-uring is enabled.

Recommendation: Add an io-uring section similar to fs::read() (see tokio/src/fs/read.rs:33-41):

/// # io_uring support
///
/// On Linux, you can also use io_uring for executing system calls. To enable
/// io_uring, you need to specify the `--cfg tokio_unstable` flag at compile time,
/// enable the io-uring cargo feature, and set the `Builder::enable_io_uring`
/// runtime option.
///
/// Support for io_uring is currently experimental, so its behavior may change
/// or it may be removed in future versions.

📊 Test Coverage

Missing: No specific tests for the io-uring path in read_to_string.

Observation: The existing tests in tokio/tests/io_read_to_string.rs exercise the AsyncReadExt::read_to_string trait method, not fs::read_to_string. The tests in tokio/tests/fs_write.rs:14 do call fs::read_to_string, but only on the happy path.

Recommendation:

  • Add UTF-8 validation tests for the fs::read_to_string function specifically
  • Consider adding io-uring-specific integration tests (though this may be tested indirectly through existing fs tests)
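
As a rough idea of the shape such a UTF-8 validation test might take, the sketch below uses `std::fs::read_to_string` as a stand-in, since that is what the fallback path delegates to. A real test would call `tokio::fs::read_to_string` inside a runtime instead. The helper names and temp-file prefixes are made up for this example.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Writes `bytes` to a temp file whose name is derived from `name` and the
/// process id (an arbitrary naming scheme chosen for this sketch).
fn write_temp(name: &str, bytes: &[u8]) -> io::Result<PathBuf> {
    let path = std::env::temp_dir().join(format!("{}_{}", name, std::process::id()));
    fs::write(&path, bytes)?;
    Ok(path)
}

/// Reads the bytes back as a string and returns the resulting `ErrorKind`,
/// or `None` if the read succeeded. The temp file is removed afterwards.
fn read_error_kind(name: &str, bytes: &[u8]) -> io::Result<Option<io::ErrorKind>> {
    let path = write_temp(name, bytes)?;
    let result = fs::read_to_string(&path);
    fs::remove_file(&path)?;
    Ok(result.err().map(|e| e.kind()))
}
```

A test built on this would assert `None` for valid UTF-8 content and `Some(io::ErrorKind::InvalidData)` for a file ending in a stray byte such as `0x80`, and ideally run the same assertions with and without the io-uring path enabled.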

🎯 Performance Considerations

The io-uring implementation should provide better performance for:

  • Large files (reduced context switching)
  • Multiple concurrent file operations (better parallelism)

No concerns with the implementation itself.

🔒 Security

No security issues identified. The UTF-8 validation properly prevents invalid data from being returned as a String.


Summary

This is a solid implementation that correctly adds io-uring support to read_to_string. The main issues are:

  1. Critical: Fix the UTF-8 error handling to preserve error details (line 44)
  2. Important: Update documentation to mention io-uring support
  3. Consider: Align with fs::read() pattern by cloning path upfront, or document the optimization choice

Great work on maintaining consistency with the existing fs::read() implementation! 🎉

🤖 Generated with Claude Code

@augmentcode
augmentcode bot commented Dec 29, 2025

🤖 Augment PR Summary

Summary: Adds an io-uring fast path for tokio::fs::read_to_string on Linux when unstable io-uring support is enabled.

Changes:

  • Uses runtime IO driver detection/initialization (check_and_init) to decide whether to use io-uring.
  • When available, reads the file via crate::fs::read_uring and converts the bytes to String.
  • Falls back to the existing blocking-threadpool implementation via asyncify/spawn_blocking.
  • Refactors the blocking path into a small helper (read_to_string_spawn_blocking).

Technical Notes: The io-uring path is gated behind tokio_unstable, io-uring, rt, fs, and target_os = "linux".
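
The combination of a compile-time gate, a runtime check, and a fallback can be sketched as follows. This is a simplified illustration, not the PR's code: the `cfg` predicate is reduced to `target_os = "linux"`, and `runtime_supports_fast_path` is a hypothetical probe standing in for `check_and_init`.

```rust
/// Hypothetical runtime probe; only compiled on Linux. In this sketch it
/// always reports that the fast path is unavailable.
#[cfg(target_os = "linux")]
fn runtime_supports_fast_path() -> bool {
    false
}

/// Reports which backend would be picked. On non-Linux targets the gated
/// block is compiled out entirely and only the fallback remains.
fn selected_backend() -> &'static str {
    #[cfg(target_os = "linux")]
    {
        if runtime_supports_fast_path() {
            return "fast-path";
        }
    }

    "blocking-fallback"
}
```

Because the early `return` lives inside the gated block, the fallback expression after it stays unconditional and needs no `cfg` of its own.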

@augmentcode augmentcode bot left a comment

Review completed. 2 suggestions posted.

-let path = path.as_ref().to_owned();
+let path = path.as_ref();

#[cfg(all(
The doc comment above says this always runs the blocking operation via spawn_blocking, but with this new io-uring fast path that’s no longer always true; consider updating the docs to reflect the conditional behavior.

Owner Author

value:good-to-have; category:documentation; feedback:The Augment AI reviewer is correct! The documentation needs to be updated to say that no new thread is spawned when Linux io_uring is enabled. This prevents releasing obsolete rustdoc for the read_to_string function.

let driver_handle = handle.inner.driver().io();
if driver_handle.check_and_init()? {
    return read_uring(path).await.and_then(|vec| {
        String::from_utf8(vec).map_err(|_| {
map_err(|_| ...) discards the FromUtf8Error, which makes this error less informative than the std::fs::read_to_string fallback (and may change error chaining). Consider preserving the underlying UTF-8 error as the IO error source.

Owner Author

value:useful; category:bug; feedback:The Augment AI reviewer is correct! The original error is ignored and a new error message is used. Utf8Error's Display impl is better because it distinguishes between invalid bytes and an incomplete UTF-8 sequence. Also, using "stream" here is not correct because there is no streaming at all. This prevents returning a misleading error message for a UTF-8 related problem.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
tokio/src/fs/read_to_string.rs (1)

10-11: Documentation now inaccurate with io-uring path.

The docstring states the operation is "implemented by running the equivalent blocking operation on a separate thread pool using spawn_blocking", but with the io-uring path enabled, this is no longer always true.

🔎 Suggested documentation update
-/// This operation is implemented by running the equivalent blocking operation
-/// on a separate thread pool using [`spawn_blocking`].
+/// By default, this operation is implemented by running the equivalent blocking
+/// operation on a separate thread pool using [`spawn_blocking`]. When the
+/// `io-uring` feature is enabled on Linux, this may use io-uring instead.
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between a68edc4 and c8508f1.

📒 Files selected for processing (1)
  • tokio/src/fs/read_to_string.rs
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: martin-augment
Repo: martin-augment/tokio PR: 10
File: tokio-util/src/io/mod.rs:17-17
Timestamp: 2025-10-30T14:46:01.443Z
Learning: In tokio-util, the entire `io` module at `tokio-util/src/lib.rs` is already gated with `cfg_io!` macro, which applies `#[cfg(feature = "io")]` to all contents. Submodules declared within `tokio-util/src/io/mod.rs` inherit this feature gate and don't need additional `cfg_io!` wrapping unless they have specific requirements. The `cfg_io_util!` macro in the same file is for items that specifically require the `io-util` feature, not the general `io` feature.
🧬 Code graph analysis (1)
tokio/src/fs/read_to_string.rs (2)
tokio/src/fs/read_uring.rs (1)
  • read_uring (19-37)
tokio/src/runtime/handle.rs (1)
  • current (144-148)
🔇 Additional comments (2)
tokio/src/fs/read_to_string.rs (2)

57-59: LGTM!

The helper function correctly converts the borrowed path to an owned PathBuf before passing to asyncify, maintaining the existing spawn_blocking behavior.


27-28: LGTM!

Borrowing the path upfront (path.as_ref()) is a clean optimization that defers the allocation until the spawn_blocking path needs it, while allowing io-uring to work with the borrowed reference directly.

Comment on lines +40 to +51
let handle = crate::runtime::Handle::current();
let driver_handle = handle.inner.driver().io();
if driver_handle.check_and_init()? {
    return read_uring(path).await.and_then(|vec| {
        String::from_utf8(vec).map_err(|_| {
            io::Error::new(
                io::ErrorKind::InvalidData,
                "stream did not contain valid UTF-8",
            )
        })
    });
}
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how check_and_init() is used elsewhere and what errors it can return
rg -n -C5 'check_and_init' --type rust

Repository: martin-augment/tokio

Length of output: 4919


🏁 Script executed:

#!/bin/bash
# Look for try_init implementation and understand error handling
rg -n -B2 -A15 'fn try_init' --type rust | head -80

Repository: martin-augment/tokio

Length of output: 2735


🏁 Script executed:

#!/bin/bash
# Check the error message in read_to_string.rs
sed -n '35,55p' tokio/src/fs/read_to_string.rs

Repository: martin-augment/tokio

Length of output: 682


Update error message terminology from "stream" to "file".

The error message at line 47 says "stream did not contain valid UTF-8", but this is file I/O — it should say "file did not contain valid UTF-8" for consistency and accuracy.

🤖 Prompt for AI Agents
In tokio/src/fs/read_to_string.rs around lines 40 to 51, the error message
returned on UTF-8 conversion uses the word "stream" but this is file I/O; update
the io::Error message text to say "file did not contain valid UTF-8" instead of
"stream did not contain valid UTF-8" so the wording is accurate and consistent.

Owner Author

value:useful; category:bug; feedback:The CodeRabbit AI reviewer is correct! The original error is ignored and a new error message is used. Utf8Error's Display impl is better because it distinguishes between invalid bytes and an incomplete UTF-8 sequence. Also, using "stream" here is not correct because there is no streaming at all. This prevents returning a misleading error message for a UTF-8 related problem.

@martin-augment
Owner Author

1. Critical Bug: UTF-8 error details are discarded (tokio/src/fs/read_to_string.rs:44)

String::from_utf8(vec).map_err(|_| {
    io::Error::new(
        io::ErrorKind::InvalidData,
        "stream did not contain valid UTF-8",
    )
})

Problem: The original FromUtf8Error is discarded with |_|, losing valuable debugging information about where the invalid UTF-8 was found.

value:useful; category:bug; feedback:The Claude AI reviewer is correct! The original error is ignored and a new error message is used. Utf8Error's Display impl is better because it distinguishes between invalid bytes and an incomplete UTF-8 sequence. Also, using "stream" here is not correct because there is no streaming at all. This prevents returning a misleading error message for a UTF-8 related problem.

@martin-augment
Owner Author

3. Documentation needs update (tokio/src/fs/read_to_string.rs:11)

The documentation states:

"This operation is implemented by running the equivalent blocking operation on a separate thread pool using spawn_blocking."

This is now inaccurate when io-uring is enabled.

value:good-to-have; category:documentation; feedback:The Claude AI reviewer is correct! The documentation needs to be updated to say that no new thread is spawned when Linux io_uring is enabled. This prevents releasing obsolete rustdoc for the read_to_string function.

@martin-augment
Owner Author

10-11: Documentation now inaccurate with io-uring path.

The docstring states the operation is "implemented by running the equivalent blocking operation on a separate thread pool using spawn_blocking", but with the io-uring path enabled, this is no longer always true.

value:good-to-have; category:documentation; feedback:The CodeRabbit AI reviewer is correct! The documentation needs to be updated to say that no new thread is spawned when Linux io_uring is enabled. This prevents releasing obsolete rustdoc for the read_to_string function.
