Skip to content

7834: runtime: avoid redundant unpark in current_thread scheduler#44

Open
martin-augment wants to merge 2 commits intomasterfrom
pr-7834-2026-01-04-20-21-36
Open

7834: runtime: avoid redundant unpark in current_thread scheduler#44
martin-augment wants to merge 2 commits intomasterfrom
pr-7834-2026-01-04-20-21-36

Conversation

@martin-augment
Copy link
Owner

7834: To review by AI

Checking if there is pending work in `park()` ensures that if a
`before_park` hook populates the deferred queue (e.g. via
`yield_now()`), the runtime skips the blocking sleep.

This also refactors the loop to use a common `has_pending_work` helper.
@coderabbitai
Copy link

coderabbitai bot commented Jan 4, 2026

Walkthrough

The changes modify Tokio's current-thread runtime scheduler to improve parking and waking behavior. A new internal helper method consolidates checks for pending work across the scheduler. The parking logic is updated to use this consolidated check, and the wake_on_fork mechanism is hardened to avoid redundant unpark operations when the current thread already holds the wake signal. Atomic Ordering imports are extended to support the new synchronization requirements. Test additions and modifications validate the updated scheduling behavior for cross-thread waking and before-park hooks.

✨ Finishing touches
  • 📝 Generate docstrings

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a80b4d3 and d2f6cfa.

📒 Files selected for processing (4)
  • tokio/src/runtime/scheduler/current_thread/mod.rs
  • tokio/tests/rt_basic.rs
  • tokio/tests/rt_common_before_park.rs
  • tokio/tests/rt_metrics.rs
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: martin-augment
Repo: martin-augment/tokio PR: 11
File: tokio/src/util/usdt/stapsdt_aarch64.rs:1-3
Timestamp: 2025-11-04T09:48:44.541Z
Learning: In USDT (User Statically-Defined Tracing) probe implementations in Rust, semaphores must be declared as `static mut` primitive types (e.g., `static mut __usdt_sema_tokio_task__details: u16`), not atomic types. This is required for ABI compatibility with external tracing tools like DTrace and SystemTap, which modify these semaphores directly. The inline assembly code references the raw address of these semaphores, and using Rust atomic types would break the USDT mechanism.
📚 Learning: 2025-11-04T09:48:44.541Z
Learnt from: martin-augment
Repo: martin-augment/tokio PR: 11
File: tokio/src/util/usdt/stapsdt_aarch64.rs:1-3
Timestamp: 2025-11-04T09:48:44.541Z
Learning: In USDT (User Statically-Defined Tracing) probe implementations in Rust, semaphores must be declared as `static mut` primitive types (e.g., `static mut __usdt_sema_tokio_task__details: u16`), not atomic types. This is required for ABI compatibility with external tracing tools like DTrace and SystemTap, which modify these semaphores directly. The inline assembly code references the raw address of these semaphores, and using Rust atomic types would break the USDT mechanism.

Applied to files:

  • tokio/tests/rt_metrics.rs
  • tokio/src/runtime/scheduler/current_thread/mod.rs
  • tokio/tests/rt_common_before_park.rs
🧬 Code graph analysis (3)
tokio/tests/rt_metrics.rs (1)
tokio/src/runtime/handle.rs (1)
  • metrics (478-480)
tokio/tests/rt_basic.rs (4)
tokio/src/util/wake.rs (1)
  • Arc (34-34)
tokio/src/runtime/scheduler/current_thread/mod.rs (2)
  • wake_by_ref (726-737)
  • new (128-182)
tokio/tests/sync_notify.rs (1)
  • wake_by_ref (194-194)
tokio/src/runtime/builder.rs (2)
  • new (270-330)
  • new_current_thread (246-254)
tokio/tests/rt_common_before_park.rs (4)
tokio/tests/tracing_sync.rs (1)
  • sync (183-183)
tokio/src/util/wake.rs (1)
  • Arc (34-34)
tokio/src/runtime/task/trace/mod.rs (1)
  • runtime (273-292)
tokio/src/runtime/scheduler/current_thread/mod.rs (2)
  • new (128-182)
  • spawn (463-486)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Cursor Bugbot
  • GitHub Check: claude-review
🔇 Additional comments (9)
tokio/src/runtime/scheduler/current_thread/mod.rs (4)

18-18: LGTM - Correct memory ordering for synchronization.

The addition of Acquire ordering is necessary for the new has_pending_work method and forms a proper synchronization pair with the Release ordering in wake_by_ref.


385-385: LGTM - Improved parking logic with comprehensive pending work check.

The new has_pending_work helper correctly consolidates all three sources of pending work: local tasks, deferred tasks (from yield_now()), and wake signals. This prevents parking when the before_park hook defers work or signals a wake, addressing the PR objective to skip parking when a hook defers the waker.

The use of Acquire ordering when loading woken properly synchronizes with the Release store in wake_by_ref (line 727).

Also applies to: 417-419


790-790: LGTM - Consistent use of consolidated pending work check.

The main scheduling loop now uses the same has_pending_work check as the park() method, ensuring consistent behavior when deciding between park_yield (non-blocking) and park (blocking).


727-736: LGTM - Correct optimization to eliminate redundant unparks.

The new implementation cleverly avoids redundant unpark operations:

  • swap(true, Release) atomically sets the wake flag and returns the previous value
  • Only unparks if not previously woken AND not on the same thread
  • The same-thread check using Arc::ptr_eq avoids unnecessary unpark syscalls since the thread will check has_pending_work before parking

The memory ordering is correct, with Release on the swap synchronizing with Acquire loads in has_pending_work.

tokio/tests/rt_metrics.rs (1)

155-155: LGTM - Parity check correctly reflects optimized parking behavior.

The change from a minimum count assertion to a parity check is appropriate given the redundant unpark optimization in the scheduler. While the exact count is now less predictable, parks and unparks should always be paired, resulting in an even count. This maintains test validity while accommodating the optimization.

tokio/tests/rt_basic.rs (1)

460-498: LGTM - Well-designed test for before_park yield behavior.

This test effectively validates that the runtime detects and handles deferred work (via yield_now) when a before_park hook executes, which is a key aspect of the PR objective. The use of a custom ArcWake implementation backed by Notify is the correct pattern for creating a waker outside the runtime's normal task context.

The manual polling of yield_now within on_thread_park exercises the exact scenario the optimization addresses: the hook creates pending work, and the runtime should detect this via has_pending_work and avoid parking.

tokio/tests/rt_common_before_park.rs (3)

10-38: LGTM - Validates wake detection before parking.

This test correctly verifies that when on_thread_park issues a wake signal (via notify_one()), the runtime's has_pending_work check detects it and prevents parking. The guard using AtomicBool ensures the hook runs only once, avoiding potential busy loops if something goes wrong.


40-70: LGTM - Validates spawn detection before parking.

This test confirms that spawning a task from on_thread_park is detected by the has_pending_work check. The comment at line 65 correctly notes this would deadlock if the runtime parked, making this an important regression test for the optimization.


72-92: LGTM - Validates cross-thread wake behavior.

This test verifies the cross-thread wake path in the optimized wake_by_ref implementation, ensuring that wakes from other threads properly unpark the runtime even with the conditional unpark logic.

Note: The 5ms sleep at line 81 introduces a timing dependency. While generally sufficient, consider if a more deterministic synchronization approach (e.g., signaling when parking starts) would improve test reliability on heavily loaded CI systems.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude
Copy link

claude bot commented Jan 4, 2026

Pull Request Review - PR #44

Summary

This PR addresses an important issue in the current_thread scheduler where redundant unpark operations were being performed. The changes introduce two key optimizations:

  1. Avoids redundant unparks: When a task wakes itself while already running on the current thread, it skips the unpark operation
  2. Handles before_park hooks properly: Ensures the runtime doesn't park if the before_park hook populates the deferred queue

Code Quality & Best Practices ✅

Strengths:

  • Well-structured refactoring: The introduction of has_pending_work() helper at mod.rs:417-419 improves code clarity and reduces duplication
  • Proper atomic operations: Correctly uses swap(true, Release) to check-and-set atomically, and load(Acquire) for reading woken state
  • Memory ordering: The use of Acquire ordering in has_pending_work() properly synchronizes with the Release in wake_by_ref()
  • Good encapsulation: The logic for checking pending work is now centralized in one function

Observations:

  • The Arc::ptr_eq check at mod.rs:731 is a clever optimization to detect self-wakes
  • The early return pattern with the swap operation prevents redundant unparks effectively

Correctness Analysis ✅

Runtime Logic:

  1. before_park interaction (lines 382-394): The change from core.tasks.is_empty() to !self.has_pending_work(&core) is correct and more comprehensive. It now checks:

    • Task queue (!core.tasks.is_empty())
    • Deferred queue (!self.defer.is_empty())
    • Woken flag (self.handle.shared.woken.load(Acquire))
  2. wake_by_ref optimization (lines 726-737): The logic correctly:

    • Uses atomic swap to check if already woken (prevents redundant work)
    • Only calls driver.unpark() when not waking the current thread
    • Handles cross-thread wakes by falling through to the unpark call

Potential Issues:

⚠️ Missing import in test file: In rt_basic.rs:456-498, the test uses Context and Pin but the imports at the top of the file need verification. Looking at the existing imports (lines 1-13), Context and Pin are already imported, so this is actually fine.

⚠️ Test clarity: The test before_park_yields() at rt_basic.rs:460 creates a waker manually using futures::task::ArcWake and polls yield_now() inside the park hook. This is creative but quite complex - it might be hard to maintain. However, it does test the exact scenario described in the commit message.

Performance Considerations ✅

Improvements:

  1. Reduced syscalls: By avoiding unpark when waking the current thread, this eliminates unnecessary system calls (e.g., eventfd writes on Linux)
  2. Better busy-loop prevention: The enhanced has_pending_work() check prevents parking when work is available, reducing latency
  3. Minimal overhead: The Arc::ptr_eq check is a simple pointer comparison, adding negligible cost

Memory Ordering:

  • The Acquire load in has_pending_work() properly pairs with the Release store in wake_by_ref()
  • No unnecessary synchronization - the ordering is exactly what's needed

Security Concerns ✅

No security issues identified. The changes:

  • Don't introduce data races (proper atomic operations)
  • Don't expose internal state inappropriately
  • Don't change the API surface

Test Coverage ⭐

Excellent Coverage:

The PR includes three comprehensive test files:

  1. rt_basic.rs: Tests before_park hook with yield_now() interaction

  2. rt_common_before_park.rs (new file):

    • before_park_wakes_block_on_task(): Tests notification during park
    • before_park_spawns_task(): Tests spawning in the hook
    • wake_from_other_thread_block_on(): Tests cross-thread wakes
  3. rt_metrics.rs: Updated assertion to check park/unpark count parity instead of minimum threshold

Test Quality:

  • Tests cover the key scenarios: self-wake, spawning in hooks, cross-thread wakes
  • Good use of AtomicBool to verify hooks actually ran
  • The metrics test change (line 155) from assert!(2 <= count) to assert_eq!(0, count % 2) is clever - it verifies park/unpark operations are balanced

Minor Suggestions:

  • Consider adding a test that verifies the unpark is actually skipped (perhaps using metrics)
  • The sleep in wake_from_other_thread_block_on() (line 81) is a timing assumption - consider if there's a more deterministic approach

Additional Notes

Import Addition:

The addition of Acquire to the imports at mod.rs:15 is necessary and correct.

Metrics Test Change:

The change in rt_metrics.rs:155 from assert!(2 <= metrics.worker_park_unpark_count(0)) to assert_eq!(0, metrics.worker_park_unpark_count(0) % 2) is actually a better assertion because:

  • It verifies park/unpark operations are balanced (always come in pairs)
  • It's less brittle than checking a minimum threshold
  • With the optimization, the exact count might vary, but parity should always hold

Verdict: ✅ Approve

This is a solid optimization with:

  • Clear performance benefits (reduced syscalls)
  • Correct implementation (proper atomics and synchronization)
  • Excellent test coverage
  • Good code organization (the has_pending_work() helper)

The changes are well-thought-out and address a real inefficiency in the scheduler. The commit messages clearly explain the rationale. Great work!

Nits (non-blocking):

  • Consider documenting why Arc::ptr_eq is safe/correct in a comment (it's checking if we're waking ourselves)
  • The test before_park_yields() is complex - a doc comment explaining what it's testing would help future maintainers

Review generated by Claude Code

@augmentcode
Copy link

augmentcode bot commented Jan 4, 2026

🤖 Augment PR Summary

Summary: This PR reduces redundant wake-ups in the current_thread scheduler by coalescing wake notifications and avoiding unnecessary unpark() calls.

Changes:

  • Teach the scheduler to treat “pending work” as more than just non-empty local task queues (includes deferred wakers and the shared.woken flag).
  • Update parking logic to re-check this broader pending-work condition after before_park runs, preventing deadlocks when before_park wakes the block_on task.
  • Update Handle::wake_by_ref to only call driver.unpark() on the first transition to woken=true, and to skip unparking when waking from the same current-thread runtime.
  • Add integration tests covering before_park waking block_on, spawning in before_park, and cross-thread wake behavior.
  • Relax/adjust runtime metrics assertions to reflect the reduced/uncoalesced unpark behavior.

Technical Notes: The new logic relies on the existing woken flag (cleared via reset_woken()) to ensure the block_on future is re-polled without forcing redundant thread unparks.

🤖 Was this summary useful? React with 👍 or 👎

Copy link

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 1 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

// This check will fail if `before_park` spawns a task for us to run
// instead of parking the thread
if core.tasks.is_empty() {
if !self.has_pending_work(&core) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above this if still mentions only before_park spawning a task, but has_pending_work now also covers deferred wakers and the shared.woken flag; consider updating it so it matches the new behavior.

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:good-to-have; category:documentation; feedback:The Augment AI reviewer is correct! The simple condition has been replaced with a call to a method that makes checks three conditions, so the comment above is no more valid and it needs to be updated. Prevents making the comment obsolete and incomplete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments