Conversation

@youknowone (Owner) commented Jan 1, 2026

Summary by CodeRabbit

  • New Features

    • Added comprehensive garbage collection API including collection control, threshold management, statistics, and object inspection capabilities.
    • Added thread exception hook for handling uncaught exceptions in threads.
  • Bug Fixes

    • Improved circular reference detection and resolution.
    • Enhanced reference counting and lifetime management.


youknowone and others added 21 commits January 1, 2026 23:34
- Fix clippy: use is_multiple_of() in ebr/internal.rs
- Fix unsafe block requirement in ebr/sync/list.rs
- Add Send + Sync impl for GcState to allow static storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

coderabbitai bot commented Jan 1, 2026

📝 Walkthrough

Walkthrough

Implements a complete Epoch-Based Reclamation (EBR) garbage collection system alongside a CPython-compatible generational GC. Introduces lock-free synchronization primitives, redesigns reference counting to use 64-bit state machines, adds GC lifecycle hooks to PyObject, and integrates coarse-grained pinning into thread execution and interpreter finalization for cycle detection and safe deferred cleanup.

Changes

Cohort / File(s) Summary
Dependency Additions
Cargo.toml, crates/common/Cargo.toml
Added memoffset and atomic dependencies for bit manipulation and atomic EBR operations
EBR Core Module
crates/common/src/ebr/collector.rs, crates/common/src/ebr/default.rs, crates/common/src/ebr/epoch.rs, crates/common/src/ebr/guard.rs
New epoch-based reclamation types: Collector, LocalHandle, Guard, Epoch with pinning and unpinning semantics; thread-local cs() critical section entry; global epoch tracking and advancement
EBR Infrastructure
crates/common/src/ebr/deferred.rs, crates/common/src/ebr/internal.rs, crates/common/src/ebr/pointers.rs
Closure wrapper Deferred for inline/heap storage; Global and Local GC state with thread-local bags and atomic queue; tagged pointer abstraction Tagged<T>, RawAtomic, and RawShared for safe EBR pointer operations
Lock-Free Synchronization
crates/common/src/ebr/sync/list.rs, crates/common/src/ebr/sync/once_lock.rs, crates/common/src/ebr/sync/queue.rs
Intrusive linked list with lock-free traversal; Michael-Scott queue; lazy-initialization primitive; all coordinated with EBR Guard
Module Integration
crates/common/src/ebr/mod.rs, crates/common/src/lib.rs
New ebr module with re-exports of public API (default, epoch, guard, pointers)
Reference Counting Redesign
crates/common/src/refcount.rs
Replaced single-counter design with 64-bit atomic state machine; added State, PyState, Modular types; exposed deferred-drop API via EBR integration; thread-local DEFERRED_QUEUE for deferred cleanup (see the packed-counter sketch after this table)
VM Garbage Collection
crates/vm/src/gc_state.rs, crates/vm/src/lib.rs
New GcState singleton with 3 generations + permanent generation; per-generation tracking, thresholds, statistics; collection orchestration with resurrection and finalization handling; thread-safe via RwLock and PyMutex
PyObject GC Integration
crates/vm/src/object/core.rs, crates/vm/src/object/traverse.rs, crates/vm/src/object/traverse_object.rs
Added GC tracking flag, weak-ref lifecycle management during finalization, edge extraction via pop_edges hook; new try_pop_edges_obj, gc_get_referents, gc_clear_weakrefs, gc_pop_edges methods; HAS_POP_EDGES capability in Traverse trait
Builtin Types Traverse/Pop-Edges
crates/vm/src/builtins/dict.rs, crates/vm/src/builtins/list.rs, crates/vm/src/builtins/tuple.rs, crates/vm/src/builtins/function.rs, crates/vm/src/builtins/str.rs, crates/vm/src/builtins/super.rs
Added manual traverse + pop_edges to mutable containers; PyFunction clears cycles via name/qualname/closures; PySuper relaxes type coercion from PyTypeRef to PyObjectRef with runtime downcast
Stdlib Module GC
crates/stdlib/src/gc.rs, crates/stdlib/src/ssl.rs, crates/stdlib/src/fcntl.rs
Comprehensive gc module API: enable/disable, collect, thresholds, per-gen stats, object enumeration, callbacks, garbage/finalized lists; _SSLSocket marked with traverse; fcntl.ioctl widened request parameter to i64 with bit-pattern normalization
Derive Macro Updates
crates/derive-impl/src/pyclass.rs, crates/derive-impl/src/pystructseq.rs, crates/derive-impl/src/util.rs
Unified MaybeTraverse to support both traverse and pop_edges attributes; conditional Traverse derivation; added "pop_edges" to allowed class attribute names
Thread Lifecycle
crates/vm/src/stdlib/thread.rs, crates/vm/src/vm/thread.rs
New ExceptHookArgs type and _excepthook handler for uncaught exceptions; thread-local EBR_GUARD with ensure_pinned, reactivate_guard, drop_guard for coarse-grained epoch control
Interpreter Finalization
crates/vm/src/vm/interpreter.rs, crates/vm/src/vm/mod.rs, crates/vm/src/vm/context.rs
Added finalize_modules() to clear module references; two GC passes during shutdown; finalizing flag; threading._shutdown() integration; gc_callbacks and gc_garbage fields in Context; frame-boundary guard reactivation for epoch advancement
Dictionary & Warning Updates
crates/vm/src/dict_inner.rs, crates/vm/src/warn.rs
New pop_all_entries() method for dict cycle breaking; improved fallback frame/globals resolution for warnings during shutdown
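
The packed-counter sketch referenced in the refcount row above: a minimal, self-contained illustration of folding a strong count and a flag into a single 64-bit atomic. The layout (strong count in the low 32 bits, a leak flag at bit 32) and all names are assumptions for illustration only; the PR's actual State/PyState layout may differ.

use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical layout: low 32 bits = strong count, bit 32 = "leaked" flag.
const STRONG_MASK: u64 = 0xFFFF_FFFF;
const LEAKED: u64 = 1 << 32;

struct PackedRefCount(AtomicU64);

impl PackedRefCount {
    fn new() -> Self {
        Self(AtomicU64::new(1)) // one strong reference at creation
    }
    fn inc(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }
    /// Returns true when the last strong reference was dropped.
    fn dec(&self) -> bool {
        let old = self.0.fetch_sub(1, Ordering::Release);
        (old & LEAKED) == 0 && (old & STRONG_MASK) == 1
    }
    /// Marks the object as leaked (e.g. interned): dec() then never reports 0.
    fn leak(&self) {
        self.0.fetch_or(LEAKED, Ordering::Relaxed);
    }
}

fn main() {
    let rc = PackedRefCount::new();
    rc.inc();
    assert!(!rc.dec());
    assert!(rc.dec()); // last strong reference gone

    let interned = PackedRefCount::new();
    interned.leak();
    assert!(!interned.dec()); // leaked objects never report "last reference"
}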

Sequence Diagram(s)

sequenceDiagram
    participant Thread as Thread (cs entry)
    participant Guard as EBR Guard
    participant Local as Local Handle
    participant Global as Global Epoch
    participant Queue as Deferred Queue
    
    Thread->>Guard: pin() → Guard
    Guard->>Local: acquire_handle()
    Local->>Local: advance local epoch
    rect rgb(220, 240, 255)
        Thread->>Guard: defer_unchecked(closure)
        Guard->>Local: defer(Deferred)
        Local->>Local: push to thread-local bag
    end
    Thread->>Guard: (critical section)
    Guard->>Local: unpin()
    Local->>Local: push_to_global()
    rect rgb(200, 220, 200)
        Local->>Queue: seal bag with epoch
        Global->>Queue: try_advance(current_epoch)
        Global->>Queue: collect() - scan sealed bags
        Queue->>Queue: execute deferred closures
    end
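
As a complement to the diagram, here is a toy, single-threaded model of the seal-and-collect flow. It is a sketch of the technique only, not the PR's lock-free implementation; the ≥ 3 epoch-distance rule matches the expiration check discussed for internal.rs later in this review.

use std::collections::VecDeque;

// Toy model: closures are sealed with the epoch at which they were deferred
// and only run once the global epoch has moved far enough ahead that no
// pinned thread could still be using the retired memory.
struct ToyCollector {
    global_epoch: usize,
    sealed: VecDeque<(usize, Box<dyn FnOnce()>)>,
}

impl ToyCollector {
    fn new() -> Self {
        Self { global_epoch: 0, sealed: VecDeque::new() }
    }
    fn defer(&mut self, f: impl FnOnce() + 'static) {
        self.sealed.push_back((self.global_epoch, Box::new(f)));
    }
    fn advance(&mut self) {
        self.global_epoch = self.global_epoch.wrapping_add(1);
    }
    fn collect(&mut self) {
        // A sealed bag expires once the global epoch is >= 3 ahead of it.
        while let Some((epoch, _)) = self.sealed.front() {
            if self.global_epoch.wrapping_sub(*epoch) < 3 {
                break;
            }
            let (_, f) = self.sealed.pop_front().unwrap();
            f(); // run the deferred destructor
        }
    }
}

fn main() {
    let mut c = ToyCollector::new();
    c.defer(|| println!("retired object freed"));
    c.collect(); // distance 0: nothing runs yet
    for _ in 0..3 {
        c.advance();
    }
    c.collect(); // distance 3: the deferred closure runs
}
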
sequenceDiagram
    participant VM as VM (collect)
    participant GC as GcState
    participant Obj as PyObject
    participant Heap as Heap
    
    VM->>GC: collect(generation)
    GC->>GC: gather tracked objects
    GC->>Obj: traverse() for gc_refs
    Obj->>Obj: count strong refs
    GC->>GC: subtract internal refs
    rect rgb(255, 240, 200)
        GC->>Obj: is_finalized?
        Obj->>Obj: invoke __del__ if not
        Obj->>Obj: mark_finalized()
    end
    rect rgb(255, 220, 220)
        GC->>Obj: gc_pop_edges()
        Obj->>GC: break cycles
        GC->>GC: clear weak refs
    end
    GC->>GC: promote survivors
    GC->>Heap: dealloc unreachable
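
The "subtract internal refs" step in this diagram is the classic CPython cycle-detection technique; the following self-contained model (a toy graph, not the PR's GcState) shows the whole pass:

use std::collections::{HashSet, VecDeque};

// Toy model: node i has `strong[i]` strong references in total and
// `edges[i]` outgoing references to other candidate nodes.
fn find_unreachable(strong: &[usize], edges: &[Vec<usize>]) -> HashSet<usize> {
    // 1. Copy strong counts into gc_refs, then subtract every internal edge.
    let mut gc_refs: Vec<isize> = strong.iter().map(|&c| c as isize).collect();
    for targets in edges {
        for &t in targets {
            gc_refs[t] -= 1;
        }
    }
    // 2. Nodes with gc_refs > 0 are held from outside the candidate set;
    //    everything reachable from them survives.
    let mut reachable = HashSet::new();
    let mut queue: VecDeque<usize> =
        (0..strong.len()).filter(|&i| gc_refs[i] > 0).collect();
    while let Some(i) = queue.pop_front() {
        if reachable.insert(i) {
            queue.extend(edges[i].iter().copied());
        }
    }
    // 3. Whatever remains is cyclic garbage with no external owners.
    (0..strong.len()).filter(|i| !reachable.contains(i)).collect()
}

fn main() {
    // 0 <-> 1 is an isolated cycle; 2 has an external root and points at 3.
    let strong = [1, 1, 2, 1];
    let edges = vec![vec![1], vec![0], vec![3], vec![]];
    assert_eq!(find_unreachable(&strong, &edges), HashSet::from([0, 1]));
}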

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes


Poem

🐰 A rabbit hops through epochs vast,
Where guards now pin to hold them fast,
With deferred drops in queues they flow,
And cycles break where pointers go—
Lock-free and free from mem'ry's thrall! 🌿

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Title check (⚠️ Warning): The title 'gc' is vague and does not clearly convey the scope of changes. While the PR implements garbage collection infrastructure, the single-word title lacks specificity about what was actually added. Resolution: use a more descriptive title such as 'Implement epoch-based garbage collection (EBR) with generational support' or 'Add EBR garbage collector and GC infrastructure' to clearly summarize the main changes.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): Check skipped because CodeRabbit's high-level summary is enabled.
  • Docstring Coverage (✅ Passed): Docstring coverage is 100.00%, which meets the required threshold of 80.00%.


coderabbitai bot left a comment


Actionable comments posted: 10

🧹 Nitpick comments (16)
crates/vm/src/dict_inner.rs (1)

712-721: pop_all_entries preserves Dict invariants; consider reusing clear()'s semantics

The method correctly:

  • Obtains a mutable handle via get_mut() (no lock, requires &mut self).
  • Resets used/filled and marks all indices FREE before draining entries.
  • Returns an iterator of (PyObjectRef, T) for GC to consume, leaving the dict logically empty.

One behavioral difference vs clear() is that indices capacity is preserved instead of being reset to length 8. That’s probably fine for GC-focused use, but if this gets used on very large dicts in hot paths you might want to either reuse clear()’s resizing or document this capacity retention explicitly.
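
For reference, std's HashMap shows the same drain-keeps-capacity behavior in isolation:

use std::collections::HashMap;

fn main() {
    let mut m: HashMap<i32, i32> = (0..1000).map(|i| (i, i)).collect();
    let cap_before = m.capacity();

    // Draining empties the map but keeps the allocated table for reuse...
    let _entries: Vec<_> = m.drain().collect();
    assert!(m.is_empty());
    assert!(m.capacity() >= cap_before);

    // ...whereas replacing the map resets capacity entirely.
    m = HashMap::new();
    assert_eq!(m.capacity(), 0);
}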

crates/common/src/ebr/sync/once_lock.rs (1)

10-17: Consider using Once::is_completed() instead of a separate AtomicBool.

The comment on line 12 mentions Once::is_completed requires Rust 1.43, but the workspace MSRV is 1.89.0 (per root Cargo.toml line 134). Using Once::is_completed() would eliminate the is_initialized field and the associated synchronization overhead.

🔎 Proposed simplification
 pub(crate) struct OnceLock<T> {
     once: Once,
-    // Once::is_completed requires Rust 1.43, so use this to track of whether they have been initialized.
-    is_initialized: AtomicBool,
     value: UnsafeCell<MaybeUninit<T>>,
-    // Unlike std::sync::OnceLock, we don't need PhantomData here because
-    // we don't use #[may_dangle].
 }

Then replace self.is_initialized() calls with self.once.is_completed().
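
For reference, the std API in question behaves like this:

use std::sync::Once;

fn main() {
    static INIT: Once = Once::new();
    assert!(!INIT.is_completed());
    INIT.call_once(|| { /* one-time initialization */ });
    assert!(INIT.is_completed());
}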

crates/vm/src/object/core.rs (1)

298-305: Subtle double-decrement prevention logic - add clarifying comment.

The interaction between already_cleared and decrement_ref_count prevents double decrements when both GC and drop_slow_inner clear weakrefs. However, the logic is subtle and could benefit from a more explicit comment explaining the scenario.

             // Only decrement ref_count if requested AND not already cleared
+            // This prevents double-decrement when:
+            // 1. GC calls clear_for_gc (decrement_ref_count=false)
+            // 2. Later, drop_slow_inner calls clear (decrement_ref_count=true)
+            //    but obj is already None, so we skip the decrement
             if decrement_ref_count && !already_cleared {
crates/stdlib/src/gc.rs (1)

261-266: Silent error ignoring in callback invocation.

Errors from set_item and callback.call are silently discarded. While CPython also handles callback errors gracefully, completely ignoring them could make debugging difficult.

Consider logging errors or using run_unraisable for callback failures, similar to how __del__ errors are handled in core.rs.

-        let _ = info.set_item("generation", vm.ctx.new_int(generation).into(), vm);
-        let _ = info.set_item("collected", vm.ctx.new_int(collected).into(), vm);
-        let _ = info.set_item("uncollectable", vm.ctx.new_int(uncollectable).into(), vm);
+        // These set_item calls should not fail for simple int values
+        info.set_item("generation", vm.ctx.new_int(generation).into(), vm).ok();
+        info.set_item("collected", vm.ctx.new_int(collected).into(), vm).ok();
+        info.set_item("uncollectable", vm.ctx.new_int(uncollectable).into(), vm).ok();

         for callback in callbacks {
-            let _ = callback.call((phase_str.clone(), info.clone()), vm);
+            if let Err(e) = callback.call((phase_str.clone(), info.clone()), vm) {
+                // Handle callback errors gracefully, similar to __del__ errors
+                vm.run_unraisable(e, Some("gc callback".to_owned()), callback);
+            }
         }
Cargo.toml (1)

208-209: Both dependencies are necessary for the EBR implementation.

The memoffset crate is actively used in crates/common/src/ebr/internal.rs (lines 45, 575, 582). While std::mem::offset_of! is available (MSRV 1.89.0 exceeds its 1.77 stabilization), consolidating on the standard library version could be a future refactoring. The atomic crate is essential—it provides Atomic<T> for arbitrary types, which std::sync::atomic doesn't offer (only concrete types like AtomicUsize).

crates/common/src/ebr/mod.rs (1)

23-23: Verify if internal should be publicly exposed.

The internal module is declared as pub mod internal, which exposes implementation details. This might be intentional for testing or advanced use cases within the rustpython workspace, but it's worth confirming that this is the desired visibility rather than pub(crate).

🔎 Consider restricting visibility if not needed externally

If the internal module is only used within this crate:

-pub mod internal;
+pub(crate) mod internal;

If it needs to be accessible to other rustpython crates but not external users, consider using pub(crate) or visibility modifiers like pub(in crate::ebr).

crates/vm/src/vm/mod.rs (1)

981-1052: Consider making the stdlib module list more maintainable.

The hardcoded list of stdlib module names (lines 1006-1029) and the path-based detection (pylib, Lib) are fragile:

  1. The stdlib list may become stale as modules are added/removed
  2. Path detection may fail on non-standard installations or alternative stdlib locations

Consider one of these alternatives:

  • Use a centralized constant or configuration for stdlib module names
  • Mark stdlib modules with a metadata flag during import
  • Check if the module originated from frozen modules or builtin inits

That said, this is a reasonable first implementation for breaking cycles during shutdown.

crates/vm/src/builtins/function.rs (2)

53-57: Consider traversing name and qualname fields.

The traverse method visits type_params, annotations, module, and doc, but not name and qualname. Since these can be PyStr subclasses (as noted in pop_edges), they may also participate in reference cycles and should be traversed for consistency.

🔎 Proposed addition to traverse
         self.type_params.lock().traverse(tracer_fn);
         self.annotations.lock().traverse(tracer_fn);
         self.module.lock().traverse(tracer_fn);
         self.doc.lock().traverse(tracer_fn);
+        self.name.lock().traverse(tracer_fn);
+        self.qualname.lock().traverse(tracer_fn);
     }

67-74: try_lock failure silently skips clearing defaults.

If try_lock() fails (returns None), the defaults and kwdefaults are not cleared. While this is likely intentional for GC safety (avoiding deadlock), it could lead to reference cycles not being broken if the lock is contended during collection. Consider adding a debug assertion or log to track this case.

crates/common/src/ebr/epoch.rs (1)

33-38: Verify wrapping_sub correctness for edge cases.

The optimization in wrapping_sub relies on the LSB difference being discarded by the right shift. The comment explains this, but consider adding a unit test to verify behavior at epoch boundaries (e.g., near usize::MAX).
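
A sketch of such a test, written against a model of the documented layout (LSB = pin flag, epoch value in the remaining bits); the real Epoch internals may differ:

// Model of the documented layout; not the actual Epoch type.
#[derive(Clone, Copy)]
struct ModelEpoch {
    data: usize, // LSB = pinned flag, epoch value in bits 1..
}

impl ModelEpoch {
    fn wrapping_sub(self, rhs: ModelEpoch) -> usize {
        // Any LSB (pin flag) difference lands in bit 0 and is discarded
        // by the shift, so it cannot perturb the epoch distance.
        self.data.wrapping_sub(rhs.data & !1) >> 1
    }
    fn successor(self) -> ModelEpoch {
        ModelEpoch { data: self.data.wrapping_add(2) }
    }
}

fn main() {
    // Boundary check: the highest even value, then step across usize::MAX.
    let near_max = ModelEpoch { data: usize::MAX & !1 };
    let after = near_max.successor().successor(); // wraps around zero
    assert_eq!(after.wrapping_sub(near_max), 2);
    // The pin flag on either operand does not change the distance.
    let pinned = ModelEpoch { data: near_max.data | 1 };
    assert_eq!(after.wrapping_sub(pinned), 2);
}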

crates/vm/src/stdlib/thread.rs (2)

445-469: Constructor uses indexing after length validation - safe but consider bounds-checked access.

The constructor validates seq.len() == 4 before indexing, so this is safe. However, using get() with pattern matching would be more idiomatic Rust and defensive against future refactoring.

🔎 Optional improvement using pattern matching
-            if seq.len() != 4 {
-                return Err(vm.new_type_error(format!(
-                    "_ExceptHookArgs expected 4 arguments, got {}",
-                    seq.len()
-                )));
-            }
-            Ok(Self {
-                exc_type: seq[0].clone(),
-                exc_value: seq[1].clone(),
-                exc_traceback: seq[2].clone(),
-                thread: seq[3].clone(),
-            })
+            match <[_; 4]>::try_from(seq) {
+                Ok([exc_type, exc_value, exc_traceback, thread]) => Ok(Self {
+                    exc_type,
+                    exc_value,
+                    exc_traceback,
+                    thread,
+                }),
+                Err(seq) => Err(vm.new_type_error(format!(
+                    "_ExceptHookArgs expected 4 arguments, got {}",
+                    seq.len()
+                ))),
+            }

499-522: Deep nesting in stderr fallback logic reduces readability.

The nested if let / if / else chain for getting stderr is hard to follow. Consider extracting to a helper function or using early returns.

🔎 Suggested refactor using a helper closure
+        // Helper to get stderr, with fallback to thread._stderr
+        let get_stderr = || -> Option<crate::PyObjectRef> {
+            if let Ok(stderr) = vm.sys_module.get_attr("stderr", vm) {
+                if !vm.is_none(&stderr) {
+                    return Some(stderr);
+                }
+            }
+            let thread = thread.as_ref()?;
+            if vm.is_none(thread) {
+                return None;
+            }
+            let thread_stderr = thread.get_attr("_stderr", vm).ok()?;
+            if vm.is_none(&thread_stderr) {
+                return None;
+            }
+            Some(thread_stderr)
+        };
+
-        // Get stderr - fall back to thread._stderr if sys.stderr is None
-        let stderr = match vm.sys_module.get_attr("stderr", vm) {
-            Ok(stderr) if !vm.is_none(&stderr) => stderr,
-            _ => {
-                // ... deeply nested logic ...
-            }
-        };
+        let stderr = match get_stderr() {
+            Some(s) => s,
+            None => return Ok(()),
+        };
crates/common/src/ebr/internal.rs (1)

55-58: Static mutable variables are unsafe and difficult to use correctly.

MAX_OBJECTS and MANUAL_EVENTS_BETWEEN_COLLECT are static mut, requiring unsafe blocks for every access. Consider using AtomicUsize or once_cell::sync::Lazy for safe interior mutability, especially if these might be configurable at runtime.

🔎 Suggested safer alternative
-/// Maximum number of objects a bag can contain.
-static mut MAX_OBJECTS: usize = 64;
-
-#[allow(dead_code)]
-static mut MANUAL_EVENTS_BETWEEN_COLLECT: usize = 64;
+use std::sync::atomic::AtomicUsize;
+
+/// Maximum number of objects a bag can contain.
+static MAX_OBJECTS: AtomicUsize = AtomicUsize::new(64);
+
+#[allow(dead_code)]
+static MANUAL_EVENTS_BETWEEN_COLLECT: AtomicUsize = AtomicUsize::new(64);

Then update usages:

-Bag(Vec::with_capacity(unsafe { MAX_OBJECTS }))
+Bag(Vec::with_capacity(MAX_OBJECTS.load(Ordering::Relaxed)))
crates/vm/src/gc_state.rs (2)

406-418: Holding read locks during entire collection may cause contention.

Lines 406-408 acquire read locks on all generation sets and hold them until line 513/523. During this time, track_object and untrack_object from other threads will block on write locks.

For a GC that may take significant time, this could cause allocation pauses. Consider copying the object set and releasing locks earlier.

🔎 Alternative: copy and release locks early
-        let gen_locks: Vec<_> = (0..=generation)
-            .filter_map(|i| self.generation_objects[i].read().ok())
-            .collect();
-
         let mut collecting: HashSet<GcObjectPtr> = HashSet::new();
-        for gen_set in &gen_locks {
-            for &ptr in gen_set.iter() {
+        for i in 0..=generation {
+            if let Ok(gen_set) = self.generation_objects[i].read() {
+                for &ptr in gen_set.iter() {
+                    let obj = unsafe { ptr.0.as_ref() };
+                    if obj.strong_count() > 0 {
+                        collecting.insert(ptr);
+                    }
+                }
+            } // Lock released here
+        }

This trades off snapshot consistency for reduced lock contention.


690-712: promote_survivors has O(n*m) complexity - can be optimized.

For each survivor, the code iterates through all generations 0..=from_gen to find which one contains it (lines 700-710). Since objects should be in exactly one generation, this is inefficient.

A more efficient approach would track which generation each object is in, or iterate the generation sets instead of survivors.

🔎 Sketch of more efficient approach
fn promote_survivors(&self, from_gen: usize, survivors: &HashSet<GcObjectPtr>) {
    if from_gen >= 2 {
        return;
    }
    let next_gen = from_gen + 1;
    
    // Process each generation once instead of per-survivor
    for gen_idx in 0..=from_gen {
        if let Ok(mut gen_set) = self.generation_objects[gen_idx].write() {
            let to_promote: Vec<_> = gen_set.intersection(survivors).copied().collect();
            for ptr in to_promote {
                gen_set.remove(&ptr);
                if let Ok(mut next_set) = self.generation_objects[next_gen].write() {
                    next_set.insert(ptr);
                }
            }
        }
    }
}
crates/common/src/ebr/collector.rs (1)

70-96: LocalHandle safely stores a raw pointer via handle_count tracking—add safety documentation to explain the invariants.

The unsafe code is sound: Local is heap-allocated via RawShared::from_owned() and lives longer than any LocalHandle due to the handle_count mechanism (initialized to 1, incremented/decremented on handle creation/drop, and only finalized when both handle_count and guard_count reach 0). The design correctly follows the epoch-based reclamation pattern.

Add a safety comment to LocalHandle documenting these invariants so maintainers understand why the unsafe code is safe.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b19312b and 9bb9097.

⛔ Files ignored due to path filters (11)
  • Cargo.lock is excluded by !**/*.lock
  • Lib/test/support/__init__.py is excluded by !Lib/**
  • Lib/test/test_builtin.py is excluded by !Lib/**
  • Lib/test/test_dict.py is excluded by !Lib/**
  • Lib/test/test_gc.py is excluded by !Lib/**
  • Lib/test/test_generators.py is excluded by !Lib/**
  • Lib/test/test_subprocess.py is excluded by !Lib/**
  • Lib/test/test_super.py is excluded by !Lib/**
  • Lib/test/test_threading.py is excluded by !Lib/**
  • Lib/test/test_weakref.py is excluded by !Lib/**
  • Lib/test/test_with.py is excluded by !Lib/**
📒 Files selected for processing (40)
  • Cargo.toml
  • crates/common/Cargo.toml
  • crates/common/src/ebr/collector.rs
  • crates/common/src/ebr/default.rs
  • crates/common/src/ebr/deferred.rs
  • crates/common/src/ebr/epoch.rs
  • crates/common/src/ebr/guard.rs
  • crates/common/src/ebr/internal.rs
  • crates/common/src/ebr/mod.rs
  • crates/common/src/ebr/pointers.rs
  • crates/common/src/ebr/sync/list.rs
  • crates/common/src/ebr/sync/mod.rs
  • crates/common/src/ebr/sync/once_lock.rs
  • crates/common/src/ebr/sync/queue.rs
  • crates/common/src/lib.rs
  • crates/common/src/refcount.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/derive-impl/src/pystructseq.rs
  • crates/derive-impl/src/util.rs
  • crates/stdlib/src/fcntl.rs
  • crates/stdlib/src/gc.rs
  • crates/stdlib/src/ssl.rs
  • crates/vm/src/builtins/dict.rs
  • crates/vm/src/builtins/function.rs
  • crates/vm/src/builtins/list.rs
  • crates/vm/src/builtins/str.rs
  • crates/vm/src/builtins/super.rs
  • crates/vm/src/builtins/tuple.rs
  • crates/vm/src/dict_inner.rs
  • crates/vm/src/gc_state.rs
  • crates/vm/src/lib.rs
  • crates/vm/src/object/core.rs
  • crates/vm/src/object/traverse.rs
  • crates/vm/src/object/traverse_object.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/context.rs
  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
  • crates/vm/src/warn.rs
🧰 Additional context used
🧬 Code graph analysis (24)
crates/vm/src/vm/context.rs (2)
crates/vm/src/object/core.rs (2)
  • new_ref (1290-1307)
  • default (342-344)
crates/vm/src/builtins/list.rs (1)
  • new_ref (91-93)
crates/vm/src/lib.rs (1)
crates/vm/src/gc_state.rs (1)
  • gc_state (742-744)
crates/vm/src/builtins/dict.rs (2)
crates/vm/src/builtins/list.rs (2)
  • traverse (62-64)
  • pop_edges (66-73)
crates/vm/src/object/traverse.rs (9)
  • traverse (34-34)
  • traverse (43-45)
  • traverse (49-51)
  • traverse (55-55)
  • traverse (60-64)
  • traverse (72-76)
  • traverse (84-88)
  • traverse (96-100)
  • pop_edges (39-39)
crates/common/src/ebr/default.rs (4)
crates/common/src/ebr/collector.rs (6)
  • new (30-32)
  • global_epoch (41-43)
  • drop (91-95)
  • drop (248-250)
  • drop (313-315)
  • drop (393-395)
crates/common/src/ebr/epoch.rs (1)
  • new (90-93)
crates/common/src/ebr/sync/once_lock.rs (2)
  • new (25-31)
  • drop (98-103)
crates/common/src/ebr/guard.rs (1)
  • drop (150-154)
crates/vm/src/warn.rs (6)
crates/vm/src/vm/mod.rs (1)
  • modules (990-990)
crates/vm/src/builtins/type.rs (1)
  • __dict__ (963-965)
crates/vm/src/stdlib/builtins.rs (1)
  • globals (421-423)
crates/vm/src/object/core.rs (1)
  • dict (732-734)
crates/vm/src/builtins/dict.rs (1)
  • dict (782-782)
crates/vm/src/frame.rs (1)
  • dict (801-801)
crates/derive-impl/src/pystructseq.rs (2)
crates/vm/src/builtins/str.rs (2)
  • try_traverse (1932-1934)
  • try_pop_edges (1936-1938)
crates/vm/src/object/traverse.rs (2)
  • try_traverse (19-19)
  • try_pop_edges (21-21)
crates/stdlib/src/fcntl.rs (1)
extra_tests/snippets/stdlib_ctypes.py (1)
  • c_ulong (94-95)
crates/common/src/ebr/guard.rs (2)
crates/common/src/ebr/internal.rs (4)
  • defer (349-359)
  • new (68-70)
  • new (160-166)
  • pin (392-454)
crates/common/src/ebr/deferred.rs (5)
  • mem (33-33)
  • mem (34-34)
  • mem (37-37)
  • mem (37-37)
  • new (32-70)
crates/vm/src/object/traverse_object.rs (1)
crates/vm/src/object/core.rs (4)
  • debug_obj (87-93)
  • drop_dealloc_obj (84-86)
  • try_pop_edges_obj (103-106)
  • try_trace_obj (96-100)
crates/vm/src/builtins/function.rs (2)
crates/vm/src/builtins/list.rs (2)
  • traverse (62-64)
  • pop_edges (66-73)
crates/vm/src/builtins/tuple.rs (2)
  • traverse (43-45)
  • pop_edges (47-51)
crates/common/src/ebr/sync/queue.rs (1)
crates/common/src/ebr/guard.rs (1)
  • unprotected (173-177)
crates/common/src/ebr/collector.rs (2)
crates/common/src/ebr/internal.rs (9)
  • fmt (117-119)
  • default (101-103)
  • new (68-70)
  • new (160-166)
  • register (300-323)
  • pin (392-454)
  • drop (107-112)
  • release_handle (517-526)
  • collector (333-335)
crates/common/src/ebr/guard.rs (2)
  • fmt (158-160)
  • drop (150-154)
crates/vm/src/vm/thread.rs (1)
crates/common/src/ebr/default.rs (1)
  • cs (24-26)
crates/vm/src/stdlib/thread.rs (1)
crates/vm/src/vm/thread.rs (2)
  • ensure_pinned (30-36)
  • drop_guard (55-59)
crates/stdlib/src/ssl.rs (2)
crates/vm/src/builtins/function.rs (1)
  • traverse (47-58)
crates/vm/src/builtins/dict.rs (1)
  • traverse (48-50)
crates/common/src/ebr/mod.rs (5)
crates/common/src/ebr/internal.rs (2)
  • collector (333-335)
  • default (101-103)
crates/common/src/ebr/default.rs (1)
  • collector (11-15)
crates/common/src/ebr/sync/list.rs (1)
  • default (133-137)
crates/common/src/ebr/collector.rs (1)
  • default (21-25)
crates/common/src/ebr/pointers.rs (1)
  • default (29-31)
crates/vm/src/builtins/tuple.rs (3)
crates/vm/src/builtins/list.rs (2)
  • traverse (62-64)
  • pop_edges (66-73)
crates/vm/src/builtins/dict.rs (2)
  • traverse (48-50)
  • pop_edges (52-58)
crates/vm/src/object/traverse.rs (10)
  • traverse (34-34)
  • traverse (43-45)
  • traverse (49-51)
  • traverse (55-55)
  • traverse (60-64)
  • traverse (72-76)
  • traverse (84-88)
  • traverse (96-100)
  • traverse (105-112)
  • pop_edges (39-39)
crates/vm/src/object/traverse.rs (4)
crates/vm/src/builtins/str.rs (2)
  • try_traverse (1932-1934)
  • try_pop_edges (1936-1938)
crates/vm/src/builtins/list.rs (1)
  • pop_edges (66-73)
crates/vm/src/builtins/dict.rs (1)
  • pop_edges (52-58)
crates/vm/src/builtins/tuple.rs (1)
  • pop_edges (47-51)
crates/vm/src/builtins/str.rs (1)
crates/vm/src/object/traverse.rs (2)
  • try_traverse (19-19)
  • try_pop_edges (21-21)
crates/common/src/ebr/epoch.rs (1)
crates/common/src/ebr/internal.rs (2)
  • new (68-70)
  • new (160-166)
crates/common/src/refcount.rs (2)
crates/common/src/ebr/default.rs (2)
  • cs (24-26)
  • global_epoch (44-46)
crates/common/src/ebr/collector.rs (1)
  • global_epoch (41-43)
crates/vm/src/gc_state.rs (3)
crates/common/src/refcount.rs (2)
  • default (244-246)
  • with_deferred_drops (396-413)
crates/common/src/ebr/default.rs (1)
  • cs (24-26)
crates/common/src/ebr/guard.rs (1)
  • drop (150-154)
crates/vm/src/vm/mod.rs (1)
crates/vm/src/vm/thread.rs (1)
  • reactivate_guard (44-50)
crates/stdlib/src/gc.rs (1)
crates/vm/src/gc_state.rs (18)
  • gc_state (742-744)
  • enable (164-166)
  • disable (169-171)
  • collect (368-370)
  • get_threshold (184-190)
  • set_threshold (64-66)
  • set_threshold (193-201)
  • get_count (204-210)
  • get_debug (174-176)
  • set_debug (179-181)
  • get_stats (213-219)
  • stats (68-75)
  • get_objects (297-338)
  • new (44-54)
  • new (134-156)
  • freeze (720-726)
  • unfreeze (729-732)
  • get_freeze_count (715-717)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Collect benchmark data
  • GitHub Check: Collect what is left data
  • GitHub Check: Collect code coverage data
  • GitHub Check: Collect regression test data
🔇 Additional comments (73)
crates/derive-impl/src/util.rs (1)

375-375: LGTM! The pop_edges attribute is properly integrated throughout the codebase.

The addition of "pop_edges" to ALLOWED_NAMES follows the pattern established by other attributes like "traverse" and enables #[pyclass(pop_edges)] to pass validation. The macro implementation already checks for this attribute using _has_key("pop_edges") and generates the appropriate HAS_POP_EDGES constant and try_pop_edges method based on its presence. Multiple builtin types (dict, list, tuple, function) already use this attribute with corresponding pop_edges method implementations, and the GC system successfully leverages it to extract child references for circular reference resolution.

crates/stdlib/src/fcntl.rs (1)

95-95: Conversion logic is correct; i64 choice is reasonable for the Python API.

The truncation-then-zero-extend pattern correctly handles both positive values and negative i32 values passed as i64 (preserving their bit patterns). The i64 parameter type is appropriate for a Python API—it accepts a wider range of Python integer inputs while still safely truncating to u32 before passing to the syscall, which aligns with CPython's approach of accepting platform-sized integers for ioctl request codes. No breaking changes to internal Rust callers were found.
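
The bit-pattern preservation is easy to verify in isolation:

fn main() {
    // A negative i32 request constant, widened to i64 by the Python layer,
    // keeps its original bit pattern after truncation back to u32.
    let as_i32: i32 = -1;
    let widened: i64 = as_i32 as i64;
    let request: u32 = widened as u32; // the normalization step
    assert_eq!(request, u32::MAX);
    assert_eq!(request, as_i32 as u32); // same bits as the original i32
}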

crates/common/Cargo.toml (1)

34-38: EBR dependency block looks consistent with existing workspace patterns

The added crossbeam-utils, memoffset, scopeguard, and atomic workspace dependencies are grouped clearly under an EBR comment and match the style of existing entries; no issues from the manifest side.

crates/derive-impl/src/pyclass.rs (1)

608-643: MaybeTraverse generation for traverse/pop_edges looks consistent with GC design

The new MaybeTraverse impl:

  • Sets IS_TRACE based on has_traverse and only calls Traverse::traverse when requested.
  • Exposes a HAS_POP_EDGES flag and routes try_pop_edges to Traverse::pop_edges only when pop_edges is present.
  • Leaves both hooks as no‑ops when the attributes are absent, so all #[pyclass] types implement MaybeTraverse uniformly.

This matches the documented cases (manual vs derived traverse, optional pop_edges) and keeps the generated code straightforward; aside from the _optional_str error handling above, the structure looks solid.

Also applies to: 699-706

crates/stdlib/src/ssl.rs (1)

2296-2345: GC traversal configuration for _SSLSocket looks appropriate

Marking _SSLSocket as #[pyclass(..., traverse)] and selectively skipping fields with #[pytraverse(skip)] aligns well with what actually holds Python references:

  • Traced: sock, context, owner, session, incoming_bio, outgoing_bio, pending_context — all contain PyObjectRef/PyRef and should participate in GC.
  • Skipped: server_side, server_hostname, connection, handshake_done, session_was_reused, verified_chain, sni_state, client_hello_buffer, shutdown_state, deferred_cert_error — these are pure Rust state or TLS/certificate structures with no embedded Python objects.

This should give the collector visibility into real Python edges (including the _SSLSocket ↔ owner/session relationships) without forcing Traverse impls on the rustls plumbing types.

crates/common/src/ebr/sync/once_lock.rs (1)

97-103: LGTM: Drop implementation is correct.

The Drop implementation properly checks initialization before dropping the inner value, and uses drop_in_place correctly through a raw pointer cast.

crates/vm/src/object/core.rs (3)

870-888: Good: Proper finalization guard prevents double __del__ calls.

The is_finalized/mark_finalized pattern correctly prevents __del__ from being called twice on resurrected objects. This aligns with CPython's behavior.


1290-1307: LGTM: GC tracking during object creation.

The tracking logic correctly tracks objects with IS_TRACE or instance dict, matching the untracking condition in drop_slow. The maybe_collect call enables automatic GC triggering.


831-856: Resurrection detection logic is sound.

The RefCount::inc() behavior is correctly implemented as documented: when called on strong count 0, it performs a double increment (0→2) by executing two separate atomic operations when the first increment observes strong count = 0. This is intentional for thread safety, as confirmed in the RefCount implementation (crates/common/src/refcount.rs:273-276).

The resurrection detection correctly leverages this: after inc(), the expected strong count is 2. If after_del exceeds this, the __del__ method created new references and the object is resurrected; the second dec() is skipped to keep the object alive. Otherwise, both decrements run to return the count to 0, matching the double increment. The logic is correct.
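
The arithmetic of that check can be modeled with a plain integer in place of the real atomic RefCount (names illustrative only):

// Model of the resurrection flow described above, not the real RefCount.
fn run_del(strong: &mut usize, del_resurrects: bool) -> bool {
    assert_eq!(*strong, 0);
    *strong += 2; // inc() on strong count 0 double-increments: 0 -> 2

    if del_resurrects {
        *strong += 1; // __del__ stored a new reference to self somewhere
    }

    let after_del = *strong;
    if after_del > 2 {
        *strong -= 1; // resurrected: skip the second dec(), stay alive
        true
    } else {
        *strong -= 2; // not resurrected: both decs run, back to 0
        false
    }
}

fn main() {
    let mut count = 0;
    assert!(!run_del(&mut count, false));
    assert_eq!(count, 0); // object proceeds to deallocation

    let mut count = 0;
    assert!(run_del(&mut count, true));
    assert!(count > 0); // object stays alive
}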

crates/stdlib/src/gc.rs (1)

50-86: LGTM: collect function correctly implements generation-based collection.

The implementation properly validates generation bounds, invokes start/stop callbacks, forces collection even when GC is disabled (matching CPython behavior), and transfers garbage to the VM context.

crates/common/src/lib.rs (1)

17-17: LGTM: EBR module export.

The new ebr module is correctly added to the public API surface, enabling epoch-based reclamation functionality across the crate.

crates/vm/src/lib.rs (1)

80-80: LGTM: GC state module export.

The gc_state module is correctly exposed as a public module, enabling GC functionality to be accessed from the stdlib and other components.

crates/vm/src/builtins/str.rs (1)

1928-1938: LGTM: Correct traversal implementation for PyUtf8Str.

PyUtf8Str correctly implements MaybeTraverse:

  • IS_TRACE = true enables GC participation
  • HAS_POP_EDGES = false is correct since strings don't hold references to other Python objects
  • try_pop_edges is appropriately a no-op
crates/vm/src/object/traverse.rs (3)

16-17: LGTM!

The HAS_POP_EDGES flag follows the same pattern as IS_TRACE and provides a clean compile-time capability check for the pop_edges functionality.


21-21: LGTM!

The try_pop_edges method signature and default implementation are appropriate. The &mut self parameter correctly allows implementations to take ownership of child references during GC cleanup.


36-39: LGTM!

The pop_edges method is well-documented with clear semantics for circular reference resolution. The default no-op implementation is safe, and the relevant code snippets show that concrete types like PyDict, PyList, and PyTuple properly implement this method to extract their owned child references.

crates/vm/src/vm/context.rs (3)

55-57: LGTM!

The new GC state fields are appropriately typed as PyListRef and clearly documented. The public visibility is correct for VM-level GC integration.


336-338: LGTM!

The initialization of gc_callbacks and gc_garbage follows the correct pattern used for other context objects like empty_bytes. Both lists are properly tracked by the GC system via PyRef::new_ref.


360-361: LGTM!

The struct initialization correctly includes the new GC fields.

crates/common/src/ebr/sync/mod.rs (1)

1-5: LGTM!

The synchronization primitives module structure is clean and appropriately scoped with pub(crate) visibility for internal EBR use.

crates/common/src/ebr/mod.rs (2)

1-17: LGTM!

The module documentation provides a clear and comprehensive explanation of Epoch-Based Reclamation, including key concepts like pinning and critical sections.


27-30: LGTM!

The re-exports provide a clean public API surface, making core EBR types and functions easily accessible through the module root.

crates/vm/src/builtins/super.rs (2)

64-64: LGTM!

The type signature change from PyTypeRef to PyObjectRef provides more flexibility and aligns with Python's dynamic type system, where super() can accept any object that is then checked at runtime.


78-81: LGTM!

The downcast logic correctly validates that the first argument to super() is a type object. The error message is clear and matches Python conventions, and the error handling via map_err is appropriate.

crates/vm/src/builtins/list.rs (2)

60-74: LGTM on the Traverse implementation.

The unsafe impl Traverse correctly visits all owned PyObjectRefs. The pop_edges implementation using try_write() is appropriate for GC scenarios where the object should be unreachable.

One minor observation: if try_write() returns None (lock contention), elements won't be popped. The comment explains this is safe because objects should be unreachable during GC, but you may want to consider adding a debug assertion or logging for the failure case to catch unexpected contention during development.


27-33: Pyclass attributes look correct.

The traverse = "manual" and pop_edges attributes properly indicate that this type provides a manual Traverse implementation with edge-popping capability for GC, consistent with the pattern used for PyDict and PyTuple.

crates/vm/src/vm/thread.rs (4)

15-20: Thread-local EBR guard storage looks correct.

The RefCell<Option<Guard>> pattern is appropriate for thread-local state that needs interior mutability. The doc comment clearly explains this is for coarse-grained pinning.


24-36: LGTM on ensure_pinned().

The function correctly uses ebr::cs() to enter a critical section and stores the guard thread-locally. The idempotent check-then-set pattern is appropriate.


38-50: LGTM on reactivate_guard().

Correctly calls reactivate() on the existing guard to allow epoch advancement at safe points. The pattern of borrowing mutably only when a guard exists is efficient.


52-59: LGTM on drop_guard().

Setting the guard to None will drop the Guard, which unpins the thread as shown in the relevant snippet from guard.rs (the Drop impl calls local.unpin()).

crates/vm/src/vm/mod.rs (2)

475-480: Correct CPython-aligned behavior for unraisable exceptions during shutdown.

Suppressing unraisable exceptions when finalizing is true matches CPython's behavior where daemon thread exceptions and __del__ errors are silently ignored during interpreter shutdown.


533-538: Good placement for EBR guard reactivation.

Frame boundaries are appropriate safe points for reactivating the EBR guard. This allows the GC to advance epochs and free deferred objects while ensuring no object references are held across the reactivation.

The #[cfg(feature = "threading")] guard is correct since EBR is only relevant in threaded contexts.

crates/vm/src/vm/interpreter.rs (2)

134-145: Good shutdown sequence for thread cleanup.

Setting the finalizing flag before calling threading._shutdown() ensures that any exceptions from daemon threads during shutdown are suppressed. The error handling with let _ = shutdown.call((), vm) silently ignores failures, which is appropriate during finalization.


149-158: Code pattern verified — both gc_state() and collect_force() are properly exposed and functional.

The two-phase GC collection around module finalization is correct:

  1. First pass collects existing cycles before breaking module references
  2. Module cleanup breaks references to enable collection
  3. Second pass collects the newly unreachable cyclic garbage

Both crate::gc_state::gc_state() (public function returning &'static GcState) and collect_force() (public method accepting generation parameter) are properly defined and accessible.

crates/derive-impl/src/pystructseq.rs (1)

610-622: LGTM on MaybeTraverse implementation for struct sequences.

The implementation correctly:

  • Sets IS_TRACE = true to enable traversal
  • Sets HAS_POP_EDGES = false since edge popping is delegated to the inner PyTuple
  • Provides a no-op try_pop_edges with appropriate comment
  • Delegates try_traverse to the inner tuple (line 616)

This follows the same pattern as PyUtf8Str and other wrapper types.

crates/vm/src/builtins/tuple.rs (2)

40-52: Correct Traverse implementation for PyTuple.

The implementation is sound:

  • traverse properly visits all elements via delegation
  • pop_edges uses std::mem::take which replaces self.elements with an empty boxed slice and returns the original
  • Converting to Vec via into_vec() and extending is correct

Unlike PyList, no locking is needed here because tuples are conceptually immutable (the elements are fixed at construction), and during GC the object is unreachable anyway.
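
For reference, the std::mem::take pattern in isolation:

fn main() {
    // mem::take swaps in the Default value (an empty boxed slice) and hands
    // back the original elements, with no locking required.
    let mut elements: Box<[i32]> = vec![1, 2, 3].into_boxed_slice();
    let popped: Vec<i32> = std::mem::take(&mut elements).into_vec();
    assert!(elements.is_empty());
    assert_eq!(popped, vec![1, 2, 3]);
}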


28-28: Pyclass attributes correctly configured.

The traverse = "manual" and pop_edges attributes match the manual Traverse implementation below.

crates/common/src/ebr/default.rs (3)

33-41: Good fallback handling in with_handle.

The try_with with unwrap_or_else fallback correctly handles the case where the thread-local HANDLE has been destroyed (e.g., during thread exit). By registering a new temporary handle with the collector, the function remains functional even during thread teardown.

This is validated by the pin_while_exiting test which demonstrates pinning after HANDLE is dropped doesn't panic.


11-26: Clean EBR default module design.

The module correctly implements lazy initialization:

  • OnceLock ensures the global Collector is initialized exactly once
  • Thread-local HANDLE is registered lazily on first use per thread
  • cs() provides a simple API for entering critical sections

48-76: Good test coverage for edge case.

The pin_while_exiting test validates that calling cs() during thread exit (after HANDLE is dropped but before other thread-locals like FOO) doesn't panic. This is an important safety property for the EBR system.

crates/vm/src/object/traverse_object.rs (2)

17-19: LGTM! Documentation and signature are clear.

The pop_edges hook is well-documented and correctly typed as an optional function pointer for extracting child references before deallocation.


34-40: LGTM! Conditional initialization follows the existing trace pattern.

The const block pattern mirrors the trace field initialization, maintaining consistency in how optional vtable hooks are configured based on type traits.

crates/common/src/ebr/deferred.rs (1)

74-77: call consumes self but doesn't prevent double-drop.

The call method consumes self, which is correct. However, since there's no Drop impl, ensure that Deferred instances are always consumed via call() and never dropped without invocation. The current test coverage suggests this is the intended usage pattern.

crates/common/src/ebr/guard.rs (2)

116-136: Good use of scopeguard::defer! for panic safety in reactivate_after.

The implementation correctly uses scopeguard::defer! to ensure the guard is re-pinned even if the provided function panics. The mem::forget(local.pin()) pattern properly transfers ownership of the new guard's lifetime to the existing Guard struct.


148-155: LGTM! Drop implementation correctly handles null local.

The as_ref() pattern safely handles both regular guards and unprotected guards (with null local pointer).

crates/common/src/ebr/epoch.rs (2)

16-20: LGTM! Clear documentation of the bit layout.

The LSB-as-pinned-flag design is well-documented and the data: usize representation is appropriate for atomic operations.


62-70: LGTM! successor correctly preserves pin state.

Adding 2 increments the epoch value (stored in bits 1+) while preserving the LSB (pin state). This is correct given the bit layout.
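
For example, data = 0b1011 (epoch 5, pinned) steps to 0b1101 (epoch 6, still pinned).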

crates/common/src/ebr/sync/list.rs (4)

219-233: Drop uses unprotected() guard - verify thread safety assumptions.

The Drop implementation uses unprotected() which bypasses EBR protection. This is only safe if:

  1. The list is not being concurrently accessed by other threads
  2. All elements have already been marked as deleted (checked by assertion)

The assertion on line 227 (assert_eq!(succ.tag(), 1)) will panic if any element wasn't properly deleted before the list is dropped. Consider using debug_assert! or returning an error if this is a recoverable condition.


162-194: LGTM! Insert uses correct lock-free CAS pattern.

The insert operation correctly:

  1. Reads current head
  2. Sets new entry's next to current head
  3. Uses compare_exchange_weak in a retry loop
  4. Uses appropriate memory orderings (Release for success, Relaxed for retry)

239-296: Iterator correctly handles concurrent modifications.

The iterator properly:

  1. Detects deleted nodes via tag bit
  2. Attempts to unlink deleted nodes via CAS
  3. Restarts from head when predecessor is deleted
  4. Returns IterError::Stalled to signal restart

The logic is correct for a lock-free list traversal with concurrent modifications.


306-321: Test IsElement implementation uses raw pointer cast.

The finalize implementation correctly defers destruction via the guard. The unsafe block is properly scoped with a safety comment.

crates/vm/src/builtins/dict.rs (1)

46-59: No action required. The pop_all_entries method exists and is correctly implemented in dict_inner.rs at line 715. It returns an iterator of (PyObjectRef, T) pairs, which matches the expected usage in pop_edges. The Traverse implementation properly delegates to self.entries.traverse and correctly collects both keys and values in pop_edges.

crates/vm/src/builtins/function.rs (1)

85-92: No safety issue with Context::genesis() during GC.

The code is safe. Context::genesis() uses Rust's static_cell with get_or_init() pattern, which is thread-safe and idempotent—it initializes exactly once and returns a &'static reference. The method cannot panic, even on first access. Additionally, garbage collection only runs after the VirtualMachine is fully initialized, and VM initialization explicitly calls Context::genesis() before any object creation. The pop_edges method will always have a valid, initialized genesis context available.

crates/vm/src/stdlib/thread.rs (1)

351-376: EBR pinning correctly brackets thread execution.

The ensure_pinned() at thread start and drop_guard() at thread exit properly implements coarse-grained EBR pinning. This ensures the GC won't reclaim objects while the thread is active.

One consideration: if the thread panics between ensure_pinned() and drop_guard(), the guard may not be dropped. However, since EBR_GUARD is thread-local, it will be cleaned up when the thread exits anyway.

crates/common/src/ebr/collector.rs (2)

104-423: Comprehensive test coverage for EBR pinning and reclamation.

The tests cover critical scenarios: reentrant pinning, local bag flushing, garbage buffering, epoch advancement under pinning, stress tests with multiple threads, and array destruction. Good use of #[cfg(miri)] for reduced iteration counts during Miri testing.


15-16: Manual Send and Sync implementations are necessary and appropriately justified.

The Collector wraps Arc<Global>, which is not automatically Send + Sync. This is evidenced by the #[allow(clippy::arc_with_non_send_sync)] lint suppression in Collector::default(), which the Rust clippy linter uses to warn when Arc wraps types that don't implement Send or Sync. Since Global contains List<Local> with intrusive pointers to Local entries—which themselves contain UnsafeCell—the manual implementations are correctly required to affirm thread-safety. These impls should remain in place.

crates/common/src/ebr/sync/queue.rs (4)

40-42: Manual Send and Sync impls are appropriate for the queue.

The safety comment on line 40 is accurate: individual T values are not accessed concurrently since they're only read/written during push/pop which are atomic operations. The Send bound on T is correctly required.


141-169: pop_if_internal reads data before CAS - potential for reading invalid data if condition races.

The condition reads n.data.as_ptr() at line 151 before the CAS at line 153. If another thread pops this node between the read and the CAS:

  1. The CAS will fail (returning Err)
  2. But the condition was evaluated on potentially stale/freed data

However, since the guard is held during this operation, EBR guarantees the memory won't be reclaimed until the guard is dropped. The T: Sync bound ensures the read is safe.


201-213: Drop implementation correctly drains queue before destroying sentinel.

The unprotected() guard usage in Drop is safe because drop has exclusive access (&mut self). The sentinel node is properly destroyed after draining all data nodes.


249-257: Test wrapper's pop() spins indefinitely - acceptable for tests but would deadlock on empty queue.

The pop method in the test wrapper loops forever until an element is available. This is fine for test scenarios where producers are guaranteed to push, but the comment or implementation could note this is intentional for testing.

crates/common/src/ebr/internal.rs (4)

134-141: SealedBag expiration uses 3-epoch distance - matches EBR guarantees.

The is_expired check at line 139 ensures bags are only reclaimed when global_epoch.wrapping_sub(self.epoch) >= 3. This is correct for EBR: a pinned thread can observe at most one epoch advancement, so 3 epochs guarantees all pinned threads have unpinned since the bag was sealed.
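
Concretely: a bag sealed at global epoch e may still be visible to a thread pinned in epoch e or the adjacent epoch (a pinned thread observes at most one advancement); once the global epoch reaches e + 3, any such thread must have unpinned, so executing the bag's deferred functions is safe.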


390-454: Local::pin implements correct EBR pinning with x86 optimization.

The pinning logic:

  1. Increments guard_count (lines 395-396)
  2. On first pin (guard_count == 0), stores pinned epoch with proper synchronization
  3. Uses x86-specific compare_exchange optimization (lines 406-434) for performance
  4. Retries if global epoch advanced during pinning (lines 440-444)
  5. Resets advance counter on epoch change (lines 447-450)

The x86 hack using compare_exchange instead of mfence is a known optimization from crossbeam-epoch.


456-480: Local::unpin triggers collection when last guard is dropped.

The unpin logic at lines 460-470 triggers collection while still pinned (guard_count == 1). This is intentional - collection requires being pinned to safely traverse the object graph. The collecting flag prevents reentrancy.

Note: Line 472 decrements guard_count after the collection loop, which is correct - the ManuallyDrop guard at line 465 prevents double-decrement.


573-593: IsElement trait implementation uses offset_of! for intrusive list.

The entry_of and element_of implementations correctly convert between Local and Entry using offset_of!. The finalize method defers destruction of the Local node.

Line 590: The RawShared::from conversion from *const Local is safe because the caller guarantees the entry belongs to a valid Local.

crates/common/src/ebr/pointers.rs (3)

62-72: High tag constants and bit manipulation are correct.

HIGH_TAG_WIDTH = 4 uses the top 4 bits of the pointer. The high_bits() and high_bits_pos() calculations are correct:

  • high_bits_pos() = 64 - 4 = 60 (on 64-bit)
  • high_bits() = 0xF << 60 = 0xF000_0000_0000_0000

This reserves the top 4 bits for epoch tagging while preserving pointer validity in the middle bits.


148-156: low_bits<T>() correctly computes alignment-based tag mask.

The function (1 << align_of::<T>().trailing_zeros()) - 1 computes the maximum value that fits in the alignment bits. For example, if align_of::<T>() == 8, this returns 0b111 = 7.


258-306: RawShared provides safe wrappers around unsafe pointer operations.

The from_owned, drop, deref, and as_ref methods correctly document their safety requirements. The #[cfg(test)] on is_null keeps it out of production builds while available for testing.

crates/common/src/refcount.rs (4)

301-312: dec() correctly handles LEAKED objects.

The decrement returns false for leaked objects (line 307-309), preventing deallocation of interned objects. The check old.strong() == 1 (line 311) correctly returns true only when this was the last reference.


382-448: Deferred drop infrastructure is well-designed for deadlock prevention.

The thread-local IN_DEFERRED_CONTEXT and DEFERRED_QUEUE pattern allows deferring untrack calls during GC collection. Key points:

  • with_deferred_drops sets context and flushes on exit (lines 400-412)
  • try_defer_drop queues if in context, else executes immediately (lines 425-433)
  • flush_deferred_drops drains queue outside borrow (lines 440-447)

The comment at lines 378-380 explains the deadlock scenario this prevents.


461-468: defer_destruction is marked unsafe but doesn't validate guard.

The safety comment requires the guard to be valid, but there's no runtime check. This is acceptable since it's unsafe, but consider adding a debug assertion.


264-277: Clarify the strong==0 case and document abort rationale.

Line 273-276: The comment "The previous fetch_add created a permission to run decrement again" is cryptic and doesn't explain why a second fetch_add is needed when strong == 0. This appears to handle a race condition with concurrent decrements, but the logic should be documented explicitly.

Lines 268-272: The abort() call on a destructed object is a severe failure (kills the process). Document why this is preferred over panic!() or unreachable!(). Note that safe_inc() (lines 282-299) handles the destructed case more gracefully with compare-exchange; clarify which path is recommended and why inc() takes the fatal approach.

crates/vm/src/gc_state.rs (2)

85-92: GcObjectPtr manual Send/Sync requires careful justification.

The wrapper makes NonNull<PyObject> Send+Sync. The safety comment "proper synchronization is used" is vague. The actual safety relies on:

  1. Objects being accessed through PyObjectRef which manages synchronization
  2. GcObjectPtr only being used as a key (for hashing/equality), not for dereferencing without synchronization

Consider strengthening the safety comment to document these invariants.
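
A sketch of that invariant as code (names hypothetical): the wrapper is only hashed and compared, never dereferenced, which is what makes the manual Send/Sync impls defensible.

use std::collections::HashSet;
use std::ptr::NonNull;

// Identity key over a raw object pointer; never dereferenced here.
#[derive(PartialEq, Eq, Hash, Clone, Copy)]
struct PtrKey(NonNull<u8>);

// SAFETY (sketch invariant): PtrKey is used purely for hashing/equality;
// all dereferencing happens elsewhere under proper synchronization.
unsafe impl Send for PtrKey {}
unsafe impl Sync for PtrKey {}

fn main() {
    let raw = Box::into_raw(Box::new(0u8));
    let key = PtrKey(NonNull::new(raw).unwrap());

    let mut tracked: HashSet<PtrKey> = HashSet::new();
    tracked.insert(key);
    assert!(tracked.contains(&key));
    tracked.remove(&key);

    drop(unsafe { Box::from_raw(raw) });
}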


735-744: Global singleton pattern using OnceLock is appropriate.

The GC_STATE singleton with OnceLock provides thread-safe lazy initialization. The gc_state() accessor is clean and idiomatic.

Comment on lines +39 to +43 of crates/common/src/ebr/collector.rs
/// Reads the global epoch, without issueing a fence.
#[inline]
pub fn global_epoch(&self) -> Epoch {
self.global.epoch.load(Ordering::Relaxed)
}

⚠️ Potential issue | 🟡 Minor

Typo in doc comment: "issueing" should be "issuing".

🔎 Proposed fix
-    /// Reads the global epoch, without issueing a fence.
+    /// Reads the global epoch, without issuing a fence.

Comment on lines +18 to +22
pub(crate) struct Deferred {
call: unsafe fn(*mut u8),
data: MaybeUninit<Data>,
_marker: PhantomData<*mut ()>, // !Send + !Sync
}

⚠️ Potential issue | 🟠 Major

Missing Drop implementation may cause memory leaks.

Deferred stores a FnOnce() closure inline or boxed, but has no Drop implementation. If a Deferred is dropped without call() being invoked, the stored closure (and any captured data) will be leaked. This is particularly concerning for the heap-allocated path where a Box<F> is stored.

Consider adding a Drop impl that invokes the closure or at least drops the stored data, unless leaking is intentional for the EBR design (in which case, document this behavior).

🔎 Proposed Drop implementation
+impl Drop for Deferred {
+    fn drop(&mut self) {
+        // Call the deferred function to ensure it's not leaked
+        // This is safe because we own self
+        let call = self.call;
+        unsafe { call(self.data.as_mut_ptr().cast::<u8>()) };
+    }
+}

Alternatively, if leaking is intentional, add documentation explaining why.


🤖 Prompt for AI Agents
In crates/common/src/ebr/deferred.rs around lines 18-22, Deferred currently
holds a closure (inline or boxed) but has no Drop implementation, so if a
Deferred is dropped without call() being invoked the stored closure/data will be
leaked; add a Drop impl that safely releases stored resources: on drop, check
the active variant/state and either invoke the stored destructor function (the
unsafe fn(*mut u8)) against the data buffer to run/drop the closure or, if
invoking is unsafe for EBR semantics, explicitly run the correct destructor for
the boxed path and drop the inline data to avoid leaks; ensure the Drop uses the
same unsafe logic as call(), avoids double-running if call() was already
executed (e.g. by clearing a flag or setting data to a neutral state), and
documents the chosen behavior if intentional leaking is preserved.
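
A sketch of the flag-based variant described in the prompt, dropping the closure at most once. The Option-wrapped function pointer and fixed-size buffer are assumptions; the real Deferred keeps a bare fn pointer and a different Data layout:

use std::mem::MaybeUninit;

struct Deferred {
    call: Option<unsafe fn(*mut u8)>, // cleared once the closure has run
    data: MaybeUninit<[usize; 3]>,    // inline storage; size is assumed
}

impl Deferred {
    fn call(mut self) {
        if let Some(f) = self.call.take() {
            // Run the closure; the Drop below becomes a no-op.
            unsafe { f(self.data.as_mut_ptr().cast::<u8>()) };
        }
    }
}

impl Drop for Deferred {
    fn drop(&mut self) {
        // Runs only if call() was never invoked, so the captured data is
        // neither leaked nor double-dropped.
        if let Some(f) = self.call.take() {
            unsafe { f(self.data.as_mut_ptr().cast::<u8>()) };
        }
    }
}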

Comment on lines +32 to +44
/// # Safety
///
/// The given function must not hold reference onto the stack. It is highly recommended that
/// the passed function is **always** marked with `move` in order to prevent accidental
/// borrows.
///
/// Apart from that, keep in mind that another thread may execute `f`, so anything accessed by
/// the closure must be `Send`.
/// # Safety
///
/// The given function must not hold reference onto the stack. It is highly recommended that
/// the passed function is **always** marked with `move` in order to prevent accidental
/// borrows. The closure must be `Send` as another thread may execute it.

⚠️ Potential issue | 🟡 Minor

Duplicate safety documentation.

The safety documentation for defer_unchecked is duplicated (lines 32-39 and 40-44). Remove the duplicate section.

🔎 Proposed fix
     /// Apart from that, keep in mind that another thread may execute `f`, so anything accessed by
     /// the closure must be `Send`.
-    /// # Safety
-    ///
-    /// The given function must not hold reference onto the stack. It is highly recommended that
-    /// the passed function is **always** marked with `move` in order to prevent accidental
-    /// borrows. The closure must be `Send` as another thread may execute it.
     pub unsafe fn defer_unchecked<F, R>(&self, f: F)
🤖 Prompt for AI Agents
crates/common/src/ebr/guard.rs lines 32-44: the safety documentation for
defer_unchecked is duplicated across two blocks; remove the redundant copy so
only a single Safety section remains, keeping the more concise/accurate wording
(ensure it states that the closure must not hold stack references, should be
marked move to avoid borrows, and must be Send), and adjust surrounding comment
spacing to preserve formatting.

Comment on lines +208 to +215
pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> {
// HACK: The size and alignment of `Atomic<TaggedCnt<T>>` will be same with `AtomicUsize`.
// The equality of the sizes is checked by `const_assert!`.
let inner = unsafe { &*(&self.inner as *const _ as *const AtomicUsize) };
let prev = inner.fetch_or(low_bits::<T>() & tag, order);
RawShared::from(prev as *const _)
}
}

⚠️ Potential issue | 🟠 Major

fetch_or uses unsafe transmute to AtomicUsize - missing size assertion.

The comment mentions "The equality of the sizes is checked by const_assert!" but no const_assert! is present in this file. This is a latent bug if Atomic<Tagged<T>> ever differs in size from AtomicUsize.

🔎 Add the missing size assertion
+use static_assertions::const_assert_eq;
+const_assert_eq!(core::mem::size_of::<Atomic<Tagged<()>>>(), core::mem::size_of::<AtomicUsize>());
+
 pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> {
     // HACK: The size and alignment of `Atomic<TaggedCnt<T>>` will be same with `AtomicUsize`.
     // The equality of the sizes is checked by `const_assert!`.

Alternatively, add a debug_assert at runtime:

 pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> {
+    debug_assert_eq!(
+        core::mem::size_of::<Atomic<Tagged<T>>>(),
+        core::mem::size_of::<AtomicUsize>()
+    );
     let inner = unsafe { &*(&self.inner as *const _ as *const AtomicUsize) };

Comment on lines +33 to +115
/// State wraps reference count + flags in a single 64-bit word
#[derive(Clone, Copy)]
pub struct State {
inner: u64,
}

impl State {
#[inline]
pub fn from_raw(inner: u64) -> Self {
Self { inner }
}

#[inline]
pub fn as_raw(self) -> u64 {
self.inner
}

#[inline]
pub fn strong(self) -> u32 {
((self.inner & STRONG) / COUNT) as u32
}

#[inline]
pub fn weak(self) -> u32 {
((self.inner & WEAK) / WEAK_COUNT) as u32
}

#[inline]
pub fn destructed(self) -> bool {
(self.inner & DESTRUCTED) != 0
}

#[inline]
pub fn weaked(self) -> bool {
(self.inner & WEAKED) != 0
}

#[inline]
pub fn leaked(self) -> bool {
(self.inner & LEAKED) != 0
}

#[inline]
pub fn epoch(self) -> u32 {
((self.inner & EPOCH) >> EPOCH_MASK_HEIGHT) as u32
}

#[inline]
pub fn with_epoch(self, epoch: usize) -> Self {
Self::from_raw((self.inner & !EPOCH) | (((epoch as u64) << EPOCH_MASK_HEIGHT) & EPOCH))
}

#[inline]
pub fn add_strong(self, val: u32) -> Self {
Self::from_raw(self.inner + (val as u64) * COUNT)
}

#[inline]
pub fn sub_strong(self, val: u32) -> Self {
debug_assert!(self.strong() >= val);
Self::from_raw(self.inner - (val as u64) * COUNT)
}

#[inline]
pub fn add_weak(self, val: u32) -> Self {
Self::from_raw(self.inner + (val as u64) * WEAK_COUNT)
}

#[inline]
pub fn with_destructed(self, dest: bool) -> Self {
Self::from_raw((self.inner & !DESTRUCTED) | if dest { DESTRUCTED } else { 0 })
}

#[inline]
pub fn with_weaked(self, weaked: bool) -> Self {
Self::from_raw((self.inner & !WEAKED) | if weaked { WEAKED } else { 0 })
}

#[inline]
pub fn with_leaked(self, leaked: bool) -> Self {
Self::from_raw((self.inner & !LEAKED) | if leaked { LEAKED } else { 0 })
}
}

🛠️ Refactor suggestion | 🟠 Major

State and PyState are identical - consider removing duplication.

PyState (lines 151-233) has the exact same fields and methods as State (lines 33-115). This violates DRY and creates maintenance burden. If they need to diverge in the future, document why. Otherwise, use a single type or a type alias.

🔎 Proposed fix - use type alias or single type
-/// PyState extends State with LEAKED support for RustPython
-#[derive(Clone, Copy)]
-pub struct PyState {
-    inner: u64,
-}
-
-impl PyState {
-    // ... 80+ lines of duplicated methods ...
-}
+/// PyState is an alias for State, used in RustPython integration
+pub type PyState = State;

If distinct types are truly needed for type safety, consider a generic or trait-based approach; a newtype sketch appears after the agent prompt below.

Also applies to: 151-233

🤖 Prompt for AI Agents
crates/common/src/refcount.rs lines 33-115 (and PyState at 151-233): the State
and PyState implementations are identical, so remove duplication by
consolidating into a single definition; either replace PyState with a type alias
(pub type PyState = State) and remove the duplicate struct/impl, or keep one
struct (State) and extract shared methods into a trait/generic used by both
names, updating all call sites to use the unified type; ensure visibility (pub)
and any module exports are preserved and add a short comment if you
intentionally keep a separate alias for future divergence.
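
If the two names must stay distinct, a shared core behind thin newtypes keeps the 80+ methods in one place. A sketch with the method set abbreviated (names here are illustrative, not the crate's):

#[derive(Clone, Copy)]
struct RawState(u64);

impl RawState {
    const COUNT: u64 = 1;
    const STRONG: u64 = 0xFFFF_FFFF;

    fn strong(self) -> u32 {
        ((self.0 & Self::STRONG) / Self::COUNT) as u32
    }

    fn add_strong(self, val: u32) -> Self {
        Self(self.0 + u64::from(val) * Self::COUNT)
    }
    // ... remaining accessors defined once ...
}

// Distinct public names, one implementation, zero runtime cost.
#[derive(Clone, Copy)]
struct State(RawState);

#[derive(Clone, Copy)]
struct PyState(RawState);

impl std::ops::Deref for State {
    type Target = RawState;
    fn deref(&self) -> &RawState {
        &self.0
    }
}

impl std::ops::Deref for PyState {
    type Target = RawState;
    fn deref(&self) -> &RawState {
        &self.0
    }
}

Note that builder-style methods returning Self still resolve to RawState under Deref, so per-type wrappers, a small macro, or a trait may fit better if those methods dominate.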

Comment on lines 576 to +607
const ALLOWED_TRAVERSE_OPTS: &[&str] = &["manual"];
// try to know if it have a `#[pyclass(trace)]` exist on this struct
// TODO(discord9): rethink on auto detect `#[Derive(PyTrace)]`

// 1. no `traverse` at all: generate a dummy try_traverse
// 2. `traverse = "manual"`: generate a try_traverse, but not #[derive(Traverse)]
// 3. `traverse`: generate a try_traverse, and #[derive(Traverse)]
let (maybe_trace_code, derive_trace) = {
if class_meta.inner()._has_key("traverse")? {
let maybe_trace_code = quote! {
impl ::rustpython_vm::object::MaybeTraverse for #ident {
const IS_TRACE: bool = true;
fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn) {
::rustpython_vm::object::Traverse::traverse(self, tracer_fn);
}
// Generate MaybeTraverse impl with both traverse and pop_edges support
//
// For traverse:
// 1. no `traverse` at all: IS_TRACE = false, try_traverse does nothing
// 2. `traverse = "manual"`: IS_TRACE = true, but no #[derive(Traverse)]
// 3. `traverse`: IS_TRACE = true, and #[derive(Traverse)]
//
// For pop_edges:
// 1. no `pop_edges`: HAS_POP_EDGES = false, try_pop_edges does nothing
// 2. `pop_edges`: HAS_POP_EDGES = true, try_pop_edges calls Traverse::pop_edges
let has_traverse = class_meta.inner()._has_key("traverse")?;
let has_pop_edges = class_meta.inner()._has_key("pop_edges")?;

let derive_trace = if has_traverse {
let value = class_meta.inner()._optional_str("traverse");
if let Ok(Some(s)) = value {
if !ALLOWED_TRAVERSE_OPTS.contains(&s.as_str()) {
bail_span!(
item,
"traverse attribute only accept {ALLOWED_TRAVERSE_OPTS:?} as value or no value at all",
);
}
assert_eq!(s, "manual");
quote! {}
} else {
quote! {#[derive(Traverse)]}
}
} else {
quote! {}
};


⚠️ Potential issue | 🟡 Minor

Don’t ignore errors from _optional_str("traverse") when deriving Traverse

The new derive_trace logic is almost there, but this pattern:

let value = class_meta.inner()._optional_str("traverse");
if let Ok(Some(s)) = value {
    // ...
} else {
    quote! {#[derive(Traverse)]}
}

silently treats any Err(_) from _optional_str as “no value, derive Traverse”, rather than surfacing a compile‑time error. That means malformed uses like #[pyclass(traverse(123))] could compile and unexpectedly auto‑derive Traverse instead of failing in the macro.

You can preserve the intended behavior while propagating parse errors by binding the Result with ? first:

Proposed fix for error propagation in traverse handling
-    let derive_trace = if has_traverse {
-        let value = class_meta.inner()._optional_str("traverse");
-        if let Ok(Some(s)) = value {
+    let derive_trace = if has_traverse {
+        let value = class_meta.inner()._optional_str("traverse")?;
+        if let Some(s) = value {
             if !ALLOWED_TRAVERSE_OPTS.contains(&s.as_str()) {
                 bail_span!(
                     item,
                     "traverse attribute only accept {ALLOWED_TRAVERSE_OPTS:?} as value or no value at all",
                 );
             }
             assert_eq!(s, "manual");
             quote! {}
         } else {
             quote! {#[derive(Traverse)]}
         }
     } else {
         quote! {}
     };
🤖 Prompt for AI Agents
crates/derive-impl/src/pyclass.rs lines 576-607: the code currently ignores Err
from _optional_str("traverse") which can hide parse errors; change to propagate
the Result by using let value = class_meta.inner()._optional_str("traverse")?;
then match on value (Some(s) => validate s is "manual" and return quote!{} for
manual, None => return quote!{#[derive(Traverse)]}) so any Err is returned to
the caller instead of silently treating it as None.

Comment on lines +191 to 198
/// Return the list of objects that directly refer to any of the arguments.
#[pyfunction]
fn get_referrers(_args: FuncArgs, vm: &VirtualMachine) -> PyResult {
Err(vm.new_not_implemented_error(""))
fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef {
// This is expensive: we need to scan all tracked objects
// For now, return an empty list (would need full object tracking to implement)
let _ = args;
vm.ctx.new_list(vec![])
}

⚠️ Potential issue | 🟡 Minor

Incomplete implementation: get_referrers always returns an empty list.

This function is documented to "Return the list of objects that directly refer to any of the arguments" but always returns an empty list. This could cause subtle bugs in code relying on this CPython feature.

Consider either:

  1. Implementing it properly by scanning all tracked objects
  2. Raising NotImplementedError to make the limitation explicit (sketched after the agent prompt below)
  3. Adding a clear warning in the docstring
+    /// Return the list of objects that directly refer to any of the arguments.
+    /// 
+    /// NOTE: This is currently not implemented and always returns an empty list.
+    /// Implementing this requires scanning all tracked objects which is expensive.
     #[pyfunction]
     fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef {
-        // This is expensive: we need to scan all tracked objects
-        // For now, return an empty list (would need full object tracking to implement)
         let _ = args;
+        // TODO: Implement by iterating gc_state().get_objects(None) and checking referents
         vm.ctx.new_list(vec![])
     }
🤖 Prompt for AI Agents
In crates/stdlib/src/gc.rs around lines 191 to 198, get_referrers currently
returns an empty list which misrepresents its documented behavior; change it to
explicitly signal the limitation by returning/raising a NotImplementedError (or
the project-equivalent VM exception) instead of an empty list and update the
docstring to mention that get_referrers is not implemented on this platform;
ensure the function uses the VM to construct and return the proper exception so
callers receive an explicit error rather than an empty result.
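
For completeness, option 2 in code form; the signature and new_not_implemented_error come from the original code quoted above, while the message text is illustrative:

/// Return the list of objects that directly refer to any of the arguments.
#[pyfunction]
fn get_referrers(_args: FuncArgs, vm: &VirtualMachine) -> PyResult {
    // Surface the limitation instead of silently returning an empty list.
    Err(vm.new_not_implemented_error(
        "gc.get_referrers() requires full object tracking, which is not implemented",
    ))
}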

Comment on lines +247 to +274
pub unsafe fn untrack_object(&self, obj: NonNull<PyObject>) {
let gc_ptr = GcObjectPtr(obj);

// Remove from all generation tracking lists
for generation in &self.generation_objects {
if let Ok(mut gen_set) = generation.write() {
if gen_set.remove(&gc_ptr) {
break; // Object can only be in one generation
}
}
}

// Update counts (simplified - just decrement gen0 for now)
let count = self.generations[0].count.load(Ordering::SeqCst);
if count > 0 {
self.generations[0].count.fetch_sub(1, Ordering::SeqCst);
}

// Remove from global tracking
if let Ok(mut tracked) = self.tracked_objects.write() {
tracked.remove(&gc_ptr);
}

// Remove from finalized set
if let Ok(mut finalized) = self.finalized_objects.write() {
finalized.remove(&gc_ptr);
}
}

⚠️ Potential issue | 🟠 Major

untrack_object always decrements gen0 count - incorrect for objects in other generations.

Lines 259-263 always decrement generations[0].count, even though the object might be in gen1 or gen2. The count > 0 guard prevents underflow, but the per-generation counts will drift away from the actual tracked sets over time.

🔎 Proposed fix - track which generation the object was in
 pub unsafe fn untrack_object(&self, obj: NonNull<PyObject>) {
     let gc_ptr = GcObjectPtr(obj);
+    let mut found_gen: Option<usize> = None;

     // Remove from all generation tracking lists
-    for generation in &self.generation_objects {
+    for (gen_idx, generation) in self.generation_objects.iter().enumerate() {
         if let Ok(mut gen_set) = generation.write() {
             if gen_set.remove(&gc_ptr) {
+                found_gen = Some(gen_idx);
                 break; // Object can only be in one generation
             }
         }
     }

-    // Update counts (simplified - just decrement gen0 for now)
-    let count = self.generations[0].count.load(Ordering::SeqCst);
-    if count > 0 {
-        self.generations[0].count.fetch_sub(1, Ordering::SeqCst);
+    // Decrement the correct generation's count
+    if let Some(gen_idx) = found_gen {
+        self.generations[gen_idx].count.fetch_sub(1, Ordering::SeqCst);
     }

Comment on lines +1064 to +1069
pub unsafe fn gc_pop_edges(&self) -> Vec<PyObjectRef> {
// SAFETY: During GC collection, this object is unreachable (gc_refs == 0),
// meaning no other code has a reference to it. The only references are
// internal cycle references which we're about to break.
unsafe { Self::gc_pop_edges_raw(self as *const _ as *mut PyObject) }
}

⚠️ Potential issue | 🟠 Major

Potential undefined behavior: casting &self to *mut violates aliasing rules.

gc_pop_edges takes &self but immediately casts to *mut PyObject. Even with the safety comment about exclusive access during GC, creating a mutable pointer from a shared reference can violate Rust's aliasing rules and trigger UB under strict interpretations.

Consider either:

  1. Using UnsafeCell in the payload types that need mutation during GC
  2. Taking *mut PyObject directly in the public API
  3. Using &mut self if exclusive access is truly guaranteed
🔎 Suggested fix
-    pub unsafe fn gc_pop_edges(&self) -> Vec<PyObjectRef> {
-        // SAFETY: During GC collection, this object is unreachable (gc_refs == 0),
-        // meaning no other code has a reference to it. The only references are
-        // internal cycle references which we're about to break.
-        unsafe { Self::gc_pop_edges_raw(self as *const _ as *mut PyObject) }
-    }
+    /// Pop edges from this object for cycle breaking.
+    /// Prefer using gc_pop_edges_raw with a pointer obtained safely.
+    pub unsafe fn gc_pop_edges_raw(ptr: *mut PyObject) -> Vec<PyObjectRef> {
+        let mut result = Vec::new();
+        let obj = unsafe { &*ptr };
+        if let Some(pop_edges_fn) = obj.0.vtable.pop_edges {
+            unsafe { pop_edges_fn(ptr, &mut result) };
+        }
+        result
+    }

Remove the &self variant and use only the raw pointer version.


🤖 Prompt for AI Agents
In crates/vm/src/object/core.rs around lines 1064 to 1069, casting the shared
reference &self to *mut PyObject can violate Rust aliasing rules and cause
undefined behavior; change the API to avoid creating a mutable pointer from
&self by either (a) changing this method to take *mut PyObject (delegate to
gc_pop_edges_raw directly) or (b) accept &mut self when exclusive access is
guaranteed, or (c) make the interior mutability explicit with UnsafeCell for the
fields mutated during GC; update callers to pass the raw pointer or mutable
reference accordingly and remove the &self wrapper overload if you adopt the
raw-pointer approach.

Comment on lines 405 to +441
let (globals, filename, lineno) = if let Some(f) = f {
(f.globals.clone(), f.code.source_path, f.f_lineno())
} else if let Some(frame) = vm.current_frame() {
// We have a frame but it wasn't found during stack walking
(frame.globals.clone(), vm.ctx.intern_str("sys"), 1)
} else {
(vm.current_globals().clone(), vm.ctx.intern_str("sys"), 1)
// No frames on the stack - we're in interpreter shutdown or similar state
// Use __main__ globals if available, otherwise skip the warning
match vm
.sys_module
.get_attr(identifier!(vm, modules), vm)
.and_then(|modules| modules.get_item(vm.ctx.intern_str("__main__"), vm))
.and_then(|main_mod| main_mod.get_attr(identifier!(vm, __dict__), vm))
{
Ok(globals) => {
if let Ok(dict) = globals.downcast::<crate::builtins::PyDict>() {
(dict, vm.ctx.intern_str("sys"), 1)
} else {
// Cannot get globals, skip warning
return Ok((
vm.ctx.intern_str("sys").to_owned(),
1,
None,
vm.ctx.new_dict().into(),
));
}
}
Err(_) => {
// Cannot get __main__ globals, skip warning
return Ok((
vm.ctx.intern_str("sys").to_owned(),
1,
None,
vm.ctx.new_dict().into(),
));
}
}

⚠️ Potential issue | 🟡 Minor

Shutdown fallback currently still emits warnings despite “skip warning” comment

In the final else branch of setup_context, both “Cannot get globals, skip warning” comments suggest the intent is to suppress the warning when there are no frames / __main__ globals. However, the function returns a normal (filename, lineno, module = None, registry = new_dict()) tuple; warn then still calls warn_explicit, which normalizes module from filename and proceeds to emit the warning.

If the goal is to truly skip warnings in these shutdown-like states, you’ll need a distinct signal (e.g., have setup_context return an Option and early‑return in warn, or return a sentinel registry/module that warn checks before calling warn_explicit). Otherwise, it’d be good to adjust or remove the “skip warning” comments to match the actual behavior and avoid confusion. A minimal sketch of the Option shape follows the agent prompt below.

🤖 Prompt for AI Agents
In crates/vm/src/warn.rs around lines 405-441, the shutdown fallback branches
claim “skip warning” but still return a normal tuple so warn continues and emits
warnings; change the API so setup_context can signal “skip” (e.g., return
Option<(filename, lineno, module, registry)> or a Result with a distinct
variant) and update warn to early-return when setup_context returns None (or the
skip variant) so no warn_explicit is called during
interpreter-shutdown/no-globals cases; alternatively, if you prefer not to
change the signature, return a clear sentinel (like module
Some(vm.ctx.intern_str("__shutdown__")) or an empty registry flagged with a
special key) and add a check in warn to detect that sentinel and return
early—implement one of these approaches and remove or adjust the misleading
“skip warning” comments to match the new behavior.
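
A self-contained analog of the Option-based flow suggested above. setup_context, warn, and warn_explicit are illustrative stand-ins with simplified signatures, not the real VM functions:

fn setup_context(frames_available: bool) -> Option<(String, u32)> {
    if !frames_available {
        return None; // shutdown-like state: tell the caller to skip entirely
    }
    Some(("sys".to_owned(), 1))
}

fn warn(frames_available: bool) {
    let Some((filename, lineno)) = setup_context(frames_available) else {
        return; // no warn_explicit call, so the warning is genuinely skipped
    };
    warn_explicit(&filename, lineno);
}

fn warn_explicit(filename: &str, lineno: u32) {
    eprintln!("{filename}:{lineno}: warning emitted");
}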
