gc #7
Conversation
Commit summary:

- Fix clippy: use `is_multiple_of()` in `ebr/internal.rs`
- Fix unsafe block requirement in `ebr/sync/list.rs`
- Add `Send + Sync` impl for `GcState` to allow static storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
📝 Walkthrough

Implements a complete Epoch-Based Reclamation (EBR) garbage collection system alongside a CPython-compatible generational GC. Introduces lock-free synchronization primitives, redesigns reference counting to use 64-bit state machines, adds GC lifecycle hooks to PyObject, and integrates coarse-grained pinning into thread execution and interpreter finalization for cycle detection and safe deferred cleanup.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Thread as Thread (cs entry)
    participant Guard as EBR Guard
    participant Local as Local Handle
    participant Global as Global Epoch
    participant Queue as Deferred Queue
    Thread->>Guard: pin() → Guard
    Guard->>Local: acquire_handle()
    Local->>Local: advance local epoch
    rect rgb(220, 240, 255)
        Thread->>Guard: defer_unchecked(closure)
        Guard->>Local: defer(Deferred)
        Local->>Local: push to thread-local bag
    end
    Thread->>Guard: (critical section)
    Guard->>Local: unpin()
    Local->>Local: push_to_global()
    rect rgb(200, 220, 200)
        Local->>Queue: seal bag with epoch
        Global->>Queue: try_advance(current_epoch)
        Global->>Queue: collect() - scan sealed bags
        Queue->>Queue: execute deferred closures
    end
```
```mermaid
sequenceDiagram
    participant VM as VM (collect)
    participant GC as GcState
    participant Obj as PyObject
    participant Heap as Heap
    VM->>GC: collect(generation)
    GC->>GC: gather tracked objects
    GC->>Obj: traverse() for gc_refs
    Obj->>Obj: count strong refs
    GC->>GC: subtract internal refs
    rect rgb(255, 240, 200)
        GC->>Obj: is_finalized?
        Obj->>Obj: invoke __del__ if not
        Obj->>Obj: mark_finalized()
    end
    rect rgb(255, 220, 220)
        GC->>Obj: gc_pop_edges()
        Obj->>GC: break cycles
        GC->>GC: clear weak refs
    end
    GC->>GC: promote survivors
    GC->>Heap: dealloc unreachable
```
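The "subtract internal refs" step in the second diagram is the core of CPython-style cycle detection. The following is a minimal, self-contained sketch of just that step — `Obj`, `strong_count`, and `children` are illustrative stand-ins, not RustPython's actual types, and the real collector additionally re-marks objects reachable from externally-referenced roots before treating candidates as garbage:

```rust
// Sketch of the "subtract internal refs" phase of cycle detection.
use std::collections::HashMap;

struct Obj {
    id: usize,
    strong_count: usize,
    children: Vec<usize>, // ids of objects this object owns
}

/// Returns ids whose external reference count drops to zero:
/// every strong reference to them comes from inside the tracked set,
/// so they can only be kept alive by a cycle.
fn find_cycle_candidates(objs: &[Obj]) -> Vec<usize> {
    // Start gc_refs at the strong count, then subtract one for each
    // internal edge pointing at the object (CPython's approach).
    let mut gc_refs: HashMap<usize, isize> =
        objs.iter().map(|o| (o.id, o.strong_count as isize)).collect();
    for o in objs {
        for &child in &o.children {
            if let Some(r) = gc_refs.get_mut(&child) {
                *r -= 1;
            }
        }
    }
    objs.iter()
        .filter(|o| gc_refs[&o.id] <= 0)
        .map(|o| o.id)
        .collect()
}

fn main() {
    // a <-> b form a cycle with no external references.
    let objs = vec![
        Obj { id: 0, strong_count: 1, children: vec![1] },
        Obj { id: 1, strong_count: 1, children: vec![0] },
    ];
    assert_eq!(find_cycle_candidates(&objs), vec![0, 1]);
}
```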
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (1 warning) · ✅ Passed checks (2 passed)
Actionable comments posted: 10
🧹 Nitpick comments (16)
crates/vm/src/dict_inner.rs (1)

712-721: `pop_all_entries` keeps Dict invariants; consider reusing `clear` semantics

The method correctly:

- Obtains a mutable handle via `get_mut()` (no lock, requires `&mut self`).
- Resets `used`/`filled` and marks all indices `FREE` before draining entries.
- Returns an iterator of `(PyObjectRef, T)` for GC to consume, leaving the dict logically empty.

One behavioral difference vs `clear()` is that `indices` capacity is preserved instead of being reset to length 8. That's probably fine for GC-focused use, but if this gets used on very large dicts in hot paths you might want to either reuse `clear()`'s resizing or document this capacity retention explicitly.

crates/common/src/ebr/sync/once_lock.rs (1)
10-17: Consider using `Once::is_completed()` instead of a separate `AtomicBool`.

The comment on line 12 mentions that `Once::is_completed` requires Rust 1.43, but the workspace MSRV is 1.89.0 (per root Cargo.toml line 134). Using `Once::is_completed()` would eliminate the `is_initialized` field and the associated synchronization overhead.

🔎 Proposed simplification

```diff
 pub(crate) struct OnceLock<T> {
     once: Once,
-    // Once::is_completed requires Rust 1.43, so use this to track of whether they have been initialized.
-    is_initialized: AtomicBool,
     value: UnsafeCell<MaybeUninit<T>>,
-    // Unlike std::sync::OnceLock, we don't need PhantomData here because
-    // we don't use #[may_dangle].
 }
```

Then replace `self.is_initialized()` calls with `self.once.is_completed()`.

crates/vm/src/object/core.rs (1)
298-305: Subtle double-decrement prevention logic - add a clarifying comment.

The interaction between `already_cleared` and `decrement_ref_count` prevents double decrements when both GC and `drop_slow_inner` clear weakrefs. However, the logic is subtle and could benefit from a more explicit comment explaining the scenario.

```diff
 // Only decrement ref_count if requested AND not already cleared
+// This prevents double-decrement when:
+// 1. GC calls clear_for_gc (decrement_ref_count=false)
+// 2. Later, drop_slow_inner calls clear (decrement_ref_count=true)
+//    but obj is already None, so we skip the decrement
 if decrement_ref_count && !already_cleared {
```

crates/stdlib/src/gc.rs (1)
261-266: Silent error ignoring in callback invocation.

Errors from `set_item` and `callback.call` are silently discarded. While CPython also handles callback errors gracefully, completely ignoring them could make debugging difficult. Consider logging errors or using `run_unraisable` for callback failures, similar to how `__del__` errors are handled in core.rs.

```diff
-let _ = info.set_item("generation", vm.ctx.new_int(generation).into(), vm);
-let _ = info.set_item("collected", vm.ctx.new_int(collected).into(), vm);
-let _ = info.set_item("uncollectable", vm.ctx.new_int(uncollectable).into(), vm);
+// These set_item calls should not fail for simple int values
+info.set_item("generation", vm.ctx.new_int(generation).into(), vm).ok();
+info.set_item("collected", vm.ctx.new_int(collected).into(), vm).ok();
+info.set_item("uncollectable", vm.ctx.new_int(uncollectable).into(), vm).ok();
 for callback in callbacks {
-    let _ = callback.call((phase_str.clone(), info.clone()), vm);
+    if let Err(e) = callback.call((phase_str.clone(), info.clone()), vm) {
+        // Handle callback errors gracefully, similar to __del__ errors
+        vm.run_unraisable(e, Some("gc callback".to_owned()), callback);
+    }
 }
```

Cargo.toml (1)
208-209: Both dependencies are necessary for the EBR implementation.

The `memoffset` crate is actively used in `crates/common/src/ebr/internal.rs` (lines 45, 575, 582). While `std::mem::offset_of!` is available (MSRV 1.89.0 exceeds its 1.77 stabilization), consolidating on the standard library version could be a future refactoring. The `atomic` crate is essential—it provides `Atomic<T>` for arbitrary types, which `std::sync::atomic` doesn't offer (only concrete types like `AtomicUsize`).

crates/common/src/ebr/mod.rs (1)
23-23: Verify if `internal` should be publicly exposed.

The `internal` module is declared as `pub mod internal`, which exposes implementation details. This might be intentional for testing or advanced use cases within the rustpython workspace, but it's worth confirming that this is the desired visibility rather than `pub(crate)`.

🔎 Consider restricting visibility if not needed externally. If the `internal` module is only used within this crate:

```diff
-pub mod internal;
+pub(crate) mod internal;
```

If it needs to be accessible to other rustpython crates but not external users, consider using `pub(crate)` or visibility modifiers like `pub(in crate::ebr)`.

crates/vm/src/vm/mod.rs (1)
981-1052: Consider making the stdlib module list more maintainable.

The hardcoded list of stdlib module names (lines 1006-1029) and the path-based detection (`pylib`, `Lib`) are fragile:

- The stdlib list may become stale as modules are added/removed
- Path detection may fail on non-standard installations or alternative stdlib locations

Consider one of these alternatives:

- Use a centralized constant or configuration for stdlib module names
- Mark stdlib modules with a metadata flag during import
- Check if the module originated from frozen modules or builtin inits
That said, this is a reasonable first implementation for breaking cycles during shutdown.
crates/vm/src/builtins/function.rs (2)
53-57: Consider traversing the `name` and `qualname` fields.

The `traverse` method visits `type_params`, `annotations`, `module`, and `doc`, but not `name` and `qualname`. Since these can be `PyStr` subclasses (as noted in `pop_edges`), they may also participate in reference cycles and should be traversed for consistency.

🔎 Proposed addition to traverse

```diff
     self.type_params.lock().traverse(tracer_fn);
     self.annotations.lock().traverse(tracer_fn);
     self.module.lock().traverse(tracer_fn);
     self.doc.lock().traverse(tracer_fn);
+    self.name.lock().traverse(tracer_fn);
+    self.qualname.lock().traverse(tracer_fn);
 }
```
67-74: `try_lock` failure silently skips clearing defaults.

If `try_lock()` fails (returns `None`), the defaults and kwdefaults are not cleared. While this is likely intentional for GC safety (avoiding deadlock), it could lead to reference cycles not being broken if the lock is contended during collection. Consider adding a debug assertion or log to track this case.

crates/common/src/ebr/epoch.rs (1)
33-38: Verify `wrapping_sub` correctness for edge cases.

The optimization in `wrapping_sub` relies on the LSB difference being discarded by the right shift. The comment explains this, but consider adding a unit test to verify behavior at epoch boundaries (e.g., near `usize::MAX`); a sketch of such a test follows below.
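A boundary test along the lines the comment suggests might look like this; the `Epoch` type here is a stand-in reimplementation of the documented encoding (LSB pin flag, epoch value in the remaining bits), not the crate's actual type:

```rust
// Self-contained toy model of the epoch encoding for boundary testing.
#[derive(Clone, Copy, PartialEq, Debug)]
struct Epoch(usize); // LSB = pinned flag, epoch value lives in bits 1..

impl Epoch {
    fn wrapping_sub(self, rhs: Epoch) -> usize {
        // The LSB difference is discarded by the shift, so pin state
        // does not affect the epoch distance.
        self.0.wrapping_sub(rhs.0) >> 1
    }
}

fn main() {
    // Distance of one epoch across the usize wraparound boundary.
    let old = Epoch(usize::MAX & !1); // largest even (unpinned) value
    let new = Epoch(old.0.wrapping_add(2)); // wraps to 0
    assert_eq!(new.wrapping_sub(old), 1);
    // The pinned bit on either side must not change the distance.
    assert_eq!(Epoch(new.0 | 1).wrapping_sub(old), 1);
}
```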
crates/vm/src/stdlib/thread.rs (2)

445-469: Constructor uses indexing after length validation - safe, but consider bounds-checked access.

The constructor validates `seq.len() == 4` before indexing, so this is safe. However, using `get()` with pattern matching would be more idiomatic Rust and defensive against future refactoring.

🔎 Optional improvement using pattern matching

```diff
-    if seq.len() != 4 {
-        return Err(vm.new_type_error(format!(
-            "_ExceptHookArgs expected 4 arguments, got {}",
-            seq.len()
-        )));
-    }
-    Ok(Self {
-        exc_type: seq[0].clone(),
-        exc_value: seq[1].clone(),
-        exc_traceback: seq[2].clone(),
-        thread: seq[3].clone(),
-    })
+    match <[_; 4]>::try_from(seq) {
+        Ok([exc_type, exc_value, exc_traceback, thread]) => Ok(Self {
+            exc_type,
+            exc_value,
+            exc_traceback,
+            thread,
+        }),
+        Err(seq) => Err(vm.new_type_error(format!(
+            "_ExceptHookArgs expected 4 arguments, got {}",
+            seq.len()
+        ))),
+    }
```
499-522: Deep nesting in stderr fallback logic reduces readability.

The nested `if let`/`if`/`else` chain for getting stderr is hard to follow. Consider extracting to a helper function or using early returns.

🔎 Suggested refactor using a helper closure

```diff
+// Helper to get stderr, with fallback to thread._stderr
+let get_stderr = || -> Option<crate::PyObjectRef> {
+    if let Ok(stderr) = vm.sys_module.get_attr("stderr", vm) {
+        if !vm.is_none(&stderr) {
+            return Some(stderr);
+        }
+    }
+    let thread = thread.as_ref()?;
+    if vm.is_none(thread) {
+        return None;
+    }
+    let thread_stderr = thread.get_attr("_stderr", vm).ok()?;
+    if vm.is_none(&thread_stderr) {
+        return None;
+    }
+    Some(thread_stderr)
+};
+
-// Get stderr - fall back to thread._stderr if sys.stderr is None
-let stderr = match vm.sys_module.get_attr("stderr", vm) {
-    Ok(stderr) if !vm.is_none(&stderr) => stderr,
-    _ => {
-        // ... deeply nested logic ...
-    }
-};
+let stderr = match get_stderr() {
+    Some(s) => s,
+    None => return Ok(()),
+};
```

crates/common/src/ebr/internal.rs (1)
55-58: Static mutable variables are unsafe and difficult to use correctly.

`MAX_OBJECTS` and `MANUAL_EVENTS_BETWEEN_COLLECT` are `static mut`, requiring `unsafe` blocks for every access. Consider using `AtomicUsize` or `once_cell::sync::Lazy` for safe interior mutability, especially if these might be configurable at runtime.

🔎 Suggested safer alternative

```diff
-/// Maximum number of objects a bag can contain.
-static mut MAX_OBJECTS: usize = 64;
-
-#[allow(dead_code)]
-static mut MANUAL_EVENTS_BETWEEN_COLLECT: usize = 64;
+use std::sync::atomic::AtomicUsize;
+
+/// Maximum number of objects a bag can contain.
+static MAX_OBJECTS: AtomicUsize = AtomicUsize::new(64);
+
+#[allow(dead_code)]
+static MANUAL_EVENTS_BETWEEN_COLLECT: AtomicUsize = AtomicUsize::new(64);
```

Then update usages:

```diff
-Bag(Vec::with_capacity(unsafe { MAX_OBJECTS }))
+Bag(Vec::with_capacity(MAX_OBJECTS.load(Ordering::Relaxed)))
```

crates/vm/src/gc_state.rs (2)
406-418: Holding read locks during the entire collection may cause contention.

Lines 406-408 acquire read locks on all generation sets and hold them until line 513/523. During this time, `track_object` and `untrack_object` from other threads will block on write locks. For a GC that may take significant time, this could cause allocation pauses. Consider copying the object set and releasing locks earlier.

🔎 Alternative: copy and release locks early

```diff
-let gen_locks: Vec<_> = (0..=generation)
-    .filter_map(|i| self.generation_objects[i].read().ok())
-    .collect();
-let mut collecting: HashSet<GcObjectPtr> = HashSet::new();
-for gen_set in &gen_locks {
-    for &ptr in gen_set.iter() {
+for i in 0..=generation {
+    if let Ok(gen_set) = self.generation_objects[i].read() {
+        for &ptr in gen_set.iter() {
+            let obj = unsafe { ptr.0.as_ref() };
+            if obj.strong_count() > 0 {
+                collecting.insert(ptr);
+            }
+        }
+    } // Lock released here
+}
```

This trades off snapshot consistency for reduced lock contention.
690-712: `promote_survivors` has O(n*m) complexity - can be optimized.

For each survivor, the code iterates through all generations 0..=from_gen to find which one contains it (lines 700-710). Since objects should be in exactly one generation, this is inefficient. A more efficient approach would track which generation each object is in, or iterate the generation sets instead of the survivors.

🔎 Sketch of a more efficient approach

```rust
fn promote_survivors(&self, from_gen: usize, survivors: &HashSet<GcObjectPtr>) {
    if from_gen >= 2 {
        return;
    }
    let next_gen = from_gen + 1;
    // Process each generation once instead of per-survivor
    for gen_idx in 0..=from_gen {
        if let Ok(mut gen_set) = self.generation_objects[gen_idx].write() {
            let to_promote: Vec<_> = gen_set.intersection(survivors).copied().collect();
            for ptr in to_promote {
                gen_set.remove(&ptr);
                if let Ok(mut next_set) = self.generation_objects[next_gen].write() {
                    next_set.insert(ptr);
                }
            }
        }
    }
}
```

crates/common/src/ebr/collector.rs (1)
70-96: `LocalHandle` safely stores a raw pointer via `handle_count` tracking—add safety documentation to explain the invariants.

The unsafe code is sound: `Local` is heap-allocated via `RawShared::from_owned()` and lives longer than any `LocalHandle` due to the `handle_count` mechanism (initialized to 1, incremented/decremented on handle creation/drop, and only finalized when both `handle_count` and `guard_count` reach 0). The design correctly follows the epoch-based reclamation pattern. Add a safety comment to `LocalHandle` documenting these invariants so maintainers understand why the unsafe code is safe; one possible wording follows below.
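One possible wording for that safety comment (illustrative only; `Local` is stubbed here so the snippet stands alone):

```rust
struct Local; // stand-in for the crate's ebr::internal::Local

/// A handle to a registered `Local`.
///
/// # Safety
///
/// - `local` points to a heap allocation that is never freed while the
///   `Local`'s `handle_count` is non-zero; each `LocalHandle` owns one
///   unit of that count (taken on registration, released in `Drop`).
/// - The pointee is finalized only after both `handle_count` and
///   `guard_count` reach zero, so dereferencing `local` in methods is sound.
pub struct LocalHandle {
    local: *const Local,
}

fn main() {}
```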
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (11)
- `Cargo.lock` is excluded by `!**/*.lock`
- `Lib/test/support/__init__.py` is excluded by `!Lib/**`
- `Lib/test/test_builtin.py` is excluded by `!Lib/**`
- `Lib/test/test_dict.py` is excluded by `!Lib/**`
- `Lib/test/test_gc.py` is excluded by `!Lib/**`
- `Lib/test/test_generators.py` is excluded by `!Lib/**`
- `Lib/test/test_subprocess.py` is excluded by `!Lib/**`
- `Lib/test/test_super.py` is excluded by `!Lib/**`
- `Lib/test/test_threading.py` is excluded by `!Lib/**`
- `Lib/test/test_weakref.py` is excluded by `!Lib/**`
- `Lib/test/test_with.py` is excluded by `!Lib/**`
📒 Files selected for processing (40)
- Cargo.toml
- crates/common/Cargo.toml
- crates/common/src/ebr/collector.rs
- crates/common/src/ebr/default.rs
- crates/common/src/ebr/deferred.rs
- crates/common/src/ebr/epoch.rs
- crates/common/src/ebr/guard.rs
- crates/common/src/ebr/internal.rs
- crates/common/src/ebr/mod.rs
- crates/common/src/ebr/pointers.rs
- crates/common/src/ebr/sync/list.rs
- crates/common/src/ebr/sync/mod.rs
- crates/common/src/ebr/sync/once_lock.rs
- crates/common/src/ebr/sync/queue.rs
- crates/common/src/lib.rs
- crates/common/src/refcount.rs
- crates/derive-impl/src/pyclass.rs
- crates/derive-impl/src/pystructseq.rs
- crates/derive-impl/src/util.rs
- crates/stdlib/src/fcntl.rs
- crates/stdlib/src/gc.rs
- crates/stdlib/src/ssl.rs
- crates/vm/src/builtins/dict.rs
- crates/vm/src/builtins/function.rs
- crates/vm/src/builtins/list.rs
- crates/vm/src/builtins/str.rs
- crates/vm/src/builtins/super.rs
- crates/vm/src/builtins/tuple.rs
- crates/vm/src/dict_inner.rs
- crates/vm/src/gc_state.rs
- crates/vm/src/lib.rs
- crates/vm/src/object/core.rs
- crates/vm/src/object/traverse.rs
- crates/vm/src/object/traverse_object.rs
- crates/vm/src/stdlib/thread.rs
- crates/vm/src/vm/context.rs
- crates/vm/src/vm/interpreter.rs
- crates/vm/src/vm/mod.rs
- crates/vm/src/vm/thread.rs
- crates/vm/src/warn.rs
🧰 Additional context used
🧬 Code graph analysis (24)
crates/vm/src/vm/context.rs (2)
- crates/vm/src/object/core.rs (2): new_ref (1290-1307), default (342-344)
- crates/vm/src/builtins/list.rs (1): new_ref (91-93)

crates/vm/src/lib.rs (1)
- crates/vm/src/gc_state.rs (1): gc_state (742-744)

crates/vm/src/builtins/dict.rs (2)
- crates/vm/src/builtins/list.rs (2): traverse (62-64), pop_edges (66-73)
- crates/vm/src/object/traverse.rs (9): traverse (34-34, 43-45, 49-51, 55-55, 60-64, 72-76, 84-88, 96-100), pop_edges (39-39)

crates/common/src/ebr/default.rs (4)
- crates/common/src/ebr/collector.rs (6): new (30-32), global_epoch (41-43), drop (91-95, 248-250, 313-315, 393-395)
- crates/common/src/ebr/epoch.rs (1): new (90-93)
- crates/common/src/ebr/sync/once_lock.rs (2): new (25-31), drop (98-103)
- crates/common/src/ebr/guard.rs (1): drop (150-154)

crates/vm/src/warn.rs (6)
- crates/vm/src/vm/mod.rs (1): modules (990-990)
- crates/vm/src/builtins/type.rs (1): __dict__ (963-965)
- crates/vm/src/stdlib/builtins.rs (1): globals (421-423)
- crates/vm/src/object/core.rs (1): dict (732-734)
- crates/vm/src/builtins/dict.rs (1): dict (782-782)
- crates/vm/src/frame.rs (1): dict (801-801)

crates/derive-impl/src/pystructseq.rs (2)
- crates/vm/src/builtins/str.rs (2): try_traverse (1932-1934), try_pop_edges (1936-1938)
- crates/vm/src/object/traverse.rs (2): try_traverse (19-19), try_pop_edges (21-21)

crates/stdlib/src/fcntl.rs (1)
- extra_tests/snippets/stdlib_ctypes.py (1): c_ulong (94-95)

crates/common/src/ebr/guard.rs (2)
- crates/common/src/ebr/internal.rs (4): defer (349-359), new (68-70, 160-166), pin (392-454)
- crates/common/src/ebr/deferred.rs (5): mem (33-33, 34-34, 37-37, 37-37), new (32-70)

crates/vm/src/object/traverse_object.rs (1)
- crates/vm/src/object/core.rs (4): debug_obj (87-93), drop_dealloc_obj (84-86), try_pop_edges_obj (103-106), try_trace_obj (96-100)

crates/vm/src/builtins/function.rs (2)
- crates/vm/src/builtins/list.rs (2): traverse (62-64), pop_edges (66-73)
- crates/vm/src/builtins/tuple.rs (2): traverse (43-45), pop_edges (47-51)

crates/common/src/ebr/sync/queue.rs (1)
- crates/common/src/ebr/guard.rs (1): unprotected (173-177)

crates/common/src/ebr/collector.rs (2)
- crates/common/src/ebr/internal.rs (9): fmt (117-119), default (101-103), new (68-70, 160-166), register (300-323), pin (392-454), drop (107-112), release_handle (517-526), collector (333-335)
- crates/common/src/ebr/guard.rs (2): fmt (158-160), drop (150-154)

crates/vm/src/vm/thread.rs (1)
- crates/common/src/ebr/default.rs (1): cs (24-26)

crates/vm/src/stdlib/thread.rs (1)
- crates/vm/src/vm/thread.rs (2): ensure_pinned (30-36), drop_guard (55-59)

crates/stdlib/src/ssl.rs (2)
- crates/vm/src/builtins/function.rs (1): traverse (47-58)
- crates/vm/src/builtins/dict.rs (1): traverse (48-50)

crates/common/src/ebr/mod.rs (5)
- crates/common/src/ebr/internal.rs (2): collector (333-335), default (101-103)
- crates/common/src/ebr/default.rs (1): collector (11-15)
- crates/common/src/ebr/sync/list.rs (1): default (133-137)
- crates/common/src/ebr/collector.rs (1): default (21-25)
- crates/common/src/ebr/pointers.rs (1): default (29-31)

crates/vm/src/builtins/tuple.rs (3)
- crates/vm/src/builtins/list.rs (2): traverse (62-64), pop_edges (66-73)
- crates/vm/src/builtins/dict.rs (2): traverse (48-50), pop_edges (52-58)
- crates/vm/src/object/traverse.rs (10): traverse (34-34, 43-45, 49-51, 55-55, 60-64, 72-76, 84-88, 96-100, 105-112), pop_edges (39-39)

crates/vm/src/object/traverse.rs (4)
- crates/vm/src/builtins/str.rs (2): try_traverse (1932-1934), try_pop_edges (1936-1938)
- crates/vm/src/builtins/list.rs (1): pop_edges (66-73)
- crates/vm/src/builtins/dict.rs (1): pop_edges (52-58)
- crates/vm/src/builtins/tuple.rs (1): pop_edges (47-51)

crates/vm/src/builtins/str.rs (1)
- crates/vm/src/object/traverse.rs (2): try_traverse (19-19), try_pop_edges (21-21)

crates/common/src/ebr/epoch.rs (1)
- crates/common/src/ebr/internal.rs (2): new (68-70, 160-166)

crates/common/src/refcount.rs (2)
- crates/common/src/ebr/default.rs (2): cs (24-26), global_epoch (44-46)
- crates/common/src/ebr/collector.rs (1): global_epoch (41-43)

crates/vm/src/gc_state.rs (3)
- crates/common/src/refcount.rs (2): default (244-246), with_deferred_drops (396-413)
- crates/common/src/ebr/default.rs (1): cs (24-26)
- crates/common/src/ebr/guard.rs (1): drop (150-154)

crates/vm/src/vm/mod.rs (1)
- crates/vm/src/vm/thread.rs (1): reactivate_guard (44-50)

crates/stdlib/src/gc.rs (1)
- crates/vm/src/gc_state.rs (18): gc_state (742-744), enable (164-166), disable (169-171), collect (368-370), get_threshold (184-190), set_threshold (64-66, 193-201), get_count (204-210), get_debug (174-176), set_debug (179-181), get_stats (213-219), stats (68-75), get_objects (297-338), new (44-54, 134-156), freeze (720-726), unfreeze (729-732), get_freeze_count (715-717)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Collect benchmark data
- GitHub Check: Collect what is left data
- GitHub Check: Collect code coverage data
- GitHub Check: Collect regression test data
🔇 Additional comments (73)
crates/derive-impl/src/util.rs (1)
375-375: LGTM! The `pop_edges` attribute is properly integrated throughout the codebase.

The addition of "pop_edges" to ALLOWED_NAMES follows the pattern established by other attributes like "traverse" and enables `#[pyclass(pop_edges)]` to pass validation. The macro implementation already checks for this attribute using `_has_key("pop_edges")` and generates the appropriate `HAS_POP_EDGES` constant and `try_pop_edges` method based on its presence. Multiple builtin types (dict, list, tuple, function) already use this attribute with corresponding `pop_edges` method implementations, and the GC system successfully leverages it to extract child references for circular reference resolution.

crates/stdlib/src/fcntl.rs (1)
95-95: Conversion logic is correct; i64 choice is reasonable for the Python API.The truncation-then-zero-extend pattern correctly handles both positive values and negative i32 values passed as i64 (preserving their bit patterns). The i64 parameter type is appropriate for a Python API—it accepts a wider range of Python integer inputs while still safely truncating to u32 before passing to the syscall, which aligns with CPython's approach of accepting platform-sized integers for ioctl request codes. No breaking changes to internal Rust callers were found.
crates/common/Cargo.toml (1)
34-38: EBR dependency block looks consistent with existing workspace patterns.

The added `crossbeam-utils`, `memoffset`, `scopeguard`, and `atomic` workspace dependencies are grouped clearly under an EBR comment and match the style of existing entries; no issues from the manifest side.

crates/derive-impl/src/pyclass.rs (1)
608-643: MaybeTraverse generation for `traverse`/`pop_edges` looks consistent with the GC design.

The new `MaybeTraverse` impl:

- Sets `IS_TRACE` based on `has_traverse` and only calls `Traverse::traverse` when requested.
- Exposes a `HAS_POP_EDGES` flag and routes `try_pop_edges` to `Traverse::pop_edges` only when `pop_edges` is present.
- Leaves both hooks as no-ops when the attributes are absent, so all `#[pyclass]` types implement `MaybeTraverse` uniformly.

This matches the documented cases (manual vs derived traverse, optional pop_edges) and keeps the generated code straightforward; aside from the `_optional_str` error handling above, the structure looks solid.

Also applies to: 699-706
crates/stdlib/src/ssl.rs (1)
2296-2345: GC traversal configuration for `_SSLSocket` looks appropriate.

Marking `_SSLSocket` as `#[pyclass(..., traverse)]` and selectively skipping fields with `#[pytraverse(skip)]` aligns well with what actually holds Python references:

- Traced: `sock`, `context`, `owner`, `session`, `incoming_bio`, `outgoing_bio`, `pending_context` — all contain `PyObjectRef`/`PyRef` and should participate in GC.
- Skipped: `server_side`, `server_hostname`, `connection`, `handshake_done`, `session_was_reused`, `verified_chain`, `sni_state`, `client_hello_buffer`, `shutdown_state`, `deferred_cert_error` — these are pure Rust state or TLS/certificate structures with no embedded Python objects.

This should give the collector visibility into real Python edges (including the `_SSLSocket` ↔ owner/session relationships) without forcing `Traverse` impls on the rustls plumbing types.

crates/common/src/ebr/sync/once_lock.rs (1)
97-103: LGTM: Drop implementation is correct.

The Drop implementation properly checks initialization before dropping the inner value, and uses `drop_in_place` correctly through a raw pointer cast.

crates/vm/src/object/core.rs (3)
870-888: Good: proper finalization guard prevents double `__del__` calls.

The `is_finalized`/`mark_finalized` pattern correctly prevents `__del__` from being called twice on resurrected objects. This aligns with CPython's behavior.

1290-1307: LGTM: GC tracking during object creation.

The tracking logic correctly tracks objects with `IS_TRACE` or an instance dict, matching the untracking condition in `drop_slow`. The `maybe_collect` call enables automatic GC triggering.

831-856: Resurrection detection logic is sound.

The `RefCount::inc()` behavior is correctly implemented as documented: when called on strong count 0, it performs a double increment (0→2) by executing two separate atomic operations when the first increment observes strong count = 0. This is intentional for thread safety, as confirmed in the RefCount implementation (crates/common/src/refcount.rs:273-276). The resurrection detection correctly leverages this: after `inc()`, the expected strong count is 2. If `after_del` exceeds this, the `__del__` method created new references and the object is resurrected; the second `dec()` is skipped to keep the object alive. Otherwise, both decrements run to return the count to 0, matching the double increment. The logic is correct; a toy model of the check follows below.
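A toy model of the described check — not the real `RefCount`, just the documented double-increment behavior and the `after_del` comparison:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

struct ToyRefCount(AtomicU32);

impl ToyRefCount {
    // Mirrors the documented behavior: incrementing from strong == 0
    // performs a second increment, yielding 2.
    fn inc_from_zero(&self) -> u32 {
        let prev = self.0.fetch_add(1, Ordering::AcqRel);
        if prev == 0 {
            self.0.fetch_add(1, Ordering::AcqRel);
        }
        self.0.load(Ordering::Acquire)
    }
}

fn run_del_and_check(rc: &ToyRefCount, del_adds_reference: bool) -> bool {
    let expected = rc.inc_from_zero(); // == 2 when starting from 0
    if del_adds_reference {
        rc.0.fetch_add(1, Ordering::AcqRel); // __del__ stored a new reference
    }
    let after_del = rc.0.load(Ordering::Acquire);
    // Resurrected iff __del__ created references beyond the expected 2.
    after_del > expected
}

fn main() {
    assert!(!run_del_and_check(&ToyRefCount(AtomicU32::new(0)), false));
    assert!(run_del_and_check(&ToyRefCount(AtomicU32::new(0)), true));
}
```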
crates/stdlib/src/gc.rs (1)

50-86: LGTM: the `collect` function correctly implements generation-based collection.

The implementation properly validates generation bounds, invokes start/stop callbacks, forces collection even when GC is disabled (matching CPython behavior), and transfers garbage to the VM context.
crates/common/src/lib.rs (1)
17-17: LGTM: EBR module export.

The new `ebr` module is correctly added to the public API surface, enabling epoch-based reclamation functionality across the crate.
crates/vm/src/lib.rs (1)

80-80: LGTM: GC state module export.

The `gc_state` module is correctly exposed as a public module, enabling GC functionality to be accessed from the stdlib and other components.

crates/vm/src/builtins/str.rs (1)
1928-1938: LGTM: correct traversal implementation for `PyUtf8Str`.

`PyUtf8Str` correctly implements `MaybeTraverse`:

- `IS_TRACE = true` enables GC participation
- `HAS_POP_EDGES = false` is correct since strings don't hold references to other Python objects
- `try_pop_edges` is appropriately a no-op

crates/vm/src/object/traverse.rs (3)
16-17: LGTM!

The `HAS_POP_EDGES` flag follows the same pattern as `IS_TRACE` and provides a clean compile-time capability check for the pop_edges functionality.

21-21: LGTM!

The `try_pop_edges` method signature and default implementation are appropriate. The `&mut self` parameter correctly allows implementations to take ownership of child references during GC cleanup.

36-39: LGTM!

The `pop_edges` method is well-documented with clear semantics for circular reference resolution. The default no-op implementation is safe, and the relevant code snippets show that concrete types like `PyDict`, `PyList`, and `PyTuple` properly implement this method to extract their owned child references.
crates/vm/src/vm/context.rs (3)

55-57: LGTM!

The new GC state fields are appropriately typed as `PyListRef` and clearly documented. The public visibility is correct for VM-level GC integration.

336-338: LGTM!

The initialization of `gc_callbacks` and `gc_garbage` follows the correct pattern used for other context objects like `empty_bytes`. Both lists are properly tracked by the GC system via `PyRef::new_ref`.

360-361: LGTM!

The struct initialization correctly includes the new GC fields.
crates/common/src/ebr/sync/mod.rs (1)
1-5: LGTM!

The synchronization primitives module structure is clean and appropriately scoped with `pub(crate)` visibility for internal EBR use.

crates/common/src/ebr/mod.rs (2)
1-17: LGTM!

The module documentation provides a clear and comprehensive explanation of Epoch-Based Reclamation, including key concepts like pinning and critical sections.

27-30: LGTM!

The re-exports provide a clean public API surface, making core EBR types and functions easily accessible through the module root.
crates/vm/src/builtins/super.rs (2)
64-64: LGTM!

The type signature change from `PyTypeRef` to `PyObjectRef` provides more flexibility and aligns with Python's dynamic type system, where `super()` can accept any object that is then checked at runtime.

78-81: LGTM!

The downcast logic correctly validates that the first argument to `super()` is a type object. The error message is clear and matches Python conventions, and the error handling via `map_err` is appropriate.

crates/vm/src/builtins/list.rs (2)
60-74: LGTM on the Traverse implementation.

The `unsafe impl Traverse` correctly visits all owned `PyObjectRef`s. The `pop_edges` implementation using `try_write()` is appropriate for GC scenarios where the object should be unreachable. One minor observation: if `try_write()` returns `None` (lock contention), elements won't be popped. The comment explains this is safe because objects should be unreachable during GC, but you may want to consider adding a debug assertion or logging for the failure case to catch unexpected contention during development.

27-33: Pyclass attributes look correct.

The `traverse = "manual"` and `pop_edges` attributes properly indicate that this type provides a manual `Traverse` implementation with edge-popping capability for GC, consistent with the pattern used for `PyDict` and `PyTuple`.

crates/vm/src/vm/thread.rs (4)
15-20: Thread-local EBR guard storage looks correct.

The `RefCell<Option<Guard>>` pattern is appropriate for thread-local state that needs interior mutability. The doc comment clearly explains this is for coarse-grained pinning.

24-36: LGTM on `ensure_pinned()`.

The function correctly uses `ebr::cs()` to enter a critical section and stores the guard thread-locally. The idempotent check-then-set pattern is appropriate.

38-50: LGTM on `reactivate_guard()`.

Correctly calls `reactivate()` on the existing guard to allow epoch advancement at safe points. The pattern of borrowing mutably only when a guard exists is efficient.

52-59: LGTM on `drop_guard()`.

Setting the guard to `None` will drop the `Guard`, which unpins the thread as shown in the relevant snippet from guard.rs (the `Drop` impl calls `local.unpin()`). A sketch of the overall pattern follows below.
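A minimal sketch of the coarse-grained pinning pattern these three functions implement, written against crossbeam-epoch's public API (`pin()`/`repin()`) since that crate is publicly documented; the crate-local `ebr::cs()`/`reactivate()` API described above is analogous:

```rust
use std::cell::RefCell;
use crossbeam_epoch::{self as epoch, Guard};

thread_local! {
    // One long-lived guard per thread: pinned at thread start,
    // dropped (unpinning the thread) at thread exit.
    static EBR_GUARD: RefCell<Option<Guard>> = RefCell::new(None);
}

fn ensure_pinned() {
    EBR_GUARD.with(|g| {
        let mut slot = g.borrow_mut();
        if slot.is_none() {
            *slot = Some(epoch::pin());
        }
    });
}

fn reactivate_guard() {
    // At safe points (e.g. frame boundaries), repin so the global
    // epoch can advance and deferred destructors can run.
    EBR_GUARD.with(|g| {
        if let Some(guard) = g.borrow_mut().as_mut() {
            guard.repin();
        }
    });
}

fn drop_guard() {
    EBR_GUARD.with(|g| *g.borrow_mut() = None);
}

fn main() {
    ensure_pinned();
    reactivate_guard();
    drop_guard();
}
```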
crates/vm/src/vm/mod.rs (2)

475-480: Correct CPython-aligned behavior for unraisable exceptions during shutdown.

Suppressing unraisable exceptions when `finalizing` is `true` matches CPython's behavior, where daemon thread exceptions and `__del__` errors are silently ignored during interpreter shutdown.

533-538: Good placement for EBR guard reactivation.

Frame boundaries are appropriate safe points for reactivating the EBR guard. This allows the GC to advance epochs and free deferred objects while ensuring no object references are held across the reactivation. The `#[cfg(feature = "threading")]` guard is correct since EBR is only relevant in threaded contexts.

crates/vm/src/vm/interpreter.rs (2)
134-145: Good shutdown sequence for thread cleanup.

Setting the `finalizing` flag before calling `threading._shutdown()` ensures that any exceptions from daemon threads during shutdown are suppressed. The error handling with `let _ = shutdown.call((), vm)` silently ignores failures, which is appropriate during finalization.

149-158: Code pattern verified — both `gc_state()` and `collect_force()` are properly exposed and functional.

The two-phase GC collection around module finalization is correct:

- First pass collects existing cycles before breaking module references
- Module cleanup breaks references to enable collection
- Second pass collects the newly unreachable cyclic garbage

Both `crate::gc_state::gc_state()` (a public function returning `&'static GcState`) and `collect_force()` (a public method accepting a generation parameter) are properly defined and accessible.

crates/derive-impl/src/pystructseq.rs (1)
610-622: LGTM on the MaybeTraverse implementation for struct sequences.

The implementation correctly:

- Sets `IS_TRACE = true` to enable traversal
- Sets `HAS_POP_EDGES = false` since edge popping is delegated to the inner `PyTuple`
- Provides a no-op `try_pop_edges` with an appropriate comment
- Delegates `try_traverse` to the inner tuple (line 616)

This follows the same pattern as `PyUtf8Str` and other wrapper types.

crates/vm/src/builtins/tuple.rs (2)
40-52: Correct Traverse implementation for PyTuple.

The implementation is sound:

- `traverse` properly visits all elements via delegation
- `pop_edges` uses `std::mem::take`, which replaces `self.elements` with an empty boxed slice and returns the original
- Converting to `Vec` via `into_vec()` and extending is correct

Unlike `PyList`, no locking is needed here because tuples are conceptually immutable (the elements are fixed at construction), and during GC the object is unreachable anyway.

28-28: Pyclass attributes correctly configured.

The `traverse = "manual"` and `pop_edges` attributes match the manual `Traverse` implementation below.

crates/common/src/ebr/default.rs (3)
33-41: Good fallback handling in `with_handle`.

The `try_with` with `unwrap_or_else` fallback correctly handles the case where the thread-local `HANDLE` has been destroyed (e.g., during thread exit). By registering a new temporary handle with the collector, the function remains functional even during thread teardown. This is validated by the `pin_while_exiting` test, which demonstrates that pinning after `HANDLE` is dropped doesn't panic.

11-26: Clean EBR default module design.

The module correctly implements lazy initialization:

- `OnceLock` ensures the global `Collector` is initialized exactly once
- Thread-local `HANDLE` is registered lazily on first use per thread
- `cs()` provides a simple API for entering critical sections

48-76: Good test coverage for an edge case.

The `pin_while_exiting` test validates that calling `cs()` during thread exit (after `HANDLE` is dropped but before other thread-locals like `FOO`) doesn't panic. This is an important safety property for the EBR system.

crates/vm/src/object/traverse_object.rs (2)
17-19: LGTM! Documentation and signature are clear.

The `pop_edges` hook is well-documented and correctly typed as an optional function pointer for extracting child references before deallocation.

34-40: LGTM! Conditional initialization follows the existing `trace` pattern.

The const block pattern mirrors the `trace` field initialization, maintaining consistency in how optional vtable hooks are configured based on type traits.

crates/common/src/ebr/deferred.rs (1)
74-77: `call` consumes `self` but doesn't prevent double-drop.

The `call` method consumes `self`, which is correct. However, since there's no `Drop` impl, ensure that `Deferred` instances are always consumed via `call()` and never dropped without invocation. The current test coverage suggests this is the intended usage pattern.

crates/common/src/ebr/guard.rs (2)
116-136: Good use of `scopeguard::defer!` for panic safety in `reactivate_after`.

The implementation correctly uses `scopeguard::defer!` to ensure the guard is re-pinned even if the provided function panics. The `mem::forget(local.pin())` pattern properly transfers ownership of the new guard's lifetime to the existing `Guard` struct.

148-155: LGTM! The `Drop` implementation correctly handles a null local.

The `as_ref()` pattern safely handles both regular guards and unprotected guards (with a null local pointer).

crates/common/src/ebr/epoch.rs (2)
16-20: LGTM! Clear documentation of the bit layout.

The LSB-as-pinned-flag design is well-documented and the `data: usize` representation is appropriate for atomic operations.

62-70: LGTM! `successor` correctly preserves pin state.

Adding 2 increments the epoch value (stored in bits 1 and up) while preserving the LSB (pin state). This is correct given the bit layout; a small model of the encoding follows below.
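A stand-in model of the documented word layout (not the crate's `Epoch` type) that demonstrates both properties:

```rust
// LSB = pinned flag, epoch value in the remaining bits.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Epoch { data: usize }

impl Epoch {
    fn starting() -> Self { Epoch { data: 0 } }
    fn is_pinned(self) -> bool { self.data & 1 == 1 }
    fn pinned(self) -> Self { Epoch { data: self.data | 1 } }
    fn unpinned(self) -> Self { Epoch { data: self.data & !1 } }
    /// Adding 2 bumps the epoch stored in bits 1.. and leaves the LSB alone.
    fn successor(self) -> Self { Epoch { data: self.data.wrapping_add(2) } }
    fn value(self) -> usize { self.data >> 1 }
}

fn main() {
    let e = Epoch::starting().pinned();
    let s = e.successor();
    assert!(s.is_pinned());               // pin state preserved
    assert_eq!(s.value(), e.value() + 1); // epoch advanced by one
    assert_eq!(s.unpinned().data & 1, 0);
}
```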
crates/common/src/ebr/sync/list.rs (4)
219-233: `Drop` uses an `unprotected()` guard - verify thread safety assumptions.

The `Drop` implementation uses `unprotected()`, which bypasses EBR protection. This is only safe if:

- The list is not being concurrently accessed by other threads
- All elements have already been marked as deleted (checked by assertion)

The assertion on line 227 (`assert_eq!(succ.tag(), 1)`) will panic if any element wasn't properly deleted before the list is dropped. Consider using `debug_assert!` or returning an error if this is a recoverable condition.
162-194: LGTM! Insert uses the correct lock-free CAS pattern.

The insert operation correctly:

- Reads the current head
- Sets the new entry's next pointer to the current head
- Uses `compare_exchange_weak` in a retry loop
- Uses appropriate memory orderings (Release for success, Relaxed for retry)

A generic sketch of this pattern is shown below.
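For reference, here is a self-contained version of that CAS retry loop (a Treiber-style push using plain raw pointers; the crate's version additionally handles tagged pointers and EBR-based reclamation):

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

struct Node<T> {
    value: T,
    next: *mut Node<T>,
}

struct Stack<T> {
    head: AtomicPtr<Node<T>>,
}

impl<T> Stack<T> {
    fn new() -> Self {
        Stack { head: AtomicPtr::new(std::ptr::null_mut()) }
    }

    fn push(&self, value: T) {
        let node = Box::into_raw(Box::new(Node { value, next: std::ptr::null_mut() }));
        let mut head = self.head.load(Ordering::Relaxed);
        loop {
            unsafe { (*node).next = head };
            // Release on success publishes the node's contents; a failed
            // CAS only needs Relaxed and retries with the observed head.
            match self.head.compare_exchange_weak(
                head, node, Ordering::Release, Ordering::Relaxed,
            ) {
                Ok(_) => return,
                Err(current) => head = current,
            }
        }
    }
}

fn main() {
    let s = Stack::new();
    s.push(1);
    s.push(2);
    // Nodes are leaked intentionally in this sketch; real code
    // reclaims them through the EBR machinery.
}
```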
239-296: Iterator correctly handles concurrent modifications.

The iterator properly:

- Detects deleted nodes via the tag bit
- Attempts to unlink deleted nodes via CAS
- Restarts from head when the predecessor is deleted
- Returns `IterError::Stalled` to signal a restart

The logic is correct for a lock-free list traversal with concurrent modifications.
306-321: Test `IsElement` implementation uses a raw pointer cast.

The `finalize` implementation correctly defers destruction via the guard. The `unsafe` block is properly scoped with a safety comment.

crates/vm/src/builtins/dict.rs (1)
46-59: No action required.

The `pop_all_entries` method exists and is correctly implemented in dict_inner.rs at line 715. It returns an iterator of `(PyObjectRef, T)` pairs, which matches the expected usage in `pop_edges`. The `Traverse` implementation properly delegates to `self.entries.traverse` and correctly collects both keys and values in `pop_edges`.

crates/vm/src/builtins/function.rs (1)
85-92: No safety issue with `Context::genesis()` during GC.

The code is safe. `Context::genesis()` uses Rust's `static_cell` with the `get_or_init()` pattern, which is thread-safe and idempotent—it initializes exactly once and returns a `&'static` reference. The method cannot panic, even on first access. Additionally, garbage collection only runs after the `VirtualMachine` is fully initialized, and VM initialization explicitly calls `Context::genesis()` before any object creation. The `pop_edges` method will always have a valid, initialized genesis context available.

crates/vm/src/stdlib/thread.rs (1)
351-376: EBR pinning correctly brackets thread execution.

The `ensure_pinned()` at thread start and `drop_guard()` at thread exit properly implement coarse-grained EBR pinning. This ensures the GC won't reclaim objects while the thread is active. One consideration: if the thread panics between `ensure_pinned()` and `drop_guard()`, the guard may not be dropped. However, since `EBR_GUARD` is thread-local, it will be cleaned up when the thread exits anyway.

crates/common/src/ebr/collector.rs (2)
104-423: Comprehensive test coverage for EBR pinning and reclamation.

The tests cover critical scenarios: reentrant pinning, local bag flushing, garbage buffering, epoch advancement under pinning, stress tests with multiple threads, and array destruction. Good use of `#[cfg(miri)]` for reduced iteration counts during Miri testing.

15-16: Manual `Send` and `Sync` implementations are necessary and appropriately justified.

The `Collector` wraps `Arc<Global>`, which is not automatically `Send + Sync`. This is evidenced by the `#[allow(clippy::arc_with_non_send_sync)]` lint suppression in `Collector::default()`, which clippy uses to warn when `Arc` wraps types that don't implement `Send` or `Sync`. Since `Global` contains a `List<Local>` with intrusive pointers to `Local` entries—which themselves contain `UnsafeCell`—the manual implementations are correctly required to affirm thread-safety. These impls should remain in place.

crates/common/src/ebr/sync/queue.rs (4)
40-42: Manual `Send` and `Sync` impls are appropriate for the queue.

The safety comment on line 40 is accurate: individual `T` values are not accessed concurrently since they're only read/written during push/pop, which are atomic operations. The `Send` bound on `T` is correctly required.

141-169: `pop_if_internal` reads data before the CAS - potential for reading invalid data if the condition races.

The condition reads `n.data.as_ptr()` at line 151 before the CAS at line 153. If another thread pops this node between the read and the CAS:

- The CAS will fail (returning `Err`)
- But the condition was evaluated on potentially stale/freed data

However, since the guard is held during this operation, EBR guarantees the memory won't be reclaimed until the guard is dropped. The `T: Sync` bound ensures the read is safe.

201-213: Drop implementation correctly drains the queue before destroying the sentinel.

The `unprotected()` guard usage in `Drop` is safe because `drop` has exclusive access (`&mut self`). The sentinel node is properly destroyed after draining all data nodes.

249-257: Test wrapper's `pop()` spins indefinitely - acceptable for tests, but would deadlock on an empty queue.

The `pop` method in the test wrapper loops forever until an element is available. This is fine for test scenarios where producers are guaranteed to push, but the comment or implementation could note this is intentional for testing.

crates/common/src/ebr/internal.rs (4)
134-141: SealedBag expiration uses a 3-epoch distance - matches EBR guarantees.

The `is_expired` check at line 139 ensures bags are only reclaimed when `global_epoch.wrapping_sub(self.epoch) >= 3`. This is correct for EBR: a pinned thread can observe at most one epoch advancement, so 3 epochs guarantees all pinned threads have unpinned since the bag was sealed. A sketch of the rule is shown below.
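The rule itself is small enough to model directly; this sketch uses plain integer epochs in place of the crate's `Epoch` type:

```rust
struct SealedBag {
    epoch: usize, // global epoch at the time the bag was sealed
    // ... deferred destructors would live here ...
}

impl SealedBag {
    fn is_expired(&self, global_epoch: usize) -> bool {
        // A pinned thread lags the global epoch by at most one step,
        // so a distance of 3 means no thread can still hold references
        // stashed in this bag.
        global_epoch.wrapping_sub(self.epoch) >= 3
    }
}

fn main() {
    let bag = SealedBag { epoch: 5 };
    assert!(!bag.is_expired(7));
    assert!(bag.is_expired(8));
}
```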
390-454: `Local::pin` implements correct EBR pinning with an x86 optimization.

The pinning logic:

- Increments `guard_count` (lines 395-396)
- On first pin (`guard_count == 0`), stores the pinned epoch with proper synchronization
- Uses an x86-specific `compare_exchange` optimization (lines 406-434) for performance
- Retries if the global epoch advanced during pinning (lines 440-444)
- Resets the advance counter on epoch change (lines 447-450)

The x86 hack using `compare_exchange` instead of `mfence` is a known optimization from crossbeam-epoch.
456-480: `Local::unpin` triggers collection when the last guard is dropped.

The unpin logic at lines 460-470 triggers collection while still pinned (`guard_count == 1`). This is intentional - collection requires being pinned to safely traverse the object graph. The `collecting` flag prevents reentrancy. Note: line 472 decrements `guard_count` after the collection loop, which is correct - the `ManuallyDrop` guard at line 465 prevents a double-decrement.
573-593: The `IsElement` trait implementation uses `offset_of!` for the intrusive list.

The `entry_of` and `element_of` implementations correctly convert between `Local` and `Entry` using `offset_of!`. The `finalize` method defers destruction of the `Local` node. Line 590: the `RawShared::from` conversion from `*const Local` is safe because the caller guarantees the entry belongs to a valid `Local`.

crates/common/src/ebr/pointers.rs (3)
62-72: High tag constants and bit manipulation are correct.

`HIGH_TAG_WIDTH = 4` uses the top 4 bits of the pointer. The `high_bits()` and `high_bits_pos()` calculations are correct:

- `high_bits_pos() = 64 - 4 = 60` (on 64-bit)
- `high_bits() = 0xF << 60 = 0xF000_0000_0000_0000`

This reserves the top 4 bits for epoch tagging while preserving pointer validity in the middle bits.

148-156: `low_bits<T>()` correctly computes the alignment-based tag mask.

The function `(1 << align_of::<T>().trailing_zeros()) - 1` computes the maximum value that fits in the alignment bits. For example, if `align_of::<T>() == 8`, this returns `0b111 = 7`. A small demonstration of the packing follows below.
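A self-contained demonstration of alignment-based tag packing (plain functions standing in for the `RawShared` API):

```rust
fn low_bits<T>() -> usize {
    (1 << std::mem::align_of::<T>().trailing_zeros()) - 1
}

fn compose<T>(ptr: *const T, tag: usize) -> usize {
    // The pointer's low alignment bits are guaranteed zero,
    // so the tag can be stored there losslessly.
    debug_assert_eq!(ptr as usize & low_bits::<T>(), 0);
    ptr as usize | (tag & low_bits::<T>())
}

fn decompose<T>(data: usize) -> (*const T, usize) {
    ((data & !low_bits::<T>()) as *const T, data & low_bits::<T>())
}

fn main() {
    assert_eq!(low_bits::<u64>(), 0b111); // align 8 -> 3 tag bits
    let x: u64 = 42;
    let packed = compose(&x as *const u64, 0b101);
    let (p, tag) = decompose::<u64>(packed);
    assert_eq!(tag, 0b101);
    assert_eq!(unsafe { *p }, 42);
}
```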
258-306: `RawShared` provides safe wrappers around unsafe pointer operations.

The `from_owned`, `drop`, `deref`, and `as_ref` methods correctly document their safety requirements. The `#[cfg(test)]` on `is_null` keeps it out of production builds while keeping it available for testing.

crates/common/src/refcount.rs (4)
301-312: `dec()` correctly handles LEAKED objects.

The decrement returns `false` for leaked objects (lines 307-309), preventing deallocation of interned objects. The check `old.strong() == 1` (line 311) correctly returns true only when this was the last reference.

382-448: Deferred drop infrastructure is well-designed for deadlock prevention.

The thread-local `IN_DEFERRED_CONTEXT` and `DEFERRED_QUEUE` pattern allows deferring untrack calls during GC collection. Key points:

- `with_deferred_drops` sets the context and flushes on exit (lines 400-412)
- `try_defer_drop` queues if in context, else executes immediately (lines 425-433)
- `flush_deferred_drops` drains the queue outside the borrow (lines 440-447)

The comment at lines 378-380 explains the deadlock scenario this prevents; a sketch of the pattern follows below.
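A sketch of the general pattern, with illustrative names rather than the actual refcount.rs API (panic safety is omitted for brevity):

```rust
use std::cell::{Cell, RefCell};

thread_local! {
    static IN_DEFERRED_CONTEXT: Cell<bool> = Cell::new(false);
    static DEFERRED: RefCell<Vec<Box<dyn FnOnce()>>> = RefCell::new(Vec::new());
}

/// Run `f` with deferral enabled, then flush the queued work afterwards,
/// outside any locks `f` may have held.
fn with_deferred_drops<R>(f: impl FnOnce() -> R) -> R {
    IN_DEFERRED_CONTEXT.with(|c| c.set(true));
    let result = f();
    IN_DEFERRED_CONTEXT.with(|c| c.set(false));
    // Drain first, then run: the jobs execute after the borrow is released.
    let queued: Vec<_> = DEFERRED.with(|q| q.borrow_mut().drain(..).collect());
    for job in queued {
        job();
    }
    result
}

fn try_defer(job: Box<dyn FnOnce()>) {
    if IN_DEFERRED_CONTEXT.with(|c| c.get()) {
        DEFERRED.with(|q| q.borrow_mut().push(job)); // queue during GC
    } else {
        job(); // run immediately outside collection
    }
}

fn main() {
    with_deferred_drops(|| try_defer(Box::new(|| println!("deferred untrack"))));
}
```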
461-468: `defer_destruction` is marked unsafe but doesn't validate the guard.

The safety comment requires the guard to be valid, but there's no runtime check. This is acceptable since the function is `unsafe`, but consider adding a debug assertion.

264-277: Clarify the strong==0 case and document the abort rationale.

Lines 273-276: the comment "The previous fetch_add created a permission to run decrement again" is cryptic and doesn't explain why a second `fetch_add` is needed when `strong == 0`. This appears to handle a race condition with concurrent decrements, but the logic should be documented explicitly.

Lines 268-272: the `abort()` call on a destructed object is a severe failure (it kills the process). Document why this is preferred over `panic!()` or `unreachable!()`. Note that `safe_inc()` (lines 282-299) handles the destructed case more gracefully with a compare-exchange; clarify which path is recommended and why `inc()` takes the fatal approach.

crates/vm/src/gc_state.rs (2)
85-92: `GcObjectPtr`'s manual Send/Sync requires careful justification.

The wrapper makes `NonNull<PyObject>` `Send + Sync`. The safety comment "proper synchronization is used" is vague. The actual safety relies on:

- Objects being accessed through `PyObjectRef`, which manages synchronization
- `GcObjectPtr` only being used as a key (for hashing/equality), not for dereferencing without synchronization

Consider strengthening the safety comment to document these invariants.

735-744: The global singleton pattern using `OnceLock` is appropriate.

The `GC_STATE` singleton with `OnceLock` provides thread-safe lazy initialization. The `gc_state()` accessor is clean and idiomatic; the general shape is sketched below.
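For reference, the general shape of the pattern (with `GcState` stubbed out):

```rust
use std::sync::OnceLock;

#[derive(Default)]
struct GcState { /* generations, thresholds, ... */ }

static GC_STATE: OnceLock<GcState> = OnceLock::new();

fn gc_state() -> &'static GcState {
    // Initialized exactly once, on first access, from any thread.
    GC_STATE.get_or_init(GcState::default)
}

fn main() {
    let a = gc_state() as *const GcState;
    let b = gc_state() as *const GcState;
    assert_eq!(a, b); // same instance every time
}
```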
```rust
/// Reads the global epoch, without issueing a fence.
#[inline]
pub fn global_epoch(&self) -> Epoch {
    self.global.epoch.load(Ordering::Relaxed)
}
```
Typo in doc comment: "issueing" should be "issuing".

🔎 Proposed fix

```diff
-/// Reads the global epoch, without issueing a fence.
+/// Reads the global epoch, without issuing a fence.
```
```rust
pub(crate) struct Deferred {
    call: unsafe fn(*mut u8),
    data: MaybeUninit<Data>,
    _marker: PhantomData<*mut ()>, // !Send + !Sync
}
```
Missing `Drop` implementation may cause memory leaks.

`Deferred` stores a `FnOnce()` closure inline or boxed, but has no `Drop` implementation. If a `Deferred` is dropped without `call()` being invoked, the stored closure (and any captured data) will be leaked. This is particularly concerning for the heap-allocated path where a `Box<F>` is stored.

Consider adding a `Drop` impl that invokes the closure or at least drops the stored data, unless leaking is intentional for the EBR design (in which case, document this behavior).
🔎 Proposed Drop implementation

```diff
+impl Drop for Deferred {
+    fn drop(&mut self) {
+        // Call the deferred function to ensure it's not leaked
+        // This is safe because we own self
+        let call = self.call;
+        unsafe { call(self.data.as_mut_ptr().cast::<u8>()) };
+    }
+}
```

Alternatively, if leaking is intentional, add documentation explaining why.
```rust
/// # Safety
///
/// The given function must not hold reference onto the stack. It is highly recommended that
/// the passed function is **always** marked with `move` in order to prevent accidental
/// borrows.
///
/// Apart from that, keep in mind that another thread may execute `f`, so anything accessed by
/// the closure must be `Send`.
/// # Safety
///
/// The given function must not hold reference onto the stack. It is highly recommended that
/// the passed function is **always** marked with `move` in order to prevent accidental
/// borrows. The closure must be `Send` as another thread may execute it.
```
Duplicate safety documentation.

The safety documentation for `defer_unchecked` is duplicated (lines 32-39 and 40-44). Remove the duplicate section.
🔎 Proposed fix

```diff
     /// Apart from that, keep in mind that another thread may execute `f`, so anything accessed by
     /// the closure must be `Send`.
-    /// # Safety
-    ///
-    /// The given function must not hold reference onto the stack. It is highly recommended that
-    /// the passed function is **always** marked with `move` in order to prevent accidental
-    /// borrows. The closure must be `Send` as another thread may execute it.
     pub unsafe fn defer_unchecked<F, R>(&self, f: F)
```
```rust
pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> {
    // HACK: The size and alignment of `Atomic<TaggedCnt<T>>` will be same with `AtomicUsize`.
    // The equality of the sizes is checked by `const_assert!`.
    let inner = unsafe { &*(&self.inner as *const _ as *const AtomicUsize) };
    let prev = inner.fetch_or(low_bits::<T>() & tag, order);
    RawShared::from(prev as *const _)
}
```
`fetch_or` uses an unsafe transmute to `AtomicUsize` - missing size assertion.

The comment mentions "The equality of the sizes is checked by `const_assert!`", but no `const_assert!` is present in this file. This is a latent bug if `Atomic<Tagged<T>>` ever differs in size from `AtomicUsize`.

🔎 Add the missing size assertion

```diff
+use static_assertions::const_assert_eq;
+const_assert_eq!(core::mem::size_of::<Atomic<Tagged<()>>>(), core::mem::size_of::<AtomicUsize>());
+
 pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> {
     // HACK: The size and alignment of `Atomic<TaggedCnt<T>>` will be same with `AtomicUsize`.
     // The equality of the sizes is checked by `const_assert!`.
```

Alternatively, add a debug_assert at runtime:

```diff
 pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> {
+    debug_assert_eq!(
+        core::mem::size_of::<Atomic<Tagged<T>>>(),
+        core::mem::size_of::<AtomicUsize>()
+    );
     let inner = unsafe { &*(&self.inner as *const _ as *const AtomicUsize) };
```
```rust
/// State wraps reference count + flags in a single 64-bit word
#[derive(Clone, Copy)]
pub struct State {
    inner: u64,
}

impl State {
    #[inline]
    pub fn from_raw(inner: u64) -> Self {
        Self { inner }
    }

    #[inline]
    pub fn as_raw(self) -> u64 {
        self.inner
    }

    #[inline]
    pub fn strong(self) -> u32 {
        ((self.inner & STRONG) / COUNT) as u32
    }

    #[inline]
    pub fn weak(self) -> u32 {
        ((self.inner & WEAK) / WEAK_COUNT) as u32
    }

    #[inline]
    pub fn destructed(self) -> bool {
        (self.inner & DESTRUCTED) != 0
    }

    #[inline]
    pub fn weaked(self) -> bool {
        (self.inner & WEAKED) != 0
    }

    #[inline]
    pub fn leaked(self) -> bool {
        (self.inner & LEAKED) != 0
    }

    #[inline]
    pub fn epoch(self) -> u32 {
        ((self.inner & EPOCH) >> EPOCH_MASK_HEIGHT) as u32
    }

    #[inline]
    pub fn with_epoch(self, epoch: usize) -> Self {
        Self::from_raw((self.inner & !EPOCH) | (((epoch as u64) << EPOCH_MASK_HEIGHT) & EPOCH))
    }

    #[inline]
    pub fn add_strong(self, val: u32) -> Self {
        Self::from_raw(self.inner + (val as u64) * COUNT)
    }

    #[inline]
    pub fn sub_strong(self, val: u32) -> Self {
        debug_assert!(self.strong() >= val);
        Self::from_raw(self.inner - (val as u64) * COUNT)
    }

    #[inline]
    pub fn add_weak(self, val: u32) -> Self {
        Self::from_raw(self.inner + (val as u64) * WEAK_COUNT)
    }

    #[inline]
    pub fn with_destructed(self, dest: bool) -> Self {
        Self::from_raw((self.inner & !DESTRUCTED) | if dest { DESTRUCTED } else { 0 })
    }

    #[inline]
    pub fn with_weaked(self, weaked: bool) -> Self {
        Self::from_raw((self.inner & !WEAKED) | if weaked { WEAKED } else { 0 })
    }

    #[inline]
    pub fn with_leaked(self, leaked: bool) -> Self {
        Self::from_raw((self.inner & !LEAKED) | if leaked { LEAKED } else { 0 })
    }
}
```
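The mask constants this hunk relies on (`STRONG`, `WEAK`, `EPOCH`, and friends) are defined elsewhere in refcount.rs and aren't shown here. Purely as an illustration, one partition of the 64-bit word consistent with the accessors above might look like this; the real values may differ:

```rust
// Hypothetical partition of the 64-bit state word; illustrative only.
const COUNT: u64 = 1;                          // unit of the strong count (low bits)
const STRONG: u64 = (1 << 28) - 1;             // bits 0..28: strong refcount
const WEAK_COUNT: u64 = 1 << 28;               // unit of the weak count
const WEAK: u64 = ((1 << 28) - 1) << 28;       // bits 28..56: weak refcount
const DESTRUCTED: u64 = 1 << 56;               // flag: destructor already ran
const WEAKED: u64 = 1 << 57;                   // flag: a weak reference exists
const LEAKED: u64 = 1 << 58;                   // flag: intentionally leaked
const EPOCH_MASK_HEIGHT: u64 = 59;             // epoch starts at bit 59
const EPOCH: u64 = !0u64 << EPOCH_MASK_HEIGHT; // bits 59..64: EBR epoch

fn main() {
    // Sanity check: the fields tile the word without overlap.
    assert_eq!(STRONG & WEAK, 0);
    assert_eq!((STRONG | WEAK) & (DESTRUCTED | WEAKED | LEAKED | EPOCH), 0);
}
```

With disjoint bit ranges like these, `add_strong` is a plain integer add of `COUNT`, and flag updates never disturb either count.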
🛠️ Refactor suggestion | 🟠 Major
State and PyState are identical - consider removing duplication.
PyState (lines 151-233) has the exact same fields and methods as State (lines 33-115). This violates DRY and creates maintenance burden. If they need to diverge in the future, document why. Otherwise, use a single type or a type alias.
🔎 Proposed fix - use type alias or single type

```diff
-/// PyState extends State with LEAKED support for RustPython
-#[derive(Clone, Copy)]
-pub struct PyState {
-    inner: u64,
-}
-
-impl PyState {
-    // ... 80+ lines of duplicated methods ...
-}
+/// PyState is an alias for State, used in RustPython integration
+pub type PyState = State;
```

If distinct types are truly needed for type safety, consider a generic or trait-based approach (see the sketch at the end of this comment).
Also applies to: 151-233
🤖 Prompt for AI Agents

```text
In crates/common/src/refcount.rs lines 33-115 (and PyState at 151-233): the State
and PyState implementations are identical, so remove duplication by consolidating
into a single definition; either replace PyState with a type alias (pub type
PyState = State) and remove the duplicate struct/impl, or keep one struct (State)
and extract shared methods into a trait/generic used by both names, updating all
call sites to use the unified type; ensure visibility (pub) and any module exports
are preserved and add a short comment if you intentionally keep a separate alias
for future divergence.
```
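If the two names are kept deliberately (say, to prevent mixing the plain and RustPython-flavored words at call sites), a trait with default methods over the raw `u64` removes the copy-paste while preserving distinct types. A rough sketch with hypothetical names and masks, not the crate's real constants:

```rust
// Sketch only: shared accessors live once, in trait defaults.
trait RcWord: Copy {
    fn raw(self) -> u64;
    fn from_raw(raw: u64) -> Self;

    fn strong(self) -> u32 {
        const STRONG: u64 = (1 << 28) - 1; // hypothetical strong-count mask
        (self.raw() & STRONG) as u32
    }

    fn add_strong(self, val: u32) -> Self {
        Self::from_raw(self.raw() + val as u64)
    }
}

#[derive(Clone, Copy)]
struct State(u64);
#[derive(Clone, Copy)]
struct PyState(u64);

impl RcWord for State {
    fn raw(self) -> u64 { self.0 }
    fn from_raw(raw: u64) -> Self { Self(raw) }
}
impl RcWord for PyState {
    fn raw(self) -> u64 { self.0 }
    fn from_raw(raw: u64) -> Self { Self(raw) }
}

fn main() {
    let s = PyState(0).add_strong(2);
    assert_eq!(s.strong(), 2);
}
```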
```diff
 const ALLOWED_TRAVERSE_OPTS: &[&str] = &["manual"];
 // try to know if it have a `#[pyclass(trace)]` exist on this struct
 // TODO(discord9): rethink on auto detect `#[Derive(PyTrace)]`

-// 1. no `traverse` at all: generate a dummy try_traverse
-// 2. `traverse = "manual"`: generate a try_traverse, but not #[derive(Traverse)]
-// 3. `traverse`: generate a try_traverse, and #[derive(Traverse)]
-let (maybe_trace_code, derive_trace) = {
-    if class_meta.inner()._has_key("traverse")? {
-        let maybe_trace_code = quote! {
-            impl ::rustpython_vm::object::MaybeTraverse for #ident {
-                const IS_TRACE: bool = true;
-                fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn) {
-                    ::rustpython_vm::object::Traverse::traverse(self, tracer_fn);
-                }
+// Generate MaybeTraverse impl with both traverse and pop_edges support
+//
+// For traverse:
+// 1. no `traverse` at all: IS_TRACE = false, try_traverse does nothing
+// 2. `traverse = "manual"`: IS_TRACE = true, but no #[derive(Traverse)]
+// 3. `traverse`: IS_TRACE = true, and #[derive(Traverse)]
+//
+// For pop_edges:
+// 1. no `pop_edges`: HAS_POP_EDGES = false, try_pop_edges does nothing
+// 2. `pop_edges`: HAS_POP_EDGES = true, try_pop_edges calls Traverse::pop_edges
+let has_traverse = class_meta.inner()._has_key("traverse")?;
+let has_pop_edges = class_meta.inner()._has_key("pop_edges")?;
+
+let derive_trace = if has_traverse {
+    let value = class_meta.inner()._optional_str("traverse");
+    if let Ok(Some(s)) = value {
+        if !ALLOWED_TRAVERSE_OPTS.contains(&s.as_str()) {
+            bail_span!(
+                item,
+                "traverse attribute only accept {ALLOWED_TRAVERSE_OPTS:?} as value or no value at all",
+            );
+        }
+        assert_eq!(s, "manual");
+        quote! {}
+    } else {
+        quote! {#[derive(Traverse)]}
+    }
+} else {
+    quote! {}
+};
```
Don’t ignore errors from _optional_str("traverse") when deriving Traverse
The new derive_trace logic is almost there, but this pattern:
```rust
let value = class_meta.inner()._optional_str("traverse");
if let Ok(Some(s)) = value {
    // ...
} else {
    quote! {#[derive(Traverse)]}
}
```

silently treats any `Err(_)` from `_optional_str` as "no value, derive Traverse", rather than surfacing a compile-time error. That means malformed uses like `#[pyclass(traverse(123))]` could compile and unexpectedly auto-derive `Traverse` instead of failing in the macro.
You can preserve the intended behavior while propagating parse errors by binding the Result with ? first:
Proposed fix for error propagation in traverse handling

```diff
-        let derive_trace = if has_traverse {
-            let value = class_meta.inner()._optional_str("traverse");
-            if let Ok(Some(s)) = value {
+        let derive_trace = if has_traverse {
+            let value = class_meta.inner()._optional_str("traverse")?;
+            if let Some(s) = value {
                 if !ALLOWED_TRAVERSE_OPTS.contains(&s.as_str()) {
                     bail_span!(
                         item,
                         "traverse attribute only accept {ALLOWED_TRAVERSE_OPTS:?} as value or no value at all",
                     );
                 }
                 assert_eq!(s, "manual");
                 quote! {}
             } else {
                 quote! {#[derive(Traverse)]}
             }
         } else {
             quote! {}
         };
```
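The same `?`-first pattern can be shown standalone. In this sketch, `SpanError` and `optional_str` are hypothetical stand-ins for the macro's real error machinery, and the returned strings stand in for token streams:

```rust
#[derive(Debug)]
struct SpanError(String);

fn optional_str(attr: Option<&str>) -> Result<Option<String>, SpanError> {
    // A malformed attribute value surfaces as Err instead of None.
    match attr {
        Some("123") => Err(SpanError("traverse value must be a string".into())),
        Some(s) => Ok(Some(s.to_string())),
        None => Ok(None),
    }
}

fn derive_trace_tokens(attr: Option<&str>) -> Result<&'static str, SpanError> {
    // Binding with `?` first means parse errors propagate to the caller
    // instead of silently selecting the auto-derive branch.
    let value = optional_str(attr)?;
    Ok(match value.as_deref() {
        Some("manual") => "",                  // manual impl: no derive emitted
        Some(other) => return Err(SpanError(format!("bad traverse value: {other}"))),
        None => "#[derive(Traverse)]",         // bare `traverse`: auto-derive
    })
}

fn main() {
    assert_eq!(derive_trace_tokens(Some("manual")).unwrap(), "");
    assert_eq!(derive_trace_tokens(None).unwrap(), "#[derive(Traverse)]");
    assert!(derive_trace_tokens(Some("123")).is_err());
}
```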
crates/derive-impl/src/pyclass.rs lines 576-607: the code currently ignores Err
from _optional_str("traverse") which can hide parse errors; change to propagate
the Result by using let value = class_meta.inner()._optional_str("traverse")?;
then match on value (Some(s) => validate s is "manual" and return quote!{} for
manual, None => return quote!{#[derive(Traverse)]}) so any Err is returned to
the caller instead of silently treating it as None.
```diff
 /// Return the list of objects that directly refer to any of the arguments.
 #[pyfunction]
-fn get_referrers(_args: FuncArgs, vm: &VirtualMachine) -> PyResult {
-    Err(vm.new_not_implemented_error(""))
+fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef {
+    // This is expensive: we need to scan all tracked objects
+    // For now, return an empty list (would need full object tracking to implement)
+    let _ = args;
+    vm.ctx.new_list(vec![])
 }
```
Incomplete implementation: get_referrers always returns an empty list.
This function is documented to "Return the list of objects that directly refer to any of the arguments" but always returns an empty list. This could cause subtle bugs in code relying on this CPython feature.
Consider either:

- Implementing it properly by scanning all tracked objects
- Raising `NotImplementedError` to make the limitation explicit
- Adding a clear warning in the docstring
```diff
+    /// Return the list of objects that directly refer to any of the arguments.
+    ///
+    /// NOTE: This is currently not implemented and always returns an empty list.
+    /// Implementing this requires scanning all tracked objects which is expensive.
     #[pyfunction]
     fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef {
-        // This is expensive: we need to scan all tracked objects
-        // For now, return an empty list (would need full object tracking to implement)
         let _ = args;
+        // TODO: Implement by iterating gc_state().get_objects(None) and checking referents
         vm.ctx.new_list(vec![])
     }
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```rust
/// Return the list of objects that directly refer to any of the arguments.
///
/// NOTE: This is currently not implemented and always returns an empty list.
/// Implementing this requires scanning all tracked objects which is expensive.
#[pyfunction]
fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef {
    let _ = args;
    // TODO: Implement by iterating gc_state().get_objects(None) and checking referents
    vm.ctx.new_list(vec![])
}
```
🤖 Prompt for AI Agents

```text
In crates/stdlib/src/gc.rs around lines 191 to 198, get_referrers currently
returns an empty list which misrepresents its documented behavior; change it to
explicitly signal the limitation by returning/raising a NotImplementedError (or
the project-equivalent VM exception) instead of an empty list and update the
docstring to mention that get_referrers is not implemented on this platform;
ensure the function uses the VM to construct and return the proper exception so
callers receive an explicit error rather than an empty result.
```
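For reference, the full scan alluded to above would walk every tracked object and test whether its traverse callback visits one of the targets. A simplified sketch of that algorithm, with object identity reduced to an address-like `usize` and a caller-supplied traverse function (all names hypothetical):

```rust
// Sketch: find objects whose outgoing edges hit any target.
fn get_referrers_scan(
    targets: &[usize],
    tracked: impl Iterator<Item = usize>,
    traverse: impl Fn(usize, &mut dyn FnMut(usize)),
) -> Vec<usize> {
    let mut referrers = Vec::new();
    for candidate in tracked {
        let mut refers = false;
        // Visit each object the candidate references directly.
        traverse(candidate, &mut |child| {
            if targets.contains(&child) {
                refers = true;
            }
        });
        if refers {
            referrers.push(candidate);
        }
    }
    referrers
}

fn main() {
    // Toy graph: object 1 points at 2; object 3 points at nothing.
    let edges = |obj: usize, visit: &mut dyn FnMut(usize)| {
        if obj == 1 {
            visit(2);
        }
    };
    assert_eq!(get_referrers_scan(&[2], [1, 3].into_iter(), edges), vec![1]);
}
```

The cost is O(tracked objects × edges) per call, which is why CPython also only pays it on demand.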
```rust
pub unsafe fn untrack_object(&self, obj: NonNull<PyObject>) {
    let gc_ptr = GcObjectPtr(obj);

    // Remove from all generation tracking lists
    for generation in &self.generation_objects {
        if let Ok(mut gen_set) = generation.write() {
            if gen_set.remove(&gc_ptr) {
                break; // Object can only be in one generation
            }
        }
    }

    // Update counts (simplified - just decrement gen0 for now)
    let count = self.generations[0].count.load(Ordering::SeqCst);
    if count > 0 {
        self.generations[0].count.fetch_sub(1, Ordering::SeqCst);
    }

    // Remove from global tracking
    if let Ok(mut tracked) = self.tracked_objects.write() {
        tracked.remove(&gc_ptr);
    }

    // Remove from finalized set
    if let Ok(mut finalized) = self.finalized_objects.write() {
        finalized.remove(&gc_ptr);
    }
}
```
untrack_object always decrements gen0 count - incorrect for objects in other generations.
Lines 259-263 always decrement generations[0].count, even though the object might be in gen1 or gen2. This will cause count underflow/mismatch over time.
🔎 Proposed fix - track which generation the object was in

```diff
 pub unsafe fn untrack_object(&self, obj: NonNull<PyObject>) {
     let gc_ptr = GcObjectPtr(obj);
+    let mut found_gen: Option<usize> = None;
     // Remove from all generation tracking lists
-    for generation in &self.generation_objects {
+    for (gen_idx, generation) in self.generation_objects.iter().enumerate() {
         if let Ok(mut gen_set) = generation.write() {
             if gen_set.remove(&gc_ptr) {
+                found_gen = Some(gen_idx);
                 break; // Object can only be in one generation
             }
         }
     }
-    // Update counts (simplified - just decrement gen0 for now)
-    let count = self.generations[0].count.load(Ordering::SeqCst);
-    if count > 0 {
-        self.generations[0].count.fetch_sub(1, Ordering::SeqCst);
+    // Decrement the correct generation's count
+    if let Some(gen_idx) = found_gen {
+        self.generations[gen_idx].count.fetch_sub(1, Ordering::SeqCst);
     }
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```rust
pub unsafe fn untrack_object(&self, obj: NonNull<PyObject>) {
    let gc_ptr = GcObjectPtr(obj);
    let mut found_gen: Option<usize> = None;
    // Remove from all generation tracking lists
    for (gen_idx, generation) in self.generation_objects.iter().enumerate() {
        if let Ok(mut gen_set) = generation.write() {
            if gen_set.remove(&gc_ptr) {
                found_gen = Some(gen_idx);
                break; // Object can only be in one generation
            }
        }
    }
    // Decrement the correct generation's count
    if let Some(gen_idx) = found_gen {
        self.generations[gen_idx].count.fetch_sub(1, Ordering::SeqCst);
    }
    // Remove from global tracking
    if let Ok(mut tracked) = self.tracked_objects.write() {
        tracked.remove(&gc_ptr);
    }
    // Remove from finalized set
    if let Ok(mut finalized) = self.finalized_objects.write() {
        finalized.remove(&gc_ptr);
    }
}
```
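A toy illustration of the drift described above: if an object tracked in gen1 is untracked by debiting gen0, the per-generation counts diverge from reality. (The guarded decrement in the current code keeps gen0 pinned at zero, so the visible symptom is a stale gen1 count rather than underflow.)

```rust
fn main() {
    let mut counts = [0u64, 0, 0]; // gen0, gen1, gen2 tracked-object counts
    counts[1] += 1;                          // object is promoted into gen1
    counts[0] = counts[0].saturating_sub(1); // buggy untrack debits gen0 (guard keeps it at 0)
    // gen1 still claims a live object even though it was untracked.
    assert_eq!(counts, [0, 1, 0]);
}
```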
```rust
pub unsafe fn gc_pop_edges(&self) -> Vec<PyObjectRef> {
    // SAFETY: During GC collection, this object is unreachable (gc_refs == 0),
    // meaning no other code has a reference to it. The only references are
    // internal cycle references which we're about to break.
    unsafe { Self::gc_pop_edges_raw(self as *const _ as *mut PyObject) }
}
```
Potential undefined behavior: casting &self to *mut violates aliasing rules.
gc_pop_edges takes &self but immediately casts to *mut PyObject. Even with the safety comment about exclusive access during GC, creating a mutable pointer from a shared reference can violate Rust's aliasing rules and trigger UB under strict interpretations.
Consider either:

- Using `UnsafeCell` in the payload types that need mutation during GC
- Taking `*mut PyObject` directly in the public API
- Using `&mut self` if exclusive access is truly guaranteed
🔎 Suggested fix

```diff
-    pub unsafe fn gc_pop_edges(&self) -> Vec<PyObjectRef> {
-        // SAFETY: During GC collection, this object is unreachable (gc_refs == 0),
-        // meaning no other code has a reference to it. The only references are
-        // internal cycle references which we're about to break.
-        unsafe { Self::gc_pop_edges_raw(self as *const _ as *mut PyObject) }
-    }
+    /// Pop edges from this object for cycle breaking.
+    /// Prefer using gc_pop_edges_raw with a pointer obtained safely.
+    pub unsafe fn gc_pop_edges_raw(ptr: *mut PyObject) -> Vec<PyObjectRef> {
+        let mut result = Vec::new();
+        let obj = unsafe { &*ptr };
+        if let Some(pop_edges_fn) = obj.0.vtable.pop_edges {
+            unsafe { pop_edges_fn(ptr, &mut result) };
+        }
+        result
+    }
```

Remove the `&self` variant and use only the raw pointer version.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents

```text
In crates/vm/src/object/core.rs around lines 1064 to 1069, casting the shared
reference &self to *mut PyObject can violate Rust aliasing rules and cause
undefined behavior; change the API to avoid creating a mutable pointer from
&self by either (a) changing this method to take *mut PyObject (delegate to
gc_pop_edges_raw directly) or (b) accept &mut self when exclusive access is
guaranteed, or (c) make the interior mutability explicit with UnsafeCell for the
fields mutated during GC; update callers to pass the raw pointer or mutable
reference accordingly and remove the &self wrapper overload if you adopt the
raw-pointer approach.
```
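Of the three options, the `UnsafeCell` route keeps the public signature intact. A minimal sketch under the stated assumption that the collector has exclusive logical access, with `Edges` as a hypothetical stand-in for a GC-mutable payload field (`u64` stands in for `PyObjectRef`):

```rust
use std::cell::UnsafeCell;

// Hypothetical payload field holding this object's outgoing edges.
struct Edges(UnsafeCell<Vec<u64>>);

/// # Safety
/// Only the collector may call this, and only on objects whose gc_refs count
/// proves that no other reference exists.
unsafe fn pop_edges(edges: &Edges) -> Vec<u64> {
    // UnsafeCell sanctions mutation behind a shared reference, so no
    // &T -> *mut T cast is needed.
    unsafe { std::mem::take(&mut *edges.0.get()) }
}

fn main() {
    let e = Edges(UnsafeCell::new(vec![1, 2, 3]));
    let popped = unsafe { pop_edges(&e) };
    assert_eq!(popped, vec![1, 2, 3]);
    assert!(unsafe { pop_edges(&e) }.is_empty()); // edges were drained
}
```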
```diff
 let (globals, filename, lineno) = if let Some(f) = f {
     (f.globals.clone(), f.code.source_path, f.f_lineno())
+} else if let Some(frame) = vm.current_frame() {
+    // We have a frame but it wasn't found during stack walking
+    (frame.globals.clone(), vm.ctx.intern_str("sys"), 1)
 } else {
-    (vm.current_globals().clone(), vm.ctx.intern_str("sys"), 1)
+    // No frames on the stack - we're in interpreter shutdown or similar state
+    // Use __main__ globals if available, otherwise skip the warning
+    match vm
+        .sys_module
+        .get_attr(identifier!(vm, modules), vm)
+        .and_then(|modules| modules.get_item(vm.ctx.intern_str("__main__"), vm))
+        .and_then(|main_mod| main_mod.get_attr(identifier!(vm, __dict__), vm))
+    {
+        Ok(globals) => {
+            if let Ok(dict) = globals.downcast::<crate::builtins::PyDict>() {
+                (dict, vm.ctx.intern_str("sys"), 1)
+            } else {
+                // Cannot get globals, skip warning
+                return Ok((
+                    vm.ctx.intern_str("sys").to_owned(),
+                    1,
+                    None,
+                    vm.ctx.new_dict().into(),
+                ));
+            }
+        }
+        Err(_) => {
+            // Cannot get __main__ globals, skip warning
+            return Ok((
+                vm.ctx.intern_str("sys").to_owned(),
+                1,
+                None,
+                vm.ctx.new_dict().into(),
+            ));
+        }
+    }
```
Shutdown fallback currently still emits warnings despite “skip warning” comment
In the final else branch of setup_context, both “Cannot get globals, skip warning” comments suggest the intent is to suppress the warning when there are no frames / __main__ globals. However, the function returns a normal (filename, lineno, module = None, registry = new_dict()) tuple; warn then still calls warn_explicit, which normalizes module from filename and proceeds to emit the warning.
If the goal is to truly skip warnings in these shutdown-like states, you’ll need a distinct signal (e.g., have setup_context return an Option and early‑return in warn, or return a sentinel registry/module that warn checks before calling warn_explicit). Otherwise, it’d be good to adjust or remove the “skip warning” comments to match the actual behavior and avoid confusion.
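A minimal sketch of the Option-based signal mentioned above, with simplified stand-in types (the real setup_context returns richer VM types and a Result):

```rust
struct Ctx {
    filename: String,
    lineno: u32,
}

// Returning None signals "no usable context, skip the warning entirely".
fn setup_context(have_frames: bool) -> Option<Ctx> {
    if !have_frames {
        // Shutdown-like state: don't fabricate a context.
        return None;
    }
    Some(Ctx { filename: "sys".into(), lineno: 1 })
}

fn warn(message: &str, have_frames: bool) {
    let Some(ctx) = setup_context(have_frames) else {
        return; // early-return: warn_explicit is never reached during shutdown
    };
    println!("{}:{}: {}", ctx.filename, ctx.lineno, message);
}

fn main() {
    warn("unclosed file", true);  // emits the warning
    warn("unclosed file", false); // silently skipped
}
```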
🤖 Prompt for AI Agents

```text
In crates/vm/src/warn.rs around lines 405-441, the shutdown fallback branches
claim "skip warning" but still return a normal tuple so warn continues and emits
warnings; change the API so setup_context can signal "skip" (e.g., return
Option<(filename, lineno, module, registry)> or a Result with a distinct
variant) and update warn to early-return when setup_context returns None (or the
skip variant) so no warn_explicit is called during
interpreter-shutdown/no-globals cases; alternatively, if you prefer not to
change the signature, return a clear sentinel (like module
Some(vm.ctx.intern_str("__shutdown__")) or an empty registry flagged with a
special key) and add a check in warn to detect that sentinel and return
early - implement one of these approaches and remove or adjust the misleading
"skip warning" comments to match the new behavior.
```