Skip to content

Conversation

@youknowone
Copy link
Owner

@youknowone youknowone commented Jan 16, 2026

Summary by CodeRabbit

New Features

  • Enhanced object serialization with improved state management and detection of custom initialization arguments
  • Added pickle support for I/O objects, enabling serialization and restoration of file-like objects with their content, stream position, and configuration
  • Improved serialization protocol support for complex object types and reconstruction scenarios

✏️ Tip: You can customize this high-level summary in your review settings.

@youknowone youknowone marked this pull request as draft January 16, 2026 12:35
@coderabbitai
Copy link

coderabbitai bot commented Jan 16, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

This change implements enhanced Python pickle protocol support (protocol >= 2) by refactoring object state serialization logic in the core VM, adding getstate/setstate methods to five I/O classes for custom pickle handling, and introducing getnewargs_ex constant support for advanced object reconstruction.

Changes

Cohort / File(s) Summary
Core pickle protocol refactoring
crates/vm/src/builtins/object.rs
Reworks object state serialization with new helpers: is_getstate_overridden detects getstate overrides; get_new_arguments derives new args from getnewargs_ex or getnewargs; reduce_newobj builds protocol >= 2 reduce tuples via copyreg.newobj; get_items_iter produces iterators for list/dict items. Replaces placeholder itemsize check; computes expected basicsize; routes between default and user-provided getstate implementations. Updates common_reduce to call reduce_newobj for protocol >= 2.
I/O class pickle support
crates/vm/src/stdlib/io.rs
Adds getstate/setstate method pairs to five I/O classes (internal _io class, TextIOWrapper, StringIO, BytesIO) for custom serialization/deserialization of content, position, newline marker (where applicable), and instance dict. Includes guards for closed/detached streams. Exports PyDict in module's public imports.
VM context constant addition
crates/vm/src/vm/context.rs
Adds __getnewargs_ex__ field to ConstName struct and includes it in macro initialization for interned special method name support.

Sequence Diagram

sequenceDiagram
    participant Pickle as Pickle Protocol
    participant Reduce as common_reduce
    participant NewObj as reduce_newobj
    participant GetState as object_getstate
    participant Helpers as Helper Functions
    
    Pickle->>Reduce: call reduce (protocol >= 2)
    Reduce->>NewObj: call reduce_newobj
    NewObj->>Helpers: get_new_arguments
    Helpers-->>NewObj: (args, kwargs)
    NewObj->>GetState: object_getstate
    GetState->>Helpers: is_getstate_overridden?
    alt User override exists
        Helpers-->>GetState: call user __getstate__
    else Use default
        Helpers-->>GetState: call default implementation
    end
    GetState-->>NewObj: state
    NewObj->>Helpers: get_items_iter
    Helpers-->>NewObj: listitems, dictitems
    NewObj-->>Reduce: (newobj, newargs, state, listitems, dictitems)
    Reduce-->>Pickle: reduce tuple
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Hops with glee through pickle's gate,
State and newargs now serialize straight,
I/O streams remember their way,
Protocol two wins the day! 🥒✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Implement pickle more' is vague and doesn't clearly specify what aspects of pickle functionality are being implemented or improved. Use a more descriptive title that specifies the pickle enhancement, such as 'Add pickle support for object serialization and I/O classes' or 'Implement getstate/setstate for core types'.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@crates/vm/src/builtins/object.rs`:
- Around line 637-647: The code calls the retrieved __getstate__ attribute
unconditionally, which fails if a class sets __getstate__ = None; modify
object_getstate to, after obtaining getstate via obj.get_attr(identifier!(vm,
__getstate__), vm) and before calling getstate.call(...), check whether the
attribute is None (using the PyObject/Value API available in this
module/context) and if so return the None state (e.g.
Ok(vm.ctx.none()))—otherwise proceed to call getstate; keep the existing
fallback to object_getstate_default and existing error propagation from get_attr
intact.
- Around line 210-229: The basicsize calculation for pickle currently omits
space for automatic weakref storage, causing types with a "__weakref__" slot to
falsely fail the size check; update the block in object.rs (around basicsize,
obj.class(), slot_names) to detect if the type provides weakref storage and add
core::mem::size_of::<PyObjectRef>() to basicsize when present—do this by
checking the type's slots for a "__weakref__" entry (e.g., inspect slot_names
for "__weakref__" or use the heap-type extension/heap type metadata on
obj.class() to detect automatic weakref support) rather than relying on a
non-existent PyTypeFlags::HAS_WEAKREF, then re-run the existing comparison
against obj.class().slots.basicsize.

In `@crates/vm/src/stdlib/io.rs`:
- Around line 4311-4334: __setstate__ currently overwrites zelf.buffer.write()
without honoring active exports, bypassing the resize guard used by
write/truncate; change the code to acquire the same resizable/export guard used
by the write/truncate paths before replacing the buffer: use zelf.buffer(vm)?
(the same accessor used below) to obtain the buffer handle that enforces
exports/resize safety (the guard used by write/truncate), then replace the inner
BufferedIO (BufferedIO::new(Cursor::new(...))) via that guarded handle instead
of directly calling zelf.buffer.write(), so active exports are checked and the
buffer protocol invariants are preserved.
- Around line 4106-4127: The __setstate__ implementation mutates zelf.buffer via
write() before calling zelf.buffer(vm)? which performs the closed-object
validation; change the flow to validate first by calling zelf.buffer(vm)? to
obtain the validated guard and then perform mutations through that guard
(replace the BufferedIO via the guard and seek using the validated guard),
removing the earlier direct write() call and ensuring no mutation occurs before
the closed check.
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 746e71a and 1a1d484.

📒 Files selected for processing (3)
  • crates/vm/src/builtins/object.rs
  • crates/vm/src/stdlib/io.rs
  • crates/vm/src/vm/context.rs
🧰 Additional context used
🧬 Code graph analysis (2)
crates/vm/src/builtins/object.rs (2)
crates/vm/src/builtins/tuple.rs (4)
  • class (41-43)
  • class (527-529)
  • len (194-196)
  • __getnewargs__ (352-362)
crates/vm/src/protocol/mapping.rs (1)
  • items (182-188)
crates/vm/src/stdlib/io.rs (1)
crates/vm/src/stdlib/itertools.rs (7)
  • __setstate__ (101-130)
  • __setstate__ (558-567)
  • __setstate__ (645-654)
  • __setstate__ (961-976)
  • __setstate__ (1111-1118)
  • __setstate__ (1364-1386)
  • __setstate__ (1866-1873)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Collect regression test data
  • GitHub Check: Collect benchmark data
  • GitHub Check: Collect code coverage data
  • GitHub Check: Collect what is left data
🔇 Additional comments (2)
crates/vm/src/vm/context.rs (1)

137-139: LGTM: interned magic name added cleanly.

This keeps the const-name set aligned with pickle protocol helpers.

crates/vm/src/stdlib/io.rs (1)

160-163: LGTM: PyDict import is appropriate.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines 210 to 243
if required {
let mut basicsize = obj.class().slots.basicsize;
// if obj.class().slots.dict_offset > 0
// && !obj.class().slots.flags.has_feature(PyTypeFlags::MANAGED_DICT)
// {
// basicsize += std::mem::size_of::<PyObjectRef>();
// }
// if obj.class().slots.weaklist_offset > 0 {
// basicsize += std::mem::size_of::<PyObjectRef>();
// }
// Start with PyBaseObject_Type's basicsize
let mut basicsize = vm.ctx.types.object_type.slots.basicsize;

// Add __dict__ size if type has dict
if obj.class().slots.flags.has_feature(PyTypeFlags::HAS_DICT) {
basicsize += core::mem::size_of::<PyObjectRef>();
}

// Add slots size
if let Some(ref slot_names) = slot_names {
basicsize += core::mem::size_of::<PyObjectRef>() * slot_names.__len__();
}

// Fail if actual type's basicsize > expected basicsize
if obj.class().slots.basicsize > basicsize {
return Err(
vm.new_type_error(format!("cannot pickle {:.200} object", obj.class().name()))
vm.new_type_error(format!("cannot pickle '{}' object", obj.class().name()))
);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

❓ Verification inconclusive

Script executed:

#!/bin/bash
# Search for PyTypeFlags definitions and weakref-related code
rg -n "HAS_WEAKREF|HAS_DICT" crates/vm/src -g '*.rs' | head -20

Repository: youknowone/RustPython


Repository: youknowone/RustPython
Exit code: 0

stdout:

crates/vm/src/vm/context.rs:515:            flags: PyTypeFlags::heap_type_flags() | PyTypeFlags::HAS_DICT,
crates/vm/src/vm/context.rs:628:            class.slots.flags.contains(PyTypeFlags::HAS_DICT),
crates/vm/src/types/slot.rs:228:        const HAS_DICT = 1 << 40;
crates/vm/src/object/payload.rs:66:        let dict = if cls.slots.flags.has_feature(PyTypeFlags::HAS_DICT) {
crates/vm/src/object/core.rs:636:            .has_feature(crate::types::PyTypeFlags::HAS_DICT)
crates/vm/src/exceptions.rs:577:    flags(BASETYPE, HAS_DICT)
crates/vm/src/exception_group.rs:18:        flags: PyTypeFlags::heap_type_flags() | PyTypeFlags::HAS_DICT,
crates/vm/src/class.rs:145:        if Self::TP_FLAGS.has_feature(PyTypeFlags::HAS_DICT) {
crates/vm/src/builtins/function.rs:470:    flags(HAS_DICT, METHOD_DESCRIPTOR)
crates/vm/src/builtins/module.rs:168:#[pyclass(with(GetAttr, Initializer, Representable), flags(BASETYPE, HAS_DICT))]
crates/vm/src/builtins/module.rs:195:                .has_feature(crate::types::PyTypeFlags::HAS_DICT)
crates/vm/src/builtins/object.rs:69:        // Only create dict if the class has HAS_DICT flag (i.e., __slots__ was not defined
crates/vm/src/builtins/object.rs:74:            .has_feature(crate::types::PyTypeFlags::HAS_DICT)
crates/vm/src/builtins/object.rs:215:        if obj.class().slots.flags.has_feature(PyTypeFlags::HAS_DICT) {
crates/vm/src/builtins/object.rs:447:                        |typ: &Py<PyType>| typ.slots.flags.has_feature(PyTypeFlags::HAS_DICT);
crates/vm/src/builtins/type.rs:302:        // Inherit HAS_DICT from any base in MRO that has it
crates/vm/src/builtins/type.rs:306:            .any(|b| b.slots.flags.has_feature(PyTypeFlags::HAS_DICT))
crates/vm/src/builtins/type.rs:308:            slots.flags |= PyTypeFlags::HAS_DICT
crates/vm/src/builtins/type.rs:368:        if base.slots.flags.has_feature(PyTypeFlags::HAS_DICT) {
crates/vm/src/builtins/type.rs:369:            slots.flags |= PyTypeFlags::HAS_DICT

Script executed:

#!/bin/bash
# Search for PyTypeFlags enum definition
rg -n "pub struct PyTypeFlags|pub enum.*Flags|HAS_WEAKREF|HAS_DICT" crates/vm/src -g '*.rs' -B 2 -A 10 | head -50

Repository: youknowone/RustPython


Repository: youknowone/RustPython
Exit code: 0

stdout:

crates/vm/src/vm/context.rs-513-            name: interned_name.as_str(),
crates/vm/src/vm/context.rs-514-            basicsize: 0,
crates/vm/src/vm/context.rs:515:            flags: PyTypeFlags::heap_type_flags() | PyTypeFlags::HAS_DICT,
crates/vm/src/vm/context.rs-516-            ..PyTypeSlots::default()
crates/vm/src/vm/context.rs-517-        };
crates/vm/src/vm/context.rs-518-        PyType::new_heap(
crates/vm/src/vm/context.rs-519-            name,
crates/vm/src/vm/context.rs-520-            bases,
crates/vm/src/vm/context.rs-521-            attrs,
crates/vm/src/vm/context.rs-522-            slots,
crates/vm/src/vm/context.rs-523-            self.types.type_type.to_owned(),
crates/vm/src/vm/context.rs-524-            self,
crates/vm/src/vm/context.rs-525-        )
--
crates/vm/src/vm/context.rs-626-    pub fn new_base_object(&self, class: PyTypeRef, dict: Option<PyDictRef>) -> PyObjectRef {
crates/vm/src/vm/context.rs-627-        debug_assert_eq!(
crates/vm/src/vm/context.rs:628:            class.slots.flags.contains(PyTypeFlags::HAS_DICT),
crates/vm/src/vm/context.rs-629-            dict.is_some()
crates/vm/src/vm/context.rs-630-        );
crates/vm/src/vm/context.rs-631-        PyRef::new_ref(object::PyBaseObject, class, dict).into()
crates/vm/src/vm/context.rs-632-    }
crates/vm/src/vm/context.rs-633-
crates/vm/src/vm/context.rs-634-    pub fn new_code(&self, code: impl code::IntoCodeObject) -> PyRef<PyCode> {
crates/vm/src/vm/context.rs-635-        let code = code.into_code_object(self);
crates/vm/src/vm/context.rs-636-        PyRef::new_ref(PyCode { code }, self.types.code_type.to_owned(), None)
crates/vm/src/vm/context.rs-637-    }
crates/vm/src/vm/context.rs-638-}
--
crates/vm/src/types/slot.rs-213-    #[derive(Copy, Clone, Debug, PartialEq)]
crates/vm/src/types/slot.rs-214-    #[non_exhaustive]
crates/vm/src/types/slot.rs:215:    pub struct PyTypeFlags: u64 {
crates/vm/src/types/slot.rs-216-        const MANAGED_DICT = 1 << 4;
crates/vm/src/types/slot.rs-217-        const SEQUENCE = 1 << 5;
crates/vm/src/types/slot.rs-218-        const MAPPING = 1 << 6;
crates/vm/src/types/slot.rs-219-        const DISALLOW_INSTANTIATION = 1 << 7;
crates/vm/src/types/slot.rs-220-        const IMMUTABLETYPE = 1 << 8;
crates/vm/src/types/slot.rs-221-        const HEAPTYPE = 1 << 9;
crates/vm/src/types/slot.rs-222-        const BASETYPE = 1 << 10;
crates/vm/src/types/slot.rs-223-        const METHOD_DESCRIPTOR = 1 << 17;
crates/vm/src/types/slot.rs-224-        // For built-in types that match the subject itself in pattern matching
crates/vm/src/types/slot.rs-225-        // (bool, int, float, str, bytes, bytearray, list, tuple, dict, set, frozenset)
crates/vm/src/types/slot.rs-226-        // This is not a stable API
crates/vm/src/types/slot.rs-227-        const _MATCH_SELF = 1 << 22;
crates/vm/src/types/slot.rs:228:        const HAS_DICT = 1 << 40;
crates/vm/src/types/slot.rs-229-
crates/vm/src/types/slot.rs-230-        #[cfg(debug_assertions)]
crates/vm/src/types/slot.rs-231-        const _CREATED_WITH_FLAGS = 1 << 63;
crates/vm/src/types/slot.rs-232-    }
crates/vm/src/types/slot.rs-233-}
crates/vm/src/types/slot.rs-234-

Script executed:

#!/bin/bash
# Look at the exact implementation in object.rs around lines 210-229
sed -n '200,240p' crates/vm/src/builtins/object.rs

Repository: youknowone/RustPython


Repository: youknowone/RustPython
Exit code: 0

stdout:

        // let state = object_get_dict(obj.clone(), obj.ctx()).unwrap();
        let Some(state) = obj.dict() else {
            return Ok(vm.ctx.none());
        };
        state.into()
    };

    let slot_names =
        type_slot_names(obj.class(), vm).map_err(|_| vm.new_type_error("cannot pickle object"))?;

    if required {
        // Start with PyBaseObject_Type's basicsize
        let mut basicsize = vm.ctx.types.object_type.slots.basicsize;

        // Add __dict__ size if type has dict
        if obj.class().slots.flags.has_feature(PyTypeFlags::HAS_DICT) {
            basicsize += core::mem::size_of::<PyObjectRef>();
        }

        // Add slots size
        if let Some(ref slot_names) = slot_names {
            basicsize += core::mem::size_of::<PyObjectRef>() * slot_names.__len__();
        }

        // Fail if actual type's basicsize > expected basicsize
        if obj.class().slots.basicsize > basicsize {
            return Err(
                vm.new_type_error(format!("cannot pickle '{}' object", obj.class().name()))
            );
        }
    }

    if let Some(slot_names) = slot_names {
        let slot_names_len = slot_names.__len__();
        if slot_names_len > 0 {
            let slots = vm.ctx.new_dict();
            for i in 0..slot_names_len {
                let borrowed_names = slot_names.borrow_vec();
                let name = borrowed_names[i].downcast_ref::<PyStr>().unwrap();
                let Ok(value) = obj.get_attr(name, vm) else {
                    continue;

Script executed:

#!/bin/bash
# Search for copyreg._slotnames and how it handles __weakref__
rg -n "_slotnames|slotnames" crates -g '*.rs' -g '*.py' -A 3 -B 1

Repository: youknowone/RustPython


Repository: youknowone/RustPython
Exit code: 0

stdout:

crates/vm/src/vm/context.rs-219-    __slots__,
crates/vm/src/vm/context.rs:220:    __slotnames__,
crates/vm/src/vm/context.rs-221-    __str__,
crates/vm/src/vm/context.rs-222-    __sub__,
crates/vm/src/vm/context.rs-223-    __subclasscheck__,
--
crates/vm/src/builtins/object.rs-163-    // let attributes = typ.attributes.read();
crates/vm/src/builtins/object.rs:164:    // if let Some(slot_names) = attributes.get(identifier!(vm.ctx, __slotnames__)) {
crates/vm/src/builtins/object.rs-165-    //     return match_class!(match slot_names.clone() {
crates/vm/src/builtins/object.rs-166-    //         l @ super::PyList => Ok(Some(l)),
crates/vm/src/builtins/object.rs-167-    //         _n @ super::PyNone => Ok(None),
crates/vm/src/builtins/object.rs-168-    //         _ => Err(vm.new_type_error(format!(
crates/vm/src/builtins/object.rs:169:    //             "{:.200}.__slotnames__ should be a list or None, not {:.200}",
crates/vm/src/builtins/object.rs-170-    //             typ.name(),
crates/vm/src/builtins/object.rs-171-    //             slot_names.class().name()
crates/vm/src/builtins/object.rs-172-    //         ))),
--
crates/vm/src/builtins/object.rs-176-    let copyreg = vm.import("copyreg", 0)?;
crates/vm/src/builtins/object.rs:177:    let copyreg_slotnames = copyreg.get_attr("_slotnames", vm)?;
crates/vm/src/builtins/object.rs:178:    let slot_names = copyreg_slotnames.call((typ.to_owned(),), vm)?;
crates/vm/src/builtins/object.rs-179-    let result = match_class!(match slot_names {
crates/vm/src/builtins/object.rs-180-        l @ super::PyList => Some(l),
crates/vm/src/builtins/object.rs-181-        _n @ super::PyNone => None,
crates/vm/src/builtins/object.rs:182:        _ => return Err(vm.new_type_error("copyreg._slotnames didn't return a list or None")),
crates/vm/src/builtins/object.rs-183-    });
crates/vm/src/builtins/object.rs-184-    Ok(result)
crates/vm/src/builtins/object.rs-185-}
--
crates/vm/Lib/python_builtins/__reducelib.py-49-        state = getattr(obj, "__dict__", None)
crates/vm/Lib/python_builtins/__reducelib.py:50:        names = slotnames(cls)  # not checking for list
crates/vm/Lib/python_builtins/__reducelib.py-51-        if names is not None:
crates/vm/Lib/python_builtins/__reducelib.py-52-            slots = {}
crates/vm/Lib/python_builtins/__reducelib.py-53-            for name in names:
--
crates/vm/Lib/python_builtins/__reducelib.py-73-
crates/vm/Lib/python_builtins/__reducelib.py:74:def slotnames(cls):
crates/vm/Lib/python_builtins/__reducelib.py-75-    if not isinstance(cls, type):
crates/vm/Lib/python_builtins/__reducelib.py-76-        return None
crates/vm/Lib/python_builtins/__reducelib.py-77-
crates/vm/Lib/python_builtins/__reducelib.py-78-    try:
crates/vm/Lib/python_builtins/__reducelib.py:79:        return cls.__dict__["__slotnames__"]
crates/vm/Lib/python_builtins/__reducelib.py-80-    except KeyError:
crates/vm/Lib/python_builtins/__reducelib.py-81-        pass
crates/vm/Lib/python_builtins/__reducelib.py-82-
crates/vm/Lib/python_builtins/__reducelib.py:83:    slotnames = copyreg._slotnames(cls)
crates/vm/Lib/python_builtins/__reducelib.py:84:    if not isinstance(slotnames, list) and slotnames is not None:
crates/vm/Lib/python_builtins/__reducelib.py:85:        raise TypeError("copyreg._slotnames didn't return a list or None")
crates/vm/Lib/python_builtins/__reducelib.py:86:    return slotnames

Script executed:

#!/bin/bash
# Find the copyreg._slotnames implementation
rg -n "_slotnames" crates/vm/Lib -A 20 | head -80

Repository: youknowone/RustPython


Repository: youknowone/RustPython
Exit code: 0

stdout:

crates/vm/Lib/python_builtins/__reducelib.py:79:        return cls.__dict__["__slotnames__"]
crates/vm/Lib/python_builtins/__reducelib.py-80-    except KeyError:
crates/vm/Lib/python_builtins/__reducelib.py-81-        pass
crates/vm/Lib/python_builtins/__reducelib.py-82-
crates/vm/Lib/python_builtins/__reducelib.py:83:    slotnames = copyreg._slotnames(cls)
crates/vm/Lib/python_builtins/__reducelib.py-84-    if not isinstance(slotnames, list) and slotnames is not None:
crates/vm/Lib/python_builtins/__reducelib.py:85:        raise TypeError("copyreg._slotnames didn't return a list or None")
crates/vm/Lib/python_builtins/__reducelib.py-86-    return slotnames

Script executed:

#!/bin/bash
# Look for copyreg module implementation
find crates -name "copyreg*" -o -name "*copyreg*"

Repository: youknowone/RustPython


Repository: youknowone/RustPython
Exit code: 0

stdout:

crates/vm/Lib/core_modules/copyreg.py

Script executed:

#!/bin/bash
# Check __reducelib.py for _slotnames and weakref handling
sed -n '1,150p' crates/vm/Lib/python_builtins/__reducelib.py

Repository: youknowone/RustPython


Repository: youknowone/RustPython
Exit code: 0

stdout:

# Modified from code from the PyPy project:
# https://bitbucket.org/pypy/pypy/src/default/pypy/objspace/std/objectobject.py

# The MIT License

# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation
# files (the "Software"), to deal in the Software without
# restriction, including without limitation the rights to use,
# copy, modify, merge, publish, distribute, sublicense, and/or
# sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.

import copyreg


def _abstract_method_error(typ):
    methods = ", ".join(sorted(typ.__abstractmethods__))
    err = "Can't instantiate abstract class %s with abstract methods %s"
    raise TypeError(err % (typ.__name__, methods))


def reduce_2(obj):
    cls = obj.__class__

    try:
        getnewargs = obj.__getnewargs__
    except AttributeError:
        args = ()
    else:
        args = getnewargs()
        if not isinstance(args, tuple):
            raise TypeError("__getnewargs__ should return a tuple")

    try:
        getstate = obj.__getstate__
    except AttributeError:
        state = getattr(obj, "__dict__", None)
        names = slotnames(cls)  # not checking for list
        if names is not None:
            slots = {}
            for name in names:
                try:
                    value = getattr(obj, name)
                except AttributeError:
                    pass
                else:
                    slots[name] = value
            if slots:
                state = state, slots
    else:
        state = getstate()

    listitems = iter(obj) if isinstance(obj, list) else None
    dictitems = iter(obj.items()) if isinstance(obj, dict) else None

    newobj = copyreg.__newobj__

    args2 = (cls,) + args
    return newobj, args2, state, listitems, dictitems


def slotnames(cls):
    if not isinstance(cls, type):
        return None

    try:
        return cls.__dict__["__slotnames__"]
    except KeyError:
        pass

    slotnames = copyreg._slotnames(cls)
    if not isinstance(slotnames, list) and slotnames is not None:
        raise TypeError("copyreg._slotnames didn't return a list or None")
    return slotnames

Script executed:

#!/bin/bash
# Search for __weakref__ references in the codebase
rg -n "__weakref__" crates -g '*.rs' -g '*.py'

Repository: youknowone/RustPython


Repository: youknowone/RustPython
Exit code: 0

stdout:

crates/doc/src/data.inc.rs:483:    "_curses.error.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:629:    "_curses_panel.error.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:691:    "_dbm.error.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:730:    "_gdbm.error.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:839:    "_hashlib.UnsupportedDigestmodError.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:912:    "_interpchannels.ChannelClosedError.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:937:    "_interpchannels.ChannelEmptyError.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:962:    "_interpchannels.ChannelError.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:1061:    "_interpchannels.ChannelNotEmptyError.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:1086:    "_interpchannels.ChannelNotFoundError.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:1934:    "_lzma.LZMAError.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:2235:    "_pickle.PickleError.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:2284:    "_pickle.PicklingError.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:2333:    "_pickle.UnpicklingError.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:2369:    "_queue.Empty.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:3294:    "_tkinter.TclError.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:3598:    "binascii.Error.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:3623:    "binascii.Incomplete.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:4074:    "builtins.DynamicClassAttribute.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:4203:    "builtins.ExceptionGroup.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:9293:    "termios.error.__weakref__" => "list of weak references to the object",
crates/doc/src/data.inc.rs:9558:    "zlib.error.__weakref__" => "list of weak references to the object",
crates/vm/src/builtins/type.rs:1232:                .any(|s| s.as_str() != "__dict__" && s.as_str() != "__weakref__");
crates/vm/src/builtins/type.rs:1261:                // Check for duplicate __weakref__
crates/vm/src/builtins/type.rs:1262:                if slot_name == "__weakref__" {
crates/vm/src/builtins/type.rs:1265:                            "__weakref__ slot disallowed: we already got one".to_owned(),
crates/vm/src/builtins/type.rs:1287:            // Check if base class already has __weakref__ - can't redefine it
crates/vm/src/builtins/type.rs:1290:            // 2. It's a heap type with __weakref__ in its __slots__
crates/vm/src/builtins/type.rs:1296:                        // Heap type with __slots__ - check if __weakref__ is in slots
crates/vm/src/builtins/type.rs:1297:                        Some(base_slots) => base_slots.iter().any(|s| s.as_str() == "__weakref__"),
crates/vm/src/builtins/type.rs:1300:                    // Builtin type - check if it has __weakref__ descriptor
crates/vm/src/builtins/type.rs:1301:                    let weakref_name = vm.ctx.intern_str("__weakref__");
crates/vm/src/builtins/type.rs:1307:                        "__weakref__ slot disallowed: we already got one".to_owned(),

Script executed:

#!/bin/bash
# Check if there's a native copyreg._slotnames implementation
cat crates/vm/Lib/core_modules/copyreg.py | head -200

Repository: youknowone/RustPython


Repository: youknowone/RustPython
Exit code: 0

stdout:

"""Helper to provide extensibility for pickle.

This is only useful to add pickle support for extension types defined in
C, not for instances of user-defined classes.
"""

__all__ = ["pickle", "constructor",
           "add_extension", "remove_extension", "clear_extension_cache"]

dispatch_table = {}

def pickle(ob_type, pickle_function, constructor_ob=None):
    if not callable(pickle_function):
        raise TypeError("reduction functions must be callable")
    dispatch_table[ob_type] = pickle_function

    # The constructor_ob function is a vestige of safe for unpickling.
    # There is no reason for the caller to pass it anymore.
    if constructor_ob is not None:
        constructor(constructor_ob)

def constructor(object):
    if not callable(object):
        raise TypeError("constructors must be callable")

# Example: provide pickling support for complex numbers.

def pickle_complex(c):
    return complex, (c.real, c.imag)

pickle(complex, pickle_complex, complex)

def pickle_union(obj):
    import functools, operator
    return functools.reduce, (operator.or_, obj.__args__)

pickle(type(int | str), pickle_union)

# Support for pickling new-style objects

def _reconstructor(cls, base, state):
    if base is object:
        obj = object.__new__(cls)
    else:
        obj = base.__new__(cls, state)
        if base.__init__ != object.__init__:
            base.__init__(obj, state)
    return obj

_HEAPTYPE = 1<<9
_new_type = type(int.__new__)

# Python code for object.__reduce_ex__ for protocols 0 and 1

def _reduce_ex(self, proto):
    assert proto < 2
    cls = self.__class__
    for base in cls.__mro__:
        if hasattr(base, '__flags__') and not base.__flags__ & _HEAPTYPE:
            break
        new = base.__new__
        if isinstance(new, _new_type) and new.__self__ is base:
            break
    else:
        base = object # not really reachable
    if base is object:
        state = None
    else:
        if base is cls:
            raise TypeError(f"cannot pickle {cls.__name__!r} object")
        state = base(self)
    args = (cls, base, state)
    try:
        getstate = self.__getstate__
    except AttributeError:
        if getattr(self, "__slots__", None):
            raise TypeError(f"cannot pickle {cls.__name__!r} object: "
                            f"a class that defines __slots__ without "
                            f"defining __getstate__ cannot be pickled "
                            f"with protocol {proto}") from None
        try:
            dict = self.__dict__
        except AttributeError:
            dict = None
    else:
        if (type(self).__getstate__ is object.__getstate__ and
            getattr(self, "__slots__", None)):
            raise TypeError("a class that defines __slots__ without "
                            "defining __getstate__ cannot be pickled")
        dict = getstate()
    if dict:
        return _reconstructor, args, dict
    else:
        return _reconstructor, args

# Helper for __reduce_ex__ protocol 2

def __newobj__(cls, *args):
    return cls.__new__(cls, *args)

def __newobj_ex__(cls, args, kwargs):
    """Used by pickle protocol 4, instead of __newobj__ to allow classes with
    keyword-only arguments to be pickled correctly.
    """
    return cls.__new__(cls, *args, **kwargs)

def _slotnames(cls):
    """Return a list of slot names for a given class.

    This needs to find slots defined by the class and its bases, so we
    can't simply return the __slots__ attribute.  We must walk down
    the Method Resolution Order and concatenate the __slots__ of each
    class found there.  (This assumes classes don't modify their
    __slots__ attribute to misrepresent their slots after the class is
    defined.)
    """

    # Get the value from a cache in the class if possible
    names = cls.__dict__.get("__slotnames__")
    if names is not None:
        return names

    # Not cached -- calculate the value
    names = []
    if not hasattr(cls, "__slots__"):
        # This class has no slots
        pass
    else:
        # Slots found -- gather slot names from all base classes
        for c in cls.__mro__:
            if "__slots__" in c.__dict__:
                slots = c.__dict__['__slots__']
                # if class has a single slot, it can be given as a string
                if isinstance(slots, str):
                    slots = (slots,)
                for name in slots:
                    # special descriptors
                    if name in ("__dict__", "__weakref__"):
                        continue
                    # mangled names
                    elif name.startswith('__') and not name.endswith('__'):
                        stripped = c.__name__.lstrip('_')
                        if stripped:
                            names.append('_%s%s' % (stripped, name))
                        else:
                            names.append(name)
                    else:
                        names.append(name)

    # Cache the outcome in the class if at all possible
    try:
        cls.__slotnames__ = names
    except:
        pass # But don't die if we can't

    return names

# A registry of extension codes.  This is an ad-hoc compression
# mechanism.  Whenever a global reference to <module>, <name> is about
# to be pickled, the (<module>, <name>) tuple is looked up here to see
# if it is a registered extension code for it.  Extension codes are
# universal, so that the meaning of a pickle does not depend on
# context.  (There are also some codes reserved for local use that
# don't have this restriction.)  Codes are positive ints; 0 is
# reserved.

_extension_registry = {}                # key -> code
_inverted_registry = {}                 # code -> key
_extension_cache = {}                   # code -> object
# Don't ever rebind those names:  pickling grabs a reference to them when
# it's initialized, and won't see a rebinding.

def add_extension(module, name, code):
    """Register an extension code."""
    code = int(code)
    if not 1 <= code <= 0x7fffffff:
        raise ValueError("code out of range")
    key = (module, name)
    if (_extension_registry.get(key) == code and
        _inverted_registry.get(code) == key):
        return # Redundant registrations are benign
    if key in _extension_registry:
        raise ValueError("key %s is already registered with code %s" %
                         (key, _extension_registry[key]))
    if code in _inverted_registry:
        raise ValueError("code %s is already in use for key %s" %
                         (code, _inverted_registry[code]))
    _extension_registry[key] = code
    _inverted_registry[code] = key

def remove_extension(module, name, code):
    """Unregister an extension code.  For testing only."""
    key = (module, name)
    if (_extension_registry.get(key) != code or
        _inverted_registry.get(code) != key):
        raise ValueError("key %s is not registered with code %s" %
                         (key, code))
    del _extension_registry[key]
    del _inverted_registry[code]
    if code in _extension_cache:

Script executed:

#!/bin/bash
# Search for where weakref descriptor/slot is actually stored/handled
rg -n "weakref" crates/vm/src/builtins/type.rs -B 2 -A 2 | head -100

Repository: youknowone/RustPython


Repository: youknowone/RustPython
Exit code: 0

stdout:

347-        new_type.init_slots(ctx);
348-
349:        let weakref_type = super::PyWeak::static_type();
350-        for base in new_type.bases.read().iter() {
351-            base.subclasses.write().push(
352-                new_type
353-                    .as_object()
354:                    .downgrade_with_weakref_typ_opt(None, weakref_type.to_owned())
355-                    .unwrap(),
356-            );
--
403-        Self::set_new(&new_type.slots, &new_type.base);
404-
405:        let weakref_type = super::PyWeak::static_type();
406-        for base in new_type.bases.read().iter() {
407-            base.subclasses.write().push(
408-                new_type
409-                    .as_object()
410:                    .downgrade_with_weakref_typ_opt(None, weakref_type.to_owned())
411-                    .unwrap(),
412-            );
--
758-
759-        // Register this type as a subclass of its new bases
760:        let weakref_type = super::PyWeak::static_type();
761-        for base in zelf.bases.read().iter() {
762-            base.subclasses.write().push(
763-                zelf.as_object()
764:                    .downgrade_with_weakref_typ_opt(None, weakref_type.to_owned())
765-                    .unwrap(),
766-            );
--
1227-            // Check if base has itemsize > 0 - can't add arbitrary slots to variable-size types
1228-            // Types like int, bytes, tuple have itemsize > 0 and don't allow custom slots
1229:            // But types like weakref.ref have itemsize = 0 and DO allow slots
1230-            let has_custom_slots = slots
1231-                .iter()
1232:                .any(|s| s.as_str() != "__dict__" && s.as_str() != "__weakref__");
1233-            if has_custom_slots && base.slots.itemsize > 0 {
1234-                return Err(vm.new_type_error(format!(
--
1240-            // Validate slot names and track duplicates
1241-            let mut seen_dict = false;
1242:            let mut seen_weakref = false;
1243-            for slot in slots.iter() {
1244-                // Use isidentifier for validation (handles Unicode properly)
--
1259-                }
1260-
1261:                // Check for duplicate __weakref__
1262:                if slot_name == "__weakref__" {
1263:                    if seen_weakref {
1264-                        return Err(vm.new_type_error(
1265:                            "__weakref__ slot disallowed: we already got one".to_owned(),
1266-                        ));
1267-                    }
1268:                    seen_weakref = true;
1269-                }
1270-
--
1285-            }
1286-
1287:            // Check if base class already has __weakref__ - can't redefine it
1288:            // A base has weakref support if:
1289:            // 1. It's a heap type without explicit __slots__ (automatic weakref), OR
1290:            // 2. It's a heap type with __weakref__ in its __slots__
1291:            if seen_weakref {
1292:                let base_has_weakref = if let Some(ref ext) = base.heaptype_ext {
1293-                    match &ext.slots {
1294:                        // Heap type without __slots__ - has automatic weakref
1295-                        None => true,
1296:                        // Heap type with __slots__ - check if __weakref__ is in slots
1297:                        Some(base_slots) => base_slots.iter().any(|s| s.as_str() == "__weakref__"),
1298-                    }
1299-                } else {
1300:                    // Builtin type - check if it has __weakref__ descriptor
1301:                    let weakref_name = vm.ctx.intern_str("__weakref__");
1302:                    base.attributes.read().contains_key(weakref_name)
1303-                };
1304-
1305:                if base_has_weakref {
1306-                    return Err(vm.new_type_error(
1307:                        "__weakref__ slot disallowed: we already got one".to_owned(),
1308-                    ));
1309-                }

Account for weakref storage in the basicsize check.

Line 215: the expected basicsize only adds dict + slot entries. However, __weakref__ is excluded from copyreg._slotnames (see copyreg.py line 144), so types with weakref support may incorrectly fail the basicsize > expected check.

Note: The proposed fix cannot use PyTypeFlags::HAS_WEAKREF since this flag does not exist in RustPython. A correct fix would instead need to check if the type's __slots__ contains "__weakref__" (via the heaptype_ext if available) or detect automatic weakref support on heap types. Consider examining how CPython's pickle handles this case and whether the basicsize calculation should account for it separately.

🤖 Prompt for AI Agents
In `@crates/vm/src/builtins/object.rs` around lines 210 - 229, The basicsize
calculation for pickle currently omits space for automatic weakref storage,
causing types with a "__weakref__" slot to falsely fail the size check; update
the block in object.rs (around basicsize, obj.class(), slot_names) to detect if
the type provides weakref storage and add core::mem::size_of::<PyObjectRef>() to
basicsize when present—do this by checking the type's slots for a "__weakref__"
entry (e.g., inspect slot_names for "__weakref__" or use the heap-type
extension/heap type metadata on obj.class() to detect automatic weakref support)
rather than relying on a non-existent PyTypeFlags::HAS_WEAKREF, then re-run the
existing comparison against obj.class().slots.basicsize.

Comment on lines +637 to +661
/// object_getstate - calls __getstate__ method or default implementation
fn object_getstate(obj: &PyObject, required: bool, vm: &VirtualMachine) -> PyResult {
// If __getstate__ is not overridden, use the default implementation with required flag
if !is_getstate_overridden(obj, vm) {
return object_getstate_default(obj, required, vm);
}

// __getstate__ is overridden, call it without required
let getstate = obj.get_attr(identifier!(vm, __getstate__), vm)?;
getstate.call((), vm)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard against __getstate__ = None before calling.

Line 645: if a class sets __getstate__ = None, is_getstate_overridden() returns true and this path attempts to call None, raising a TypeError. The prior behavior returned None state.

🐛 Proposed fix
     // __getstate__ is overridden, call it without required
     let getstate = obj.get_attr(identifier!(vm, __getstate__), vm)?;
-    getstate.call((), vm)
+    if vm.is_none(&getstate) {
+        return Ok(vm.ctx.none());
+    }
+    getstate.call((), vm)
🤖 Prompt for AI Agents
In `@crates/vm/src/builtins/object.rs` around lines 637 - 647, The code calls the
retrieved __getstate__ attribute unconditionally, which fails if a class sets
__getstate__ = None; modify object_getstate to, after obtaining getstate via
obj.get_attr(identifier!(vm, __getstate__), vm) and before calling
getstate.call(...), check whether the attribute is None (using the
PyObject/Value API available in this module/context) and if so return the None
state (e.g. Ok(vm.ctx.none()))—otherwise proceed to call getstate; keep the
existing fallback to object_getstate_default and existing error propagation from
get_attr intact.

Comment on lines +4106 to +4127
fn __setstate__(zelf: PyRef<Self>, state: PyTupleRef, vm: &VirtualMachine) -> PyResult<()> {
if state.len() != 4 {
return Err(vm.new_type_error(format!(
"__setstate__ argument should be 4-tuple, got {}",
state.len()
)));
}

let content: PyStrRef = state[0].clone().try_into_value(vm)?;
// state[1] is newline - TODO: use when newline handling is implemented
let pos: u64 = state[2].clone().try_into_value(vm)?;
let dict = &state[3];

// Set content
let raw_bytes = content.as_bytes().to_vec();
*zelf.buffer.write() = BufferedIO::new(Cursor::new(raw_bytes));

// Set position
zelf.buffer(vm)?
.seek(SeekFrom::Start(pos))
.map_err(|err| os_err(vm, err))?;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Avoid mutating closed StringIO before validation.

Line 4120 resets the buffer via write() before buffer(vm)? enforces the closed check, so a closed object can be partially mutated even though the call later errors. Prefer validating once and performing mutation through the validated guard.

🛠️ Suggested fix
-            // Set content
-            let raw_bytes = content.as_bytes().to_vec();
-            *zelf.buffer.write() = BufferedIO::new(Cursor::new(raw_bytes));
-
-            // Set position
-            zelf.buffer(vm)?
-                .seek(SeekFrom::Start(pos))
-                .map_err(|err| os_err(vm, err))?;
+            // Validate closed state and update buffer atomically
+            let mut buffer = zelf.buffer(vm)?;
+            let raw_bytes = content.as_bytes().to_vec();
+            *buffer = BufferedIO::new(Cursor::new(raw_bytes));
+            buffer
+                .seek(SeekFrom::Start(pos))
+                .map_err(|err| os_err(vm, err))?;
+            drop(buffer);
🤖 Prompt for AI Agents
In `@crates/vm/src/stdlib/io.rs` around lines 4106 - 4127, The __setstate__
implementation mutates zelf.buffer via write() before calling zelf.buffer(vm)?
which performs the closed-object validation; change the flow to validate first
by calling zelf.buffer(vm)? to obtain the validated guard and then perform
mutations through that guard (replace the BufferedIO via the guard and seek
using the validated guard), removing the earlier direct write() call and
ensuring no mutation occurs before the closed check.

Comment on lines 4311 to 4334
#[pymethod]
fn __setstate__(zelf: PyRef<Self>, state: PyTupleRef, vm: &VirtualMachine) -> PyResult<()> {
if zelf.closed.load() {
return Err(vm.new_value_error("__setstate__ on closed file"));
}
if state.len() != 3 {
return Err(vm.new_type_error(format!(
"__setstate__ argument should be 3-tuple, got {}",
state.len()
)));
}

let content: PyBytesRef = state[0].clone().try_into_value(vm)?;
let pos: u64 = state[1].clone().try_into_value(vm)?;
let dict = &state[2];

// Set content
*zelf.buffer.write() = BufferedIO::new(Cursor::new(content.as_bytes().to_vec()));

// Set position
zelf.buffer(vm)?
.seek(SeekFrom::Start(pos))
.map_err(|err| os_err(vm, err))?;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Respect BytesIO export safety when restoring state.

Line 4328 overwrites the buffer without checking active exports. This bypasses the existing resize guard (exports) and can violate buffer protocol expectations. Use the same resizable guard used by write/truncate.

🛠️ Suggested fix
-            // Set content
-            *zelf.buffer.write() = BufferedIO::new(Cursor::new(content.as_bytes().to_vec()));
-
-            // Set position
-            zelf.buffer(vm)?
-                .seek(SeekFrom::Start(pos))
-                .map_err(|err| os_err(vm, err))?;
+            // Set content (respect export guards)
+            let mut buffer = zelf.try_resizable(vm)?;
+            *buffer = BufferedIO::new(Cursor::new(content.as_bytes().to_vec()));
+            buffer
+                .seek(SeekFrom::Start(pos))
+                .map_err(|err| os_err(vm, err))?;
+            drop(buffer);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#[pymethod]
fn __setstate__(zelf: PyRef<Self>, state: PyTupleRef, vm: &VirtualMachine) -> PyResult<()> {
if zelf.closed.load() {
return Err(vm.new_value_error("__setstate__ on closed file"));
}
if state.len() != 3 {
return Err(vm.new_type_error(format!(
"__setstate__ argument should be 3-tuple, got {}",
state.len()
)));
}
let content: PyBytesRef = state[0].clone().try_into_value(vm)?;
let pos: u64 = state[1].clone().try_into_value(vm)?;
let dict = &state[2];
// Set content
*zelf.buffer.write() = BufferedIO::new(Cursor::new(content.as_bytes().to_vec()));
// Set position
zelf.buffer(vm)?
.seek(SeekFrom::Start(pos))
.map_err(|err| os_err(vm, err))?;
#[pymethod]
fn __setstate__(zelf: PyRef<Self>, state: PyTupleRef, vm: &VirtualMachine) -> PyResult<()> {
if zelf.closed.load() {
return Err(vm.new_value_error("__setstate__ on closed file"));
}
if state.len() != 3 {
return Err(vm.new_type_error(format!(
"__setstate__ argument should be 3-tuple, got {}",
state.len()
)));
}
let content: PyBytesRef = state[0].clone().try_into_value(vm)?;
let pos: u64 = state[1].clone().try_into_value(vm)?;
let dict = &state[2];
// Set content (respect export guards)
let mut buffer = zelf.try_resizable(vm)?;
*buffer = BufferedIO::new(Cursor::new(content.as_bytes().to_vec()));
buffer
.seek(SeekFrom::Start(pos))
.map_err(|err| os_err(vm, err))?;
drop(buffer);
🤖 Prompt for AI Agents
In `@crates/vm/src/stdlib/io.rs` around lines 4311 - 4334, __setstate__ currently
overwrites zelf.buffer.write() without honoring active exports, bypassing the
resize guard used by write/truncate; change the code to acquire the same
resizable/export guard used by the write/truncate paths before replacing the
buffer: use zelf.buffer(vm)? (the same accessor used below) to obtain the buffer
handle that enforces exports/resize safety (the guard used by write/truncate),
then replace the inner BufferedIO (BufferedIO::new(Cursor::new(...))) via that
guarded handle instead of directly calling zelf.buffer.write(), so active
exports are checked and the buffer protocol invariants are preserved.

terryluan12 and others added 7 commits January 16, 2026 23:03
* Updated urllib + urllib tests

* Updated http + test

* Annotated failing tests

* Fixed issues in httpservers, robotparser and urllib2net

* Fixed windows only success

* Annotated flaky test in test_logging
* Add all other bytecodes

* Mark passing/failing tests
* correct buitins type

* Auto-format: ruff format

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants