Skip to content

Conversation

oremanj
Copy link
Collaborator

@oremanj oremanj commented Oct 11, 2025

Description

There are a few ways for pybind11 to perform a from-Python cast that yields a pointer to the C++ object through other than the usual mechanism: loading an instance of a different module's module-local type, loading through the conduit mechanism, and more upcoming in #5800. These have in common that they currently ignore any holders, so trying to load a std::shared_ptr<T> for such a T will appear to succeed but will produce a default-constructed shared_ptr. Solve this issue by adding a new method set_foreign_holder() which should be implemented by subclasses of type_caster_base (such as copyable_holder_caster) and will be called after the base caster has successfully loaded the value pointer for these foreign-sourced loads.

The provided implementation of set_foreign_holder supports only loads to shared_ptr<T>, and it does so using a different approach than native loads: since the true holder is not available, we create a new shared_ptr that manages a reference to the Python object. (If the object supports enable_shared_from_this, then we use that to locate the true holder and take another reference to it.)

Suggested changelog entry:

When loading an instance of a pybind11 class T whose type info is not available to the module that's loading it -- i.e., if it's a module-local type defined in a different module, or a type defined by a different version of pybind11 and used through the conduit mechanism -- pybind11 can now populate a std::shared_ptr<T>, in addition to the previous support for a raw T*. Note that unless T implements enable_shared_from_this, the resulting shared_ptr will own a new reference to the underlying Python object, rather than sharing ownership with the C++ object inside the Python object.


📚 Documentation preview 📚: https://pybind11--5862.org.readthedocs.build/

struct py_deleter {
void operator()(const void *) const noexcept {
// Don't run the deleter if the interpreter has been shut down
if (Py_IsInitialized() == 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this PR: Could you please make this

            if (Py_IsInitialized() == 0 || _Py_IsFinalizing() != 0) {

?


Separately later: see draft PR #5864

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Py_IsFinalizing remains true after Py_Finalize[Ex] has returned, so a true return from it does not imply it's safe to decref. Doing a decref after the interpreter has finalized will crash the program; skipping one during finalization (when it would still be OK) causes a memory leak. Of the two, I prefer the memory leak.

The more correct solution would be to have our own "interpreter has finalized" flag that we set from either the destructor of the internals capsule or a Py_AtExit callback. (Probably only the former is subinterpreter-safe.) Both of those occur very late in interpreter finalization, and in particular occur after the final GC pass.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm ...

skipping one during finalization (when it would still be OK)

I assumed it's not OK. — How can we gain certainty?

Claude and ChatGPT both suggest that guarding with !Py_IsFinalizing() is important.

Py_IsFinalizing remains true after Py_Finalize[Ex] has returned

I assumed Py_IsInitialized() will then be false.

We have this existing code:

#ifdef GRAALVM_PYTHON
if (!Py_IsInitialized() || _Py_IsFinalizing()) {
return;
}
#endif

That's GraalPy-specific, but I was thinking it's correct in general.

I also assumed _Py_IsFinalizing() still works with all Python versions, although only Py_IsFinalizing() is actually a public API, since Python 3.9; but we are still supporting Python 3.8.


Everything else in this PR looks great to me, it's just this one line that I'm still worried about.

Copy link
Collaborator Author

@oremanj oremanj Oct 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I misread your suggestion as trying to allow decref during finalization. As written, the IsFinalizing term is redundant because IsFinalizing becomes true exactly when IsInitialized becomes false. From cpython/Python/pylifecycle.c:

    _PyInterpreterState_SetFinalizing(tstate->interp, tstate);
    _PyRuntimeState_SetFinalizing(runtime, tstate);
    runtime->initialized = 0;
    runtime->core_initialized = 0;

IsFinalizing checks _PyRuntimeState_GetFinalizing while IsInitialized checks runtime->initialized.

I assumed it's not OK. — How can we gain certainty?

The best documentation I know of what happens during finalization is hudson-trading/pymetabind#2 (comment) -- this is not an API guarantee / could change, but as the comment itself notes, that doesn't mean we don't have to interact with it. Pablo is a CPython core developer who has done a lot of work on the finalization process.

The higher-level reason to believe it's OK to decref during finalization is that Python itself decrefs during finalization all over the place! Python actually goes to fairly great lengths to attempt to ensure that there are no objects left alive when the interpreter exits. That requires a lot of individual references to be broken. There is an entire GC pass that occurs after IsFinalizing becomes true.

We can assume that it's still OK to decref when our capsule destructor is invoked because our capsule destructor was invoked by a decref (the one took the refcount of the capsule to zero).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have time to comb through the cpython code, so I fell back to asking ChatGPT again (same URL, one extra prompt/response):

https://chatgpt.com/share/68ed515d-82e4-8008-8954-784843385bf9

It explained why it still recommends also checking Py_IsFinalizing() although only after acquiring the GIL (it suggested that before but I overlooked the specific order before).

Up to you. — In my code, I'd definitely want to use the suggested pattern, it's a very easy and inexpensive way to (maybe) err on the safer side.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ChatGPT is mostly wrong here. I will respond to its points in detail.

The order in CPython is “mark finalizing” → “clear initialized.”

True.

On a foreign thread without the GIL, you can observe finalizing == true while initialized == true for a moment.

True but irrelevant. When finalizing is set, non-daemon threads have already been joined, and daemonic threads are detached (unable to execute Python code). If this is running from a daemonic thread, it doesn't matter which value we observe for IsInitializing, because if we're finalizing then the thread will either freeze or exit when we attempt to acquire the GIL immediately below. (Since GIL acquisition can't fail, it freezes if you try to do it during finalization. This is documented and I believe there's a PEP that's trying to fix it.) We can only wind up actually executing the DECREF if we're not finalizing yet or if we're running in the main thread. But in the latter case, the race does not exist.

Subinterpreters: Py_IsInitialized() is a process-wide flag; it says nothing about whether a particular interpreter (or the one owning your object) is already tearing down. Py_IsFinalizing() is also global today, so neither API tells you “this object’s interpreter is safe.” Treat them as coarse signals, not per-interpreter guarantees.

This does not point at any distinction between the validity of the logic with the IsFinalizing check vs without.

CPython does a lot of decref/cleanup during shutdown, but under tight control:

This is FUD. It's a safe approximation but doesn't understand the details.

It holds the GIL, knows exactly which objects it’s touching, and often orders teardown to avoid arbitrary callbacks (or at least accepts the risk in well-understood places).

The final GC pass executes well after IsFinalizing becomes true and can destroy arbitrary user-created objects that do arbitrary things.

Your extension’s Py_DECREF from a foreign thread or a C++ destructor can invoke: [...]

Foreign thread is specious as explained above. Thus, any shared_ptr destruction that occurs during finalization must be indirectly executing from within a finalizer of a user-provided Python object. Thus it's OK for us to potentially invoke other finalizers of user-provided Python objects.

shared_ptr destruction after finalization is common if a shared_ptr is held in a C++-level global, and it is important that we not call into Python there. That's the main reason for the IsInitialized check.

Capsule destructor case: the fact that a decref inside CPython dropped the capsule to zero (calling your destructor) doesn’t make your subsequent Py_DECREFs safe—your destructor may touch objects tied to interpreters or modules already mid-teardown.

This is confusing the question about whether we can decref an arbitrary object during finalization (answer: yes because we're only doing it from within another decref of an arbitrary object) with the question about what we can do in the capsule destructor. The capsule destructor executes much later - clearing the interpreter state dict is one of the last things done during finalization. We should not run arbitrary Python code there. But we're not: I'm just proposing we have a C++ function that sets a C++ flag.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're lightyears ahead of me with your understanding of the cpython internals, especially threading and cleanup mechanics. Thanks for the explanation!


def test_shared(val, ctor, loader):
obj = ctor(val)
with suppress(AttributeError): # non-cpython VMs don't have getrefcount
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@msimacek I just looked:

$ cd tests/
$ git grep 'Cannot reliably trigger GC' | wc -l
48

Do you think we could enable more test coverage for GraalPy by adopting Joshua's approach here more widely?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem of gc.collect() not reliably GCing things is broader than the problem of not having refcounts. PyPy doesn't have refcounts but its gc.collect() works (if you call it a few times).

@oremanj oremanj requested a review from henryiii as a code owner October 13, 2025 18:20
@rwgk
Copy link
Collaborator

rwgk commented Oct 13, 2025

BTW: This job is hanging:

CI / 🐍 (macos-latest, 3.14t, -DCMAKE_CXX_STANDARD=20) / 🧪 (pull_request)

I've seen this several times before. It's definitely not related to this PR. I usually cancel and rerun.

@oremanj oremanj merged commit c7b4f66 into pybind:master Oct 14, 2025
84 of 85 checks passed
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Oct 14, 2025
@oremanj oremanj deleted the set-foreign-holder branch October 14, 2025 00:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs changelog Possibly needs a changelog entry

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants