-
Notifications
You must be signed in to change notification settings - Fork 2.2k
type_caster_generic: add set_foreign_holder method for subclasses to implement #5862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…usion against try_get_shared_from_this
include/pybind11/cast.h
Outdated
struct py_deleter { | ||
void operator()(const void *) const noexcept { | ||
// Don't run the deleter if the interpreter has been shut down | ||
if (Py_IsInitialized() == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this PR: Could you please make this
if (Py_IsInitialized() == 0 || _Py_IsFinalizing() != 0) {
?
Separately later: see draft PR #5864
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Py_IsFinalizing remains true after Py_Finalize[Ex] has returned, so a true return from it does not imply it's safe to decref. Doing a decref after the interpreter has finalized will crash the program; skipping one during finalization (when it would still be OK) causes a memory leak. Of the two, I prefer the memory leak.
The more correct solution would be to have our own "interpreter has finalized" flag that we set from either the destructor of the internals capsule or a Py_AtExit callback. (Probably only the former is subinterpreter-safe.) Both of those occur very late in interpreter finalization, and in particular occur after the final GC pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm ...
skipping one during finalization (when it would still be OK)
I assumed it's not OK. — How can we gain certainty?
Claude and ChatGPT both suggest that guarding with !Py_IsFinalizing()
is important.
Py_IsFinalizing remains true after Py_Finalize[Ex] has returned
I assumed Py_IsInitialized()
will then be false.
We have this existing code:
pybind11/include/pybind11/detail/internals.h
Lines 105 to 109 in 9f75202
#ifdef GRAALVM_PYTHON | |
if (!Py_IsInitialized() || _Py_IsFinalizing()) { | |
return; | |
} | |
#endif |
That's GraalPy-specific, but I was thinking it's correct in general.
I also assumed _Py_IsFinalizing()
still works with all Python versions, although only Py_IsFinalizing()
is actually a public API, since Python 3.9; but we are still supporting Python 3.8.
Everything else in this PR looks great to me, it's just this one line that I'm still worried about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I misread your suggestion as trying to allow decref during finalization. As written, the IsFinalizing term is redundant because IsFinalizing becomes true exactly when IsInitialized becomes false. From cpython/Python/pylifecycle.c
:
_PyInterpreterState_SetFinalizing(tstate->interp, tstate);
_PyRuntimeState_SetFinalizing(runtime, tstate);
runtime->initialized = 0;
runtime->core_initialized = 0;
IsFinalizing checks _PyRuntimeState_GetFinalizing
while IsInitialized checks runtime->initialized
.
I assumed it's not OK. — How can we gain certainty?
The best documentation I know of what happens during finalization is hudson-trading/pymetabind#2 (comment) -- this is not an API guarantee / could change, but as the comment itself notes, that doesn't mean we don't have to interact with it. Pablo is a CPython core developer who has done a lot of work on the finalization process.
The higher-level reason to believe it's OK to decref during finalization is that Python itself decrefs during finalization all over the place! Python actually goes to fairly great lengths to attempt to ensure that there are no objects left alive when the interpreter exits. That requires a lot of individual references to be broken. There is an entire GC pass that occurs after IsFinalizing becomes true.
We can assume that it's still OK to decref when our capsule destructor is invoked because our capsule destructor was invoked by a decref (the one took the refcount of the capsule to zero).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have time to comb through the cpython code, so I fell back to asking ChatGPT again (same URL, one extra prompt/response):
https://chatgpt.com/share/68ed515d-82e4-8008-8954-784843385bf9
It explained why it still recommends also checking Py_IsFinalizing()
although only after acquiring the GIL (it suggested that before but I overlooked the specific order before).
Up to you. — In my code, I'd definitely want to use the suggested pattern, it's a very easy and inexpensive way to (maybe) err on the safer side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ChatGPT is mostly wrong here. I will respond to its points in detail.
The order in CPython is “mark finalizing” → “clear initialized.”
True.
On a foreign thread without the GIL, you can observe finalizing == true while initialized == true for a moment.
True but irrelevant. When finalizing is set, non-daemon threads have already been joined, and daemonic threads are detached (unable to execute Python code). If this is running from a daemonic thread, it doesn't matter which value we observe for IsInitializing, because if we're finalizing then the thread will either freeze or exit when we attempt to acquire the GIL immediately below. (Since GIL acquisition can't fail, it freezes if you try to do it during finalization. This is documented and I believe there's a PEP that's trying to fix it.) We can only wind up actually executing the DECREF if we're not finalizing yet or if we're running in the main thread. But in the latter case, the race does not exist.
Subinterpreters: Py_IsInitialized() is a process-wide flag; it says nothing about whether a particular interpreter (or the one owning your object) is already tearing down. Py_IsFinalizing() is also global today, so neither API tells you “this object’s interpreter is safe.” Treat them as coarse signals, not per-interpreter guarantees.
This does not point at any distinction between the validity of the logic with the IsFinalizing check vs without.
CPython does a lot of decref/cleanup during shutdown, but under tight control:
This is FUD. It's a safe approximation but doesn't understand the details.
It holds the GIL, knows exactly which objects it’s touching, and often orders teardown to avoid arbitrary callbacks (or at least accepts the risk in well-understood places).
The final GC pass executes well after IsFinalizing becomes true and can destroy arbitrary user-created objects that do arbitrary things.
Your extension’s Py_DECREF from a foreign thread or a C++ destructor can invoke: [...]
Foreign thread is specious as explained above. Thus, any shared_ptr destruction that occurs during finalization must be indirectly executing from within a finalizer of a user-provided Python object. Thus it's OK for us to potentially invoke other finalizers of user-provided Python objects.
shared_ptr destruction after finalization is common if a shared_ptr is held in a C++-level global, and it is important that we not call into Python there. That's the main reason for the IsInitialized check.
Capsule destructor case: the fact that a decref inside CPython dropped the capsule to zero (calling your destructor) doesn’t make your subsequent Py_DECREFs safe—your destructor may touch objects tied to interpreters or modules already mid-teardown.
This is confusing the question about whether we can decref an arbitrary object during finalization (answer: yes because we're only doing it from within another decref of an arbitrary object) with the question about what we can do in the capsule destructor. The capsule destructor executes much later - clearing the interpreter state dict is one of the last things done during finalization. We should not run arbitrary Python code there. But we're not: I'm just proposing we have a C++ function that sets a C++ flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're lightyears ahead of me with your understanding of the cpython internals, especially threading and cleanup mechanics. Thanks for the explanation!
|
||
def test_shared(val, ctor, loader): | ||
obj = ctor(val) | ||
with suppress(AttributeError): # non-cpython VMs don't have getrefcount |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@msimacek I just looked:
$ cd tests/
$ git grep 'Cannot reliably trigger GC' | wc -l
48
Do you think we could enable more test coverage for GraalPy by adopting Joshua's approach here more widely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem of gc.collect() not reliably GCing things is broader than the problem of not having refcounts. PyPy doesn't have refcounts but its gc.collect() works (if you call it a few times).
BTW: This job is hanging: CI / 🐍 (macos-latest, 3.14t, -DCMAKE_CXX_STANDARD=20) / 🧪 (pull_request) I've seen this several times before. It's definitely not related to this PR. I usually cancel and rerun. |
Description
There are a few ways for pybind11 to perform a from-Python cast that yields a pointer to the C++ object through other than the usual mechanism: loading an instance of a different module's module-local type, loading through the conduit mechanism, and more upcoming in #5800. These have in common that they currently ignore any holders, so trying to load a
std::shared_ptr<T>
for such a T will appear to succeed but will produce a default-constructedshared_ptr
. Solve this issue by adding a new methodset_foreign_holder()
which should be implemented by subclasses oftype_caster_base
(such ascopyable_holder_caster
) and will be called after the base caster has successfully loaded the value pointer for these foreign-sourced loads.The provided implementation of
set_foreign_holder
supports only loads toshared_ptr<T>
, and it does so using a different approach than native loads: since the true holder is not available, we create a newshared_ptr
that manages a reference to the Python object. (If the object supportsenable_shared_from_this
, then we use that to locate the true holder and take another reference to it.)Suggested changelog entry:
When loading an instance of a pybind11 class
T
whose type info is not available to the module that's loading it -- i.e., if it's a module-local type defined in a different module, or a type defined by a different version of pybind11 and used through the conduit mechanism -- pybind11 can now populate astd::shared_ptr<T>
, in addition to the previous support for a rawT*
. Note that unlessT
implementsenable_shared_from_this
, the resultingshared_ptr
will own a new reference to the underlying Python object, rather than sharing ownership with the C++ object inside the Python object.📚 Documentation preview 📚: https://pybind11--5862.org.readthedocs.build/