Skip to content

Conversation

@nascheme
Copy link

@nascheme nascheme commented Oct 31, 2025

Summary of changes:

  • Add mechanism to set the Py_MOD_GIL_NOT_USED. It works similar to what pybind11 does: add a variadic macro with an optional flag so extensions can opt-in to free-threaded support
  • Replace borrow based APIs with strong reference ones (e.g. PyDict_GetItemRef).
  • Use a mutex to protect global state (e.g. the registry). To avoid deadlocks and keep the code simple, there is a single global recursive lock. We can refine this later (split it into the multiple locks, use critical sections, etc).
  • Fix references to ob_refcnt, don't use Py_REFCNT() as an l-value.
  • Add work-around for crash in ~object_base(). Calling Py_DECREF() with no thread-state is not allowed.

Building this requires a fix to faber. I found building Boost.Python very tricky (likely because I'm not familiar with b2 and faber). I'll share my build scripts in case they help someone else (need to tidy them first).

There is another branch that includes doctests converted to use pytest. I only did a subset of the tests (using Claude to assist in conversion). The goal is to use pytest-run-parallel and try to trigger data races or other issues.

@stefanseefeld
Copy link
Member

stefanseefeld commented Nov 6, 2025

@nascheme I have just pushed a new faber release (0.5.10) including your contributions. Let's see whether they help with this.
(But it seems this needs to be rebased first, to resolve some merge conflicts.)

@nascheme
Copy link
Author

nascheme commented Nov 7, 2025

Thanks for merging the faber changes. My latest push adds the nogil option for BOOST_PYTHON_MODULE_INIT. It could be simplified if we now require C+11 since there is quite a bit of code to support C+98. The code for this follows closely how pybind11 implements the corresponding option.

We might follow what's done by pyo3 in regard to this option and make the default to be true.

I've tested with a free-threaded Python build with thread-sanitizer (TSAN) enabled. Running the pytest unit tests using pytest-run-parallel generates no TSAN warnings. That definitely does not mean we are safe from data races but it gives me some re-assurance we have the right locks in place.

@nascheme nascheme marked this pull request as ready for review November 7, 2025 00:15
@nascheme
Copy link
Author

nascheme commented Nov 7, 2025

For the ~object_base() issue, I found that scipy ran into a similar kind of issue. The comment from Sam Gross:

It's not just a matter of a Py_INCREF to leak them, you don't want to call Py_DECREF from the destructors of variables with global lifetimes. Even if it doesn't trigger the destructor, the Py_DECREF after Python is finalized or without a valid thread state isn't safe, especially in debug builds or the free-threaded build.

The PyThreadState_GetUnchecked() == NULL check is my C programmer fix to this problem. A different option might be to use PyModules's m_free. To me, just leaking some memory here is okay since Python is shutting down anyhow.

@stefanseefeld
Copy link
Member

Thanks for all the context. The PR is still marked as "cannot be rebased due to conflicts". Can you retry rebasing it manually ? (In general, I prefer to rebase a feature branch repeatedly over merging the target branch into the feature branch, as the former allows to fast-forward merge at the end, while the latter doesn't.)

@nascheme
Copy link
Author

nascheme commented Nov 7, 2025

Okay. Are you planning to fast-forward merge without squashing? If so, I'll rebase and also cleanup the commits into more logical ones. If you don't squash them then they could be improved.

@stefanseefeld
Copy link
Member

I don't know yet. I need to understand the changes better, and, if the commits are self-contained and complete, it might actually be better to keep them separate to make it easier to understand (and possibly debug) them.

But at the very least, we need to make them conflict-free to be able to apply and test them.

For the free-threaded build, it is not safe use borrowed references.
Another thread could deallocate the object and cause the reference to
become invalid.  Replace API calls that borrow references with strong
reference APIs.
For the free-threaded build (and possibly the debug build), it is not
safe to call Py_DECREF() if there is no valid Python thread-state.
Add pymutex.hpp which implements a re-entrant mutex on top of Python's
PyMutex.  Add BOOST_PYTHON_LOCK_STATE() macro that uses RAII to lock
mutable global state as required.
This indicates that the free-threaded build of Python can keep the
GIL disabled when the module is loaded.  Without this module flag,
importing the module will cause the GIL to be re-enabled.  A warning
is emitted if this happens.
Implement optional arguments for BOOST_PYTHON_MODULE_INIT and allow the
boost::python::mod_gil_not_used() option.  This sets the
Py_MOD_GIL_NOT_USED flag for the extension module.

To define a module that supports free-threaded Python, define it
like this:

    BOOST_PYTHON_MODULE(my_module, boost::python::mod_gil_not_used())
    {
        ...
    }
@nascheme
Copy link
Author

I rebased the branch and cleaned up each commit. Merging this without squashing would be okay now, IMHO. If you want to split this into multiple PRs, let me know.

The changes are nearly all conditional on Py_GIL_DISABLED and so don't affect the GIL-enabled (default) version of Python. The added locking with PyMutex is conditional on Py_GIL_DISABLED as well and is a non-op in the default build.

@stefanseefeld
Copy link
Member

Thanks for the update. We should then also add at least one CI build variant that uses Py_GIL_DISABLED, to test the changes, shouldn't we ?

@nascheme
Copy link
Author

We should then also add at least one CI build variant that uses Py_GIL_DISABLED, to test the changes, shouldn't we ?

Hello @stefanseefeld, I think that's a good idea. I have a revised CI action script to test with 3.14t as well. If we continue with the matrix approach, the options are:

  • cxx: g++/clang++
  • std: c++11, c++14, c++17
  • python: 3.10, 3.11, 3.12, 3.13, 3.14, 3.14t

that makes a 36 combinations. Too many, I think.

A different idea I had was to have two Linux CI scripts, one to test compiler and language standard combos, with the options being:

  • cxx: g++/clang++
  • std: c++11, c++14, c++17
  • python: 3.14

Then, a second "python versions" CI script, suggested combos:

  • cxx: g++
  • std: c++11
  • python: 3.10, 3.11, 3.12, 3.13, 3.14, 3.14t

Together than makes a total of 12 runs. Another idea would be to set one of the CI script to be "on: workflow_dispatch" and just manually run it before a release. Since the previous CI was only testing with one Python version, maybe this is still overkill? What do you prefer?

@stefanseefeld
Copy link
Member

Hi @nascheme , thanks for following up ! I agree that a full ("dense") matrix is overkill, so we need to look into how to make it "sparse". (In terms of coverage I expect users of the new Python features will likely also use newer compilers and C++ standards, so it might be enough to test the new features with the most recent set of compilers & flags.)
I'll look a bit into different ways of generating build parameters with this yaml language, so perhaps there are ways to write "generators" to produce triplets (cxx, std, python), so we can produce sparse matrices within a single script.

Update scripts to use actions/setup-python to install different Python
versions.  Add run-faber.sh and get-py-env.py scripts.  Add
test-ubuntu-py-ver.yml CI script to test with different Python versions.
@nascheme
Copy link
Author

I've revised the CI workflows to use setup-python. Getting faber to use the correct include and linker flags took some work. Ideally I think faber could be a bit smarter to figure this out on its own (using python3-config).

I changed test-ubuntu to test with 3.14 only. I added test-ubuntu-py-vers which tests with a set of Python versions (3.10 to 3.14).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants