Make assertions thread-safe #2948


Closed

Conversation

jeremy-rifkin
Contributor

Description

This PR makes Catch2's assertion and logging macros thread-safe, allowing threads spawned in test cases to perform checks. It does this by introducing a global lock and locking at the entry points called from these macros. This was easier and safer than trying to track down every use of static storage in the library and locking only around those uses; however, at some point that finer-grained locking might be desirable.

A std::recursive_mutex is used for the global lock. There is some overhead associated with taking the lock, but it is extremely minimal in the uncontended (single-threaded) case. In a local benchmark of a debug build on Linux, the overhead of the lock was around 300 ns on my machine, meaning a user would have to perform over three million assertions to add a single second of run-time overhead to their program.

benchmark name                       samples       iterations    est run time
                                     mean          low mean      high mean
                                     std dev       low std dev   high std dev
-------------------------------------------------------------------------------
no lock                                        100          1604     5.9348 ms 
                                        37.1058 ns     36.915 ns    37.4472 ns 
                                        1.26649 ns    0.80055 ns    1.98121 ns 
                                                                               
with std::recursive_mutex                      100           203     6.0088 ms 
                                        303.559 ns    299.777 ns    313.273 ns 
                                        28.6615 ns    11.6045 ns    56.3121 ns 

On release the overhead is far smaller:

benchmark name                       samples       iterations    est run time
                                     mean          low mean      high mean
                                     std dev       low std dev   high std dev
-------------------------------------------------------------------------------
no lock                                        100         20405     2.0405 ms 
                                        1.29613 ns    1.28875 ns    1.31015 ns 
                                      0.0501252 ns  0.0300847 ns  0.0787845 ns 
                                                                               
with std::recursive_mutex                      100           226      2.712 ms 
                                        137.018 ns    128.907 ns    150.877 ns 
                                        53.1962 ns    37.1166 ns    85.6079 ns 

MSVC debug:

benchmark name                       samples       iterations    est run time
                                     mean          low mean      high mean
                                     std dev       low std dev   high std dev
-------------------------------------------------------------------------------
no lock                                        100          6510      1.302 ms 
                                        2.07343 ns     2.0573 ns    2.10123 ns 
                                       0.105615 ns  0.0698496 ns    0.14992 ns

with std::recursive_mutex                      100           567     1.5309 ms 
                                        25.3474 ns    24.8342 ns    27.2275 ns 
                                        4.45937 ns    1.13563 ns    10.2695 ns

If this impact is deemed too high, I have ideas for reducing the overhead in debug mode and for optimizing the uncontended case.

GitHub Issues

#99
#246
#875
#1043
#1169
#1252
#1302
#1714
#1904
#2641
And probably others

codecov bot commented Jan 18, 2025

Codecov Report

Attention: Patch coverage is 96.66667% with 1 line in your changes missing coverage. Please review.

Project coverage is 90.96%. Comparing base (f7968e9) to head (221d5ef).
Report is 20 commits behind head on devel.

Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #2948      +/-   ##
==========================================
- Coverage   90.98%   90.96%   -0.02%     
==========================================
  Files         199      201       +2     
  Lines        8612     8635      +23     
==========================================
+ Hits         7835     7854      +19     
- Misses        777      781       +4     

evaleev added a commit to ValeevGroup/SeQuant that referenced this pull request Feb 21, 2025
…y of Catch2 macros

TODO: revert this ugliness when catchorg/Catch2#2948 is merged
@jeremy-rifkin
Contributor Author

Hi @shahsb, I am awaiting further feedback from the maintainer I have been in contact with regarding this PR over Discord. I'm a bit confused by your reviews; are you involved with the Catch2 project?

@RedBeard0531 RedBeard0531 left a comment

This cannot possibly be sufficient. As just one missing case:

#define INTERNAL_CATCH_IF( macroName, resultDisposition, ... ) \
    INTERNAL_CATCH_TEST( macroName, resultDisposition, __VA_ARGS__ ); \
    if( Catch::getResultCapture().lastAssertionPassed() )

It may be better to have lastAssertionPassed (or possibly the whole ResultCapture) be thread-local.

    m_resultCapture.notifyAssertionStarted( m_assertionInfo );
}

AssertionHandler::~AssertionHandler() {
    auto lock = take_global_lock();

Can this be moved inside the if or is m_completed written to from other threads?

Member

The handler is on stack, so multiple threads shouldn't be writing to it.

@jeremy-rifkin
Contributor Author

Thanks @RedBeard0531, good point. Catch2 has a ton of internal state; this is hard to track down :)

It does look like it might be sufficient to make m_lastAssertionPassed thread-local. I'm also noticing there are some counters, like m_totals.assertions.passed, which should be made atomic.

@@ -480,6 +484,7 @@ void ConsoleReporter::benchmarkPreparing( StringRef name ) {
}

void ConsoleReporter::benchmarkStarting(BenchmarkInfo const& info) {
    auto lock = take_global_lock();
Member

Locking in ConsoleReporter is suspect, because nothing enforces that users use it. It should be done further up the call stack, e.g. in run_context.

void AssertionHandler::handleExpr( ITransientExpression const& expr ) {
    auto lock = take_global_lock();
Member

All of these should live in the capture, because it already handles things like the assertion fast path when no reporters have to be notified about a passing assertion.

Speaking of which, the current fast path for assertions is a simple counter increment without invoking the reporter. We should be able to have a similar fast path (with an atomic counter) after this.
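The fast/slow path split described here could look roughly like this; a standalone sketch under assumed names, not Catch2's real code:

```cpp
#include <atomic>
#include <mutex>

namespace fastpath_sketch {
    std::atomic<unsigned long> fast_path_passes{0};
    std::mutex slow_path_lock;
    unsigned long reported = 0;  // guarded by slow_path_lock

    void handle_assertion( bool passed, bool reporter_wants_passing ) {
        // Fast path: a passing assertion that no reporter needs to see
        // only bumps a lock-free atomic counter.
        if ( passed && !reporter_wants_passing ) {
            fast_path_passes.fetch_add( 1 );
            return;
        }
        // Slow path: failures (and passes the reporter asked for) take
        // the lock and go through full reporting.
        std::lock_guard<std::mutex> lock( slow_path_lock );
        ++reported;
    }
}
```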

@horenmar horenmar force-pushed the jr/thread-safe-assertions branch from 58137e5 to 221d5ef Compare July 16, 2025 12:34
@horenmar
Member

Update on this: I started making some changes, realized that a bigger refactoring was needed, and started doing that in the main repo instead.

The good part is that the refactoring improved the runtime of single-threaded assertions to the point where current devel is almost 2x faster than the last release at handling passing assertions that do not need to be reported and can thus take the fast path. (The slow path has seen a ~5-10% improvement.)

The bad part is that I also implemented thread-safe assertions (messages not supported, for simplicity) in the branch devel-thread-experiments. The design should be correct, and so far the testing bears this out, but it introduces an over-40% slowdown compared to current devel.

The final result will probably be that the thread safe assertions will live behind a compile-time configuration option, because the perf cost is too big.
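Such a compile-time opt-in could be shaped like the sketch below. The macro name CATCH_CONFIG_THREAD_SAFE_ASSERTIONS is purely hypothetical, not the actual option name: when the option is unset, the "lock" compiles down to nothing, so single-threaded users pay no cost.

```cpp
#include <mutex>

#if defined( CATCH_CONFIG_THREAD_SAFE_ASSERTIONS )
    // Opt-in build: assertion entry points take a real lock.
    inline std::mutex& assertionMutex() {
        static std::mutex m;
        return m;
    }
    #define ASSERTION_LOCK() std::lock_guard<std::mutex> catchLock( assertionMutex() )
#else
    // Default build: the macro expands to a no-op, zero overhead.
    #define ASSERTION_LOCK() do { } while ( false )
#endif
```

Either way, call sites simply write `ASSERTION_LOCK();` at the top of each entry point, and the build-time option decides whether that means anything.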

horenmar added a commit that referenced this pull request Jul 23, 2025
All the previous refactoring to make the assertion fast paths
smaller and faster also allows us to implement the fast paths
just with thread-local and atomic variables, without full mutexes.

However, the performance overhead of thread-safe assertions is
still significant for single threaded usage:

|  slowdown |  Debug  | Release |
|-----------|--------:|--------:|
| fast path |   1.04x |   1.43x |
| slow path |   1.16x |   1.22x |

Thus, we don't make the assertions thread-safe by default, and instead
provide a build-time configuration option that the users can set to get
thread-safe assertions.

This commit is functional, but it still needs some follow-up work:
 * We do not need full seq_cst increments for the atomic counters,
   and using weaker ones can be faster.
 * We brute-force updating the reporter-friendly totals from internal
   atomic counters by doing it everywhere. We should properly trace
   where this is needed instead.
 * Message macros (`INFO`, `UNSCOPED_INFO`, `CAPTURE`, etc) are not
   made thread safe in this commit, but they can be made thread safe
   in the future, by building on top of this work.
 * Add more tests, including with thread-sanitizer, and compiled
   examples to the repository. Right now, these changes have been
   compiled with tsan manually, but these tests are not added to CI.

Closes #2948
@horenmar horenmar closed this in 2a8a8a7 Jul 24, 2025