Skip to content

thread_local not supported; unsafe to use c++ exceptions in multi-core setting #291

@geurtv

Description

@geurtv
Contributor

pico-sdk doesn't support thread-local storage (thread_local keyword), which is required for c++ exceptions to function properly in a multi-core environment. Internally libstdc++ stores some global exception state in a thread_local variable (static __thread abi::__cxa_eh_globals global; in eh_globals.cc).

If now both RP2040 cores throw or catch exceptions at the same time they are "mixed", because the thread_local state variable is the exact same variable on both cores. There are also no synchronization locks, so both cores can write to the same variable at the same time.

The code below shows how core0's std::current_exception() is overwritten by core1. The expected output is:

[core0] core0 exception
[core1] core1 exception
[core0] core0 exception
[core1] core1 exception

The actual output is:

[core0] core0 exception
[core1] core1 exception
[core0] core1 exception
[core1] core1 exception
#include <cstdio>
#include <stdexcept>
#include <pico/multicore.h>
#include <pico/stdio_usb.h>

void continue_with_other_core(semaphore_t& acquire, semaphore_t& release)
{
    sem_release(&release);
    sem_acquire_blocking(&acquire);
}

void print_current_exception()
{
    try {
        std::rethrow_exception(std::current_exception());
    }
    catch (const std::exception& e) {
        std::printf("[core%d] %s\n", get_core_num(), e.what());
    }
}

void run_test(semaphore_t& acquire, semaphore_t& release)
{
    try {
        sem_acquire_blocking(&acquire);
        throw std::runtime_error("core" + std::to_string(get_core_num()) + " exception");
    }
    catch (...) {
        print_current_exception();
        continue_with_other_core(acquire, release);
        print_current_exception();
        continue_with_other_core(acquire, release);
    }
}

int main()
{
    stdio_usb_init();
    std::getchar();
    std::puts("[start]");

    static semaphore_t sem0, sem1;
    sem_init(&sem0, 1, 1); // unlocked: start with core0
    sem_init(&sem1, 0, 1);

    multicore_launch_core1([] { run_test(sem1, sem0); });
    run_test(sem0, sem1);

    for (;;);
}

Activity

self-assigned this
on Mar 31, 2021
added this to the 1.3.0 milestone on May 31, 2021
geurtv

geurtv commented on Jul 27, 2021

@geurtv
ContributorAuthor

Wrong analysis: static __thread abi::__cxa_eh_globals global is only used when _GLIBCXX_HAVE_TLS is defined. In our case the "normal" static __cxa_eh_globals eh_globals is actually used (by both cores), so obviously a c++ exception on one core then overwrites an active exception on the other core.

By the way, adding thread-local storage support is relatively easy: provide a custom implementation of __emutls_get_address and then use -Wl,--wrap= to use it. Then thread_local works just fine, but as stated that doesn't solve the issue.

geurtv

geurtv commented on Jul 27, 2021

@geurtv
ContributorAuthor

A solution could be to wrap the functions __cxa_get_globals and __cxa_get_globals_fast. One potential problem: the pointer they return is to a libstdc++-internal structure (__cxa_eh_globals), so different GCC versions can have different definitions for it.

I have a proof of principle for gcc-arm-none-eabi-10-2020-q4-major at:

https://github.com/geurtv/pico-sdk/tree/develop

file:
https://github.com/geurtv/pico-sdk/blob/develop/src/rp2_common/pico_standard_link/wrap_eh_globals.cpp

commit:
geurtv@f869eb6

kilograham

kilograham commented on Jul 27, 2021

@kilograham
Contributor

yeah, adding __thread support makes sense (but then reentrant support for lib functions makes sense which probably suggests we should build newlib ourselves with different options compared to the one shipped with gcc-arm-eabi-none)

is __cxa_get_globals in newlib or gcc libs? in any case, yes it would make sense for C++ exceptions to work on both cores (you are the first to notice that they don't... although in fairness we have them turned off by default).

geurtv

geurtv commented on Jul 27, 2021

@geurtv
ContributorAuthor

It's all gcc as far as I can tell. __cxa_get_globals and __cxa_get_globals_fast are both in libsup++, __emutls_get_address (for thread-local storage) is in libgcc.

geurtv

geurtv commented on Jul 27, 2021

@geurtv
ContributorAuthor

FYI: I've also just added a wrapper for __emutls_get_address. Not sure how complete the implementation is, but thread-local storage seems to work now.

https://github.com/geurtv/pico-sdk/blob/develop/src/rp2_common/pico_standard_link/wrap_emutls.c

alastairpatrick

alastairpatrick commented on Sep 9, 2022

@alastairpatrick
Contributor

Would be good if this integrated well with an RTOS so that similar problems don't happen when there are multiple tasks running on one core. For example, FreeRTOS has vTaskSetThreadLocalStoragePointer() and pvTaskGetThreadLocalStoragePointer().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Relationships

None yet

    Development

    No branches or pull requests

      Participants

      @kilograham@alastairpatrick@geurtv

      Issue actions

        thread_local not supported; unsafe to use c++ exceptions in multi-core setting · Issue #291 · raspberrypi/pico-sdk