[BUG] xTaskNotifyWait & ulTaskNotifyTake perform a non-deterministic operation in a critical section. #612

@karver8

Describe the bug
The API functions xTaskNotifyWait() and ulTaskNotifyTake(), along with their indexed variants, all call prvAddCurrentTaskToDelayedList() from inside a critical section. That function in turn calls vListInsert() on pxOverflowDelayedTaskList or pxDelayedTaskList, depending on the current tick count (unless xTicksToWait equals portMAX_DELAY). As a result, the kernel walks a list from inside a critical section, which does not happen anywhere else in the library.
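
To illustrate why a sorted insert is the concern, here is a minimal standalone model of inserting into a list ordered by wake time. It is not the kernel's code: DemoItem, DemoList, demo_list_insert() and demo_steps_for_late_waker() are hypothetical names made up for this sketch, and the only assumption carried over is that the delayed list is kept sorted by wake tick. The point is that the number of items stepped over depends on how many tasks are already blocked with an earlier wake time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a kernel list item; NOT the real ListItem_t. */
typedef struct DemoItem {
    uint32_t wake_tick;        /* sort key: tick at which the task unblocks */
    struct DemoItem *next;
} DemoItem;

/* Circular list whose end marker holds the maximum key, so a walk
 * always terminates at the marker. */
typedef struct {
    DemoItem end;
} DemoList;

void demo_list_init(DemoList *list) {
    list->end.wake_tick = UINT32_MAX;
    list->end.next = &list->end;
}

/* Insert in ascending wake_tick order.
 * Returns the number of existing items that had to be stepped over. */
size_t demo_list_insert(DemoList *list, DemoItem *item) {
    assert(item->wake_tick != UINT32_MAX);  /* reserved for the end marker */
    DemoItem *it = &list->end;
    size_t steps = 0;
    while (it->next->wake_tick <= item->wake_tick) {
        it = it->next;
        steps++;
    }
    item->next = it->next;
    it->next = item;
    return steps;
}

#define DEMO_MAX_TASKS 64

/* Fill a list with n already-blocked tasks, then measure the cost of
 * blocking one more task that wakes later than all of them. */
size_t demo_steps_for_late_waker(size_t n) {
    static DemoItem items[DEMO_MAX_TASKS + 1];
    DemoList list;
    assert(n <= DEMO_MAX_TASKS);
    demo_list_init(&list);
    for (size_t i = 0; i < n; i++) {
        items[i].wake_tick = (uint32_t)(100 + i);
        (void)demo_list_insert(&list, &items[i]);
    }
    items[n].wake_tick = 10000;
    return demo_list_insert(&list, &items[n]);
}
```

In this model the walk length equals the number of tasks that wake earlier, so the time spent grows with application state rather than being a fixed bound. If the walk runs with interrupts masked, that variable time adds directly to interrupt latency.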

This appears to break the rule that "FreeRTOS never performs a non-deterministic operation, such as walking a linked list, from inside a critical section or interrupt." With so many eyes on this code base, I found it hard to believe this would have been missed, so I half expect there is a reason the rule does not apply here. If there is, I think it should be documented.

I searched quite a bit for information on this and eventually came across this post from 2018 raising the same concern, which gave me the confidence to file this bug report.

If this is in fact an error, there is a related issue with a comment inside the function vTaskPlaceOnEventList():

/* THIS FUNCTION MUST BE CALLED WITH EITHER INTERRUPTS DISABLED OR THE
 * SCHEDULER SUSPENDED AND THE QUEUE BEING ACCESSED LOCKED. */

However, this function calls vListInsert() and prvAddCurrentTaskToDelayedList(), which makes it non-deterministic and therefore unsuitable for calling with interrupts disabled. Nothing appears to call it with interrupts disabled at present. The comment should be changed to:

/* THIS FUNCTION MUST BE CALLED WITH THE SCHEDULER SUSPENDED AND THE QUEUE BEING ACCESSED LOCKED. */

This potentially impacts all systems that use task notifications with block times other than 0 and portMAX_DELAY and that rely on bounded interrupt latency. This issue may also be present in SafeRTOS.

Solution
This seems simple to fix by suspending the scheduler instead of using a critical section; many other parts of the kernel add tasks to the delayed task list in exactly this way.
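
As a rough illustration of that idea, the sketch below models one possible ordering: suspend the scheduler, use a short critical section only for the constant-time state check, and do the list insertion after the critical section has been exited. Everything here is a hypothetical stand-in written for this example (the mock_* functions, demo_notify_wait() and the flags); it is not FreeRTOS code and not a proposed patch. The mocks only record the context in which the insertion ran.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of this is FreeRTOS code. */
static int critical_nesting = 0;     /* > 0 models interrupts being masked   */
static int scheduler_suspended = 0;  /* > 0 models the scheduler suspended   */
static bool walked_in_critical = false;
static bool walked_with_scheduler_suspended = false;

static void mock_enter_critical(void) { critical_nesting++; }
static void mock_exit_critical(void)  { assert(critical_nesting > 0); critical_nesting--; }
static void mock_suspend_all(void)    { scheduler_suspended++; }
static void mock_resume_all(void)     { assert(scheduler_suspended > 0); scheduler_suspended--; }

/* Models the sorted (variable-time) insert into the delayed list and
 * records which kind of protection was active when it ran. */
static void mock_add_to_delayed_list(void) {
    if (critical_nesting > 0)    { walked_in_critical = true; }
    if (scheduler_suspended > 0) { walked_with_scheduler_suspended = true; }
}

bool demo_walked_in_critical(void)              { return walked_in_critical; }
bool demo_walked_with_scheduler_suspended(void) { return walked_with_scheduler_suspended; }

/* Returns true if the caller would block.
 * notification_pending models the notification state check. */
bool demo_notify_wait(bool notification_pending, unsigned ticks_to_wait) {
    bool should_block = false;

    mock_suspend_all();

    /* Only constant-time checks and state updates inside the critical section. */
    mock_enter_critical();
    if (!notification_pending && ticks_to_wait > 0U) {
        should_block = true;
    }
    mock_exit_critical();

    /* The variable-time list walk runs with interrupts enabled,
     * protected only by the suspended scheduler. */
    if (should_block) {
        mock_add_to_delayed_list();
    }

    mock_resume_all();
    return should_block;
}
```

A real change would also have to handle a notification arriving from an interrupt between leaving the critical section and completing the insertion, which this model does not attempt to cover.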

Labels: bug, help wanted
