Skip to content

Conversation

kernel-patches-daemon-bpf-rc[bot]
Copy link

Pull request for series with
subject: bpf: Avoid RCU context warning when unpinning htab with internal structs
version: 3
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1009332

@kernel-patches-daemon-bpf-rc
Copy link
Author

Upstream branch: 23f3770
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1009332
version: 3

When unpinning a BPF hash table (htab or htab_lru) that contains internal
structures (timer, workqueue, or task_work) in its values, a BUG warning
is triggered:
 BUG: sleeping function called from invalid context at kernel/bpf/hashtab.c:244
 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: ksoftirqd/0
 ...

The issue arises from the interaction between BPF object unpinning and
RCU callback mechanisms:
1. BPF object unpinning uses ->free_inode() which schedules cleanup via
   call_rcu(), deferring the actual freeing to an RCU callback that
   executes within the RCU_SOFTIRQ context.
2. During cleanup of hash tables containing internal structures,
   htab_map_free_internal_structs() is invoked, which includes
   cond_resched() or cond_resched_rcu() calls to yield the CPU during
   potentially long operations.

However, cond_resched() or cond_resched_rcu() cannot be safely called from
atomic RCU softirq context, leading to the BUG warning when attempting
to reschedule.

Fix this by changing from ->free_inode() to ->destroy_inode() and rename
bpf_free_inode() to bpf_destroy_inode() for BPF objects (prog, map, link).
This allows direct inode freeing without RCU callback scheduling,
avoiding the invalid context warning.

Reported-by: Le Chen <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Fixes: 6813466 ("bpf: Add map side support for bpf timers.")
Suggested-by: Alexei Starovoitov <[email protected]>
Signed-off-by: KaFai Wan <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Add test to verify that unpinning hash tables containing internal timer
structures does not trigger context warnings.

Each subtest (timer_prealloc and timer_no_prealloc) can trigger the
context warning when unpinning, but the warning cannot be triggered
twice within a short time interval (a HZ), which is expected behavior.

Signed-off-by: KaFai Wan <[email protected]>
Acked-by: Yonghong Song <[email protected]>
@kernel-patches-daemon-bpf-rc
Copy link
Author

Upstream branch: 23f3770
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1009332
version: 3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant