-
Notifications
You must be signed in to change notification settings - Fork 145
Bpf task work #9749
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Bpf task work #9749
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
ddfc9fa to
818631d
Compare
acd6664 to
fd06203
Compare
7a52608 to
326d428
Compare
fd06203 to
ae8a7ef
Compare
eddc6bf to
0ef7fc8
Compare
aeccac7 to
ae8a6a7
Compare
0ef7fc8 to
1bd67e0
Compare
Reduce code duplication in detection of the known special field types in map values. This refactoring helps to avoid copying a chunk of code in the next patch of the series. Signed-off-by: Mykyta Yatsenko <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Acked-by: Andrii Nakryiko <[email protected]>
Refactor the verifier by pulling the common logic from process_timer_func() into a dedicated helper. This allows reusing process_async_func() helper for verifying bpf_task_work struct in the next patch. Signed-off-by: Mykyta Yatsenko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Tested-by: [email protected]
Extract the cleanup of known embedded structs into the dedicated helper. Remove duplication and introduce a single source of truth for freeing special embedded structs in hashtab. Signed-off-by: Mykyta Yatsenko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Acked-by: Eduard Zingerman <[email protected]>
The verifier currently enforces a zero return value for all async callbacks—a constraint originally introduced for bpf_timer. That restriction is too narrow for other async use cases. Relax the rule by allowing non-zero return codes from async callbacks in general, while preserving the zero-return requirement for bpf_timer to maintain its existing semantics. Signed-off-by: Mykyta Yatsenko <[email protected]> Acked-by: Eduard Zingerman <[email protected]>
This patch adds necessary plumbing in verifier, syscall and maps to support handling new kfunc bpf_task_work_schedule and kernel structure bpf_task_work. The idea is similar to how we already handle bpf_wq and bpf_timer. verifier changes validate calls to bpf_task_work_schedule to make sure it is safe and expected invariants hold. btf part is required to detect bpf_task_work structure inside map value and store its offset, which will be used in the next patch to calculate key and value addresses. arraymap and hashtab changes are needed to handle freeing of the bpf_task_work: run code needed to deinitialize it, for example cancel task_work callback if possible. The use of bpf_task_work and proper implementation for kfuncs are introduced in the next patch. Signed-off-by: Mykyta Yatsenko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Acked-by: Eduard Zingerman <[email protected]>
Calculation of the BPF map key, given the pointer to a value is duplicated in a couple of places in helpers already, in the next patch another use case is introduced as well. This patch extracts that functionality into a separate function. Signed-off-by: Mykyta Yatsenko <[email protected]> Acked-by: Kumar Kartikeya Dwivedi <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Acked-by: Eduard Zingerman <[email protected]>
Implementation of the new bpf_task_work_schedule kfuncs, that let a BPF
program schedule task_work callbacks for a target task:
* bpf_task_work_schedule_signal() - schedules with TWA_SIGNAL
* bpf_task_work_schedule_resume() - schedules with TWA_RESUME
Each map value should embed a struct bpf_task_work, which the kernel
side pairs with struct bpf_task_work_kern, containing a pointer to
struct bpf_task_work_ctx, that maintains metadata relevant for the
concrete callback scheduling.
A small state machine and refcounting scheme ensures safe reuse and
teardown. State transitions:
_______________________________
| |
v |
[standby] ---> [pending] --> [scheduling] --> [scheduled]
^ |________________|_________
| |
| v
| [running]
|_______________________________________________________|
All states may transition into FREED state:
[pending] [scheduling] [scheduled] [running] [standby] -> [freed]
A FREED terminal state coordinates with map-value
deletion (bpf_task_work_cancel_and_free()).
Scheduling itself is deferred via irq_work to keep the kfunc callable
from NMI context.
Lifetime is guarded with refcount_t + RCU Tasks Trace.
Main components:
* struct bpf_task_work_context – Metadata and state management per task
work.
* enum bpf_task_work_state – A state machine to serialize work
scheduling and execution.
* bpf_task_work_schedule() – The central helper that initiates
scheduling.
* bpf_task_work_acquire_ctx() - Attempts to take ownership of the context,
pointed by passed struct bpf_task_work, allocates new context if none
exists yet.
* bpf_task_work_callback() – Invoked when the actual task_work runs.
* bpf_task_work_irq() – An intermediate step (runs in softirq context)
to enqueue task work.
* bpf_task_work_cancel_and_free() – Cleanup for deleted BPF map entries.
Flow of successful task work scheduling
1) bpf_task_work_schedule_* is called from BPF code.
2) Transition state from STANDBY to PENDING, mark context as owned by
this task work scheduler
3) irq_work_queue() schedules bpf_task_work_irq().
4) Transition state from PENDING to SCHEDULING (noop if transition
successful)
5) bpf_task_work_irq() attempts task_work_add(). If successful, state
transitions to SCHEDULED.
6) Task work calls bpf_task_work_callback(), which transition state to
RUNNING.
7) BPF callback is executed
8) Context is cleaned up, refcounts released, context state set back to
STANDBY.
Signed-off-by: Mykyta Yatsenko <[email protected]>
Reviewed-by: Andrii Nakryiko <[email protected]>
Reviewed-by: Eduard Zingerman <[email protected]>
Acked-by: Kumar Kartikeya Dwivedi <[email protected]>
Introducing selftests that check BPF task work scheduling mechanism. Validate that verifier does not accepts incorrect calls to bpf_task_work_schedule kfunc. Signed-off-by: Mykyta Yatsenko <[email protected]> Acked-by: Eduard Zingerman <[email protected]>
Add stress tests for BPF task-work scheduling kfuncs. The tests spawn multiple threads that concurrently schedule task_work callbacks against the same and different map values to exercise the kfuncs under high contention. Verify callbacks are reliably enqueued and executed with no drops. Signed-off-by: Mykyta Yatsenko <[email protected]>
ae8a6a7 to
c3b21d5
Compare
84ae2d5 to
04d481e
Compare
c199778 to
b0c73f0
Compare
|
Automatically cleaning up stale PR; feel free to reopen if needed |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.