Commit 36e4f18

Fix taskq NULL pointer dereference on timer race

Remove unsafe timer_pending() check in taskq_cancel_id() that created
a race where:
- Timer expires and timer_pending() returns FALSE
- task_done() frees task with tqent_func = NULL
- Timer callback executes and queues freed task
- Worker thread crashes executing NULL function

Always call timer_delete_sync() unconditionally to ensure timer
callback completes before task is freed.

Reliably reproducible by injecting mdelay(10) after setting CANCEL
flag to widen the race window, combined with frequent task
cancellations (e.g., snapshot automount expiry).

Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ameer Hamza <[email protected]>
Closes #17942

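The race described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `model_*` name is hypothetical and none of it is SPL or kernel code. The timer is reduced to three states, and the "dequeued but callback not finished" window is represented by `MODEL_TIMER_FIRING`. The late callback is replayed inline after the free so the outcome is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum model_timer_state {
	MODEL_TIMER_IDLE,	/* never armed, or callback finished */
	MODEL_TIMER_QUEUED,	/* armed and still on the timer wheel */
	MODEL_TIMER_FIRING	/* dequeued, callback not yet finished */
};

typedef void (*model_func_t)(void);

struct model_task {
	enum model_timer_state timer;
	bool cancel;		/* stands in for the CANCEL flag */
	bool on_prio_list;	/* set when the callback queues the task */
	model_func_t func;	/* stands in for tqent_func */
};

static void model_noop(void) {}

/* Mirrors the pending check: true only while still on the wheel. */
static bool model_timer_pending(const struct model_task *t)
{
	return t->timer == MODEL_TIMER_QUEUED;
}

/* Timer callback: queue the task unless it was cancelled. */
static void model_task_expire(struct model_task *t)
{
	if (!t->cancel)
		t->on_prio_list = true;
	t->timer = MODEL_TIMER_IDLE;
}

/* Synchronous delete: on return, no callback is in flight. */
static void model_timer_delete_sync(struct model_task *t)
{
	if (t->timer == MODEL_TIMER_FIRING)
		model_task_expire(t);	/* "wait" for the in-flight callback */
	else
		t->timer = MODEL_TIMER_IDLE;
}

/* Freeing the task clears the function pointer and the flags. */
static void model_task_done(struct model_task *t)
{
	t->func = NULL;
	t->cancel = false;
}

/*
 * Cancel a task whose timer has already been dequeued.
 * Returns true if a freed task (func == NULL) ended up queued.
 */
static bool model_cancel(bool check_pending)
{
	struct model_task t = { MODEL_TIMER_FIRING, false, false, model_noop };

	t.cancel = true;
	if (!check_pending || model_timer_pending(&t))
		model_timer_delete_sync(&t);
	model_task_done(&t);

	/* A callback that was skipped above runs late, after the free. */
	if (t.timer == MODEL_TIMER_FIRING)
		model_task_expire(&t);

	return t.on_prio_list && t.func == NULL;
}
```

In this model, `model_cancel(true)` (the old guarded path) leaves a freed task on the priority list, while `model_cancel(false)` (the unconditional path) does not.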
1 parent 71609a9 commit 36e4f18

1 file changed: +24 -7 lines changed

module/os/linux/spl/spl-taskq.c

Lines changed: 24 additions & 7 deletions
@@ -633,14 +633,31 @@ taskq_cancel_id(taskq_t *tq, taskqid_t id)
 
 		/*
 		 * The task_expire() function takes the tq->tq_lock so drop
-		 * drop the lock before synchronously cancelling the timer.
+		 * the lock before synchronously cancelling the timer.
+		 *
+		 * Always call timer_delete_sync() unconditionally. A
+		 * timer_pending() check would be insufficient and unsafe.
+		 * When a timer expires, it is immediately dequeued from the
+		 * timer wheel (timer_pending() returns FALSE), but the
+		 * callback (task_expire) may not run until later.
+		 *
+		 * The race window:
+		 * 1) Timer expires and is dequeued - timer_pending() now
+		 *    returns FALSE
+		 * 2) task_done() is called below, freeing the task, sets
+		 *    tqent_func = NULL and clears flags including CANCEL
+		 * 3) Timer callback finally runs, sees no CANCEL flag,
+		 *    queues task to prio_list
+		 * 4) Worker thread attempts to execute NULL tqent_func
+		 *    and panics
+		 *
+		 * timer_delete_sync() prevents this by ensuring the timer
+		 * callback completes before the task is freed.
 		 */
-		if (timer_pending(&t->tqent_timer)) {
-			spin_unlock_irqrestore(&tq->tq_lock, flags);
-			timer_delete_sync(&t->tqent_timer);
-			spin_lock_irqsave_nested(&tq->tq_lock, flags,
-			    tq->tq_lock_class);
-		}
+		spin_unlock_irqrestore(&tq->tq_lock, flags);
+		timer_delete_sync(&t->tqent_timer);
+		spin_lock_irqsave_nested(&tq->tq_lock, flags,
+		    tq->tq_lock_class);
 
 		if (!(t->tqent_flags & TQENT_FLAG_PREALLOC))
 			task_done(tq, t);
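
The comment in the hunk also explains why the lock is dropped around the synchronous cancel: the timer callback takes the same lock. The sketch below models that ordering in single-threaded userspace C. It is an assumption-laden illustration, not SPL code: all `sketch_*` names are hypothetical, the lock is a plain boolean, and a callback that would spin forever on a held lock is reported as a `false` return instead of actually hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; a bool stands in for the queue spinlock. */
struct sketch_queue {
	bool lock_held;
	bool callback_ran;
};

/*
 * Models the timer callback, which needs the queue lock.
 * Returns false when the lock is already held, i.e. when the
 * real callback would spin and never make progress.
 */
static bool sketch_expire(struct sketch_queue *q)
{
	if (q->lock_held)
		return false;
	q->lock_held = true;
	q->callback_ran = true;
	q->lock_held = false;
	return true;
}

/* Models a synchronous cancel that must let the callback finish. */
static bool sketch_delete_sync(struct sketch_queue *q)
{
	return sketch_expire(q);
}

/*
 * Models the cancel path, entered with the lock held.
 * drop_lock selects the unlock / cancel / relock ordering.
 * Returns false if the sequence would deadlock.
 */
static bool sketch_cancel(struct sketch_queue *q, bool drop_lock)
{
	bool ok;

	q->lock_held = true;		/* caller holds the lock on entry */
	if (drop_lock)
		q->lock_held = false;	/* release before waiting */
	ok = sketch_delete_sync(q);
	if (drop_lock)
		q->lock_held = true;	/* retake before freeing the task */
	q->lock_held = false;		/* caller's final unlock */
	return ok;
}
```

Calling `sketch_cancel()` with `drop_lock` set lets the modeled callback complete; calling it with the lock still held reports the would-be deadlock.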
