[Chore][Master] Quietly exit WorkerGroupDispatcher loop on interrupt #18240

Merged
ruanwenjun merged 3 commits into apache:dev from ruanwenjun:fix-worker-group-dispatcher-shutdown-noise on May 10, 2026

Conversation

@ruanwenjun
Member

Was this PR generated or assisted by AI?

YES, ops 4.7

Purpose of the pull request

Brief change log

WorkerGroupDispatcher#run called TaskDispatchableEventBus#take(), which was annotated with @SneakyThrows. When the master shuts down while the dispatch thread is parked on the queue, the resulting InterruptedException was rethrown as a RuntimeException and surfaced with a full stack trace: alarming "thread died" noise during a perfectly graceful shutdown.

Drop @SneakyThrows from take() so it declares InterruptedException, and let the dispatch loop catch it: re-set the interrupt flag, log a single info line, and return so the daemon thread exits cleanly.
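The interrupt handling described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SketchEventBus`, `SketchDispatcher`, and `InterruptDemo` are hypothetical stand-ins for the real classes, and `System.out.println` stands in for the single info-level log line.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for TaskDispatchableEventBus; not the real class.
class SketchEventBus<E> {
    private final BlockingQueue<E> queue = new LinkedBlockingQueue<>();

    void add(E event) {
        queue.add(event);
    }

    // No @SneakyThrows: the checked exception is part of the signature,
    // so the caller has to decide what an interrupt means.
    E take() throws InterruptedException {
        return queue.take();
    }
}

// Hypothetical stand-in for the WorkerGroupDispatcher loop.
class SketchDispatcher extends Thread {
    private final SketchEventBus<String> eventBus;
    volatile boolean exitedOnInterrupt = false;

    SketchDispatcher(SketchEventBus<String> eventBus) {
        super("sketch-dispatcher");
        this.eventBus = eventBus;
        setDaemon(true);
    }

    @Override
    public void run() {
        while (true) {
            try {
                // Parks here while the queue is empty.
                String event = eventBus.take();
                System.out.println("dispatching " + event);
            } catch (InterruptedException e) {
                // Restore the flag so code further up the stack can still see it.
                Thread.currentThread().interrupt();
                exitedOnInterrupt = true;
                // One quiet line instead of a stack trace.
                System.out.println(getName() + " interrupted, exiting dispatch loop");
                return;
            }
        }
    }
}

// Hypothetical driver: interrupts the dispatcher the way a shutdown would.
class InterruptDemo {
    static boolean stopsQuietly() {
        SketchEventBus<String> bus = new SketchEventBus<>();
        SketchDispatcher dispatcher = new SketchDispatcher(bus);
        dispatcher.start();
        bus.add("task-1");
        dispatcher.interrupt();
        try {
            dispatcher.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !dispatcher.isAlive() && dispatcher.exitedOnInterrupt;
    }
}
```

Declaring `InterruptedException` on `take()` forces the loop to handle shutdown explicitly, which is what lets it return without an uncaught exception reaching the thread's default handler.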

Also clamp the dispatch-retry waiting time to >= 1s so a freshly-counted failure does not immediately re-enqueue the task against the same unhealthy worker group.
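A minimal sketch of such a clamp is shown below. How the PR computes the raw waiting time is not shown here, so `RetryWaitSketch`, `clampRetryWaitMillis`, and the `computedWaitMillis` parameter are hypothetical names; only the 1-second lower bound comes from the description.

```java
// Hypothetical helper; the real change lives in the dispatch-retry logic of the master module.
class RetryWaitSketch {
    // Lower bound from the PR description: wait at least 1 second before retrying.
    static final long MIN_RETRY_WAIT_MILLIS = 1_000L;

    // Returns the computed wait unless it is below the minimum,
    // in which case the minimum is used instead.
    static long clampRetryWaitMillis(long computedWaitMillis) {
        return Math.max(MIN_RETRY_WAIT_MILLIS, computedWaitMillis);
    }
}
```

With this shape, a first failure whose computed wait is 0 ms still waits a full second, while larger computed waits pass through unchanged.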

In addition, document how to run dolphinscheduler-master tests in the module's CLAUDE.md: no Docker required, watch out for stale JaCoCo classes, surefire forks 4 JVMs in parallel, and the trailing "kill self fork JVM ... 30 seconds after System.exit(0)" line is a harmless warning.

Verify this pull request

This pull request is code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(or)

Pull Request Notice

If your pull request contains incompatible change, you should also add it to docs/docs/en/guide/upgrade/incompatible.md

Member

@SbloodyS SbloodyS left a comment

LGTM

@github-actions github-actions Bot removed the test label May 10, 2026
@ruanwenjun ruanwenjun merged commit 86c6844 into apache:dev May 10, 2026
122 of 123 checks passed
@ruanwenjun ruanwenjun deleted the fix-worker-group-dispatcher-shutdown-noise branch May 10, 2026 04:38