Fix GrpcWorkerChannel.StartWorkerProcessAsync timeout #10937

Open
wants to merge 9 commits into
base: dev
from

Conversation

jviau
Contributor

@jviau jviau commented Mar 21, 2025

Issue describing the changes in this PR

resolves #issue_for_this_pr

Pull request checklist

IMPORTANT: Currently, changes must be backported to the in-proc branch to be included in Core Tools and non-Flex deployments.

  • Backporting to the in-proc branch is not required
    • Otherwise: Link to backporting PR
  • My changes do not require documentation changes
    • Otherwise: Documentation issue linked to PR
  • My changes should not be added to the release notes for the next release
    • Otherwise: I've added my notes to release_notes.md -- TODO
  • My changes do not need to be backported to a previous version
    • Otherwise: Backport tracked by issue/PR #issue_or_pr
  • My changes do not require diagnostic events changes
    • Otherwise: I have added/updated all related diagnostic events and their documentation (Documentation issue linked to PR)
  • I have added all required tests (Unit tests, E2E tests) -- TODO

Additional information

This PR improves the ScriptHost startup experience with a bad worker. Today, if a worker crashes or exits immediately after startup, GrpcWorkerChannel.StartWorkerProcessAsync blocks on _workerInitTask.Task until it eventually times out. This tends to fault the entire host (at least during debugging).

To address this, WorkerProcess.WaitForExitAsync is added, and GrpcWorkerChannel.StartWorkerProcessAsync now also waits on it. This makes the host more responsive to a worker exiting before connecting gRPC events.

@jviau jviau requested a review from a team as a code owner March 21, 2025 21:25
await _rpcWorkerProcess.StartProcessAsync();
_state = _state | RpcWorkerChannelState.Initializing;
await _workerInitTask.Task;
await _rpcWorkerProcess.StartProcessAsync(cancellationToken);
Contributor Author

This is the important change: we now wait for either the worker to fully initialize (gRPC connection established) or the worker to exit (in which case we re-throw any failures the worker experienced).
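
For readers following along, the "wait on either" behavior described here could be sketched roughly as below. This is not the code from the PR: `WorkerStartupSketch`, `WaitForInitOrExitAsync`, and the two task parameters are hypothetical stand-ins for `_workerInitTask.Task` and the new `WorkerProcess.WaitForExitAsync`, and treating a clean exit before initialization as a failure is an assumption. It relies only on `Task.WhenAny` and `Task.WaitAsync` (.NET 6 or later).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WorkerStartupSketch
{
    // Waits for whichever happens first: the worker finishes initializing,
    // or the worker process exits. Both tasks are hypothetical stand-ins for
    // _workerInitTask.Task and WorkerProcess.WaitForExitAsync().
    public static async Task WaitForInitOrExitAsync(
        Task workerInitialized,
        Task workerExited,
        CancellationToken cancellationToken = default)
    {
        Task first = await Task.WhenAny(workerInitialized, workerExited)
            .WaitAsync(cancellationToken);

        // Awaiting the completed task re-throws its exception if it faulted.
        await first;

        if (first == workerExited)
        {
            // Assumption: a clean exit before initialization still counts
            // as a startup failure.
            throw new InvalidOperationException(
                "Worker process exited before initialization completed.");
        }
    }

    public static async Task Main()
    {
        var init = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
        var exit = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);

        // Simulate a worker that crashes before the gRPC connection is established.
        exit.SetException(new InvalidOperationException("worker crashed"));

        try
        {
            await WaitForInitOrExitAsync(init.Task, exit.Task);
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine($"Startup failed fast: {ex.Message}");
        }
    }
}
```

In this sketch the initialization task never completes, yet the caller returns as soon as the exit task faults instead of blocking until a timeout.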

Member

@brettsam brettsam left a comment

looks good -- would just like a test added
