Periodically restart watching DCP resources to avoid event interruptions #7336

Open
wants to merge 5 commits into
base: main

danegsta
Member

Description

It turns out that the Kubernetes client watch call can stop responding if left running long enough (15+ minutes). No cancellation is thrown; the watch connection simply stops receiving new resource state from DCP. A simple workaround is to restart the watch stream at a reasonable interval. This does result in duplicate state being received: when re-initialized, the watch stream returns the latest state even if Aspire has already processed it. Our event handling should be resilient enough to deal with this without issue.
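As an illustration of what "resilient to duplicate state" could mean, here is a minimal sketch of a handler that ignores a snapshot it has already applied. The `ResourceSnapshot` record and `DuplicateTolerantHandler` class are hypothetical names made up for this example, and keying on the Kubernetes-style resource version is an assumption; this is not how Aspire's event handling is actually written.

```csharp
using System.Collections.Generic;

// Hypothetical shape of a watched resource: a name plus a Kubernetes-style resource version.
internal sealed record ResourceSnapshot(string Name, string ResourceVersion);

// Hypothetical handler that applies a snapshot only if its version differs from the last one seen.
internal sealed class DuplicateTolerantHandler
{
    private readonly Dictionary<string, string> _lastSeenVersion = new();

    // Number of snapshots that were actually applied (duplicates are not counted).
    public int AppliedCount { get; private set; }

    // Returns true if the snapshot was applied, false if it was a replay of known state.
    public bool Handle(ResourceSnapshot snapshot)
    {
        if (_lastSeenVersion.TryGetValue(snapshot.Name, out var seen)
            && seen == snapshot.ResourceVersion)
        {
            // The restarted watch replayed state that has already been processed.
            return false;
        }

        _lastSeenVersion[snapshot.Name] = snapshot.ResourceVersion;
        AppliedCount++;
        return true;
    }
}
```

With a handler along these lines, the replayed state that arrives after each watch restart is a no-op rather than a second state transition.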

Fixes #7167

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
      • If yes, did you have an API Review for it?
        • Yes
        • No
      • Did you add <remarks /> and <code /> elements on your triple slash comments?
        • Yes
        • No
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
      • If yes, have you done a threat model and had a security review?
        • Yes
        • No
    • No
  • Does the change require an update in our Aspire docs?

Member

@karolz-ms karolz-ms left a comment

Looks great, with one caveat

public async Task CancellingMainTokenCancelsEnumerable()
{
    using var cts = new CancellationTokenSource();
    var enumerable = new PeriodicRestartAsyncEnumerable<int>(
        (cancellationToken) => Task.FromResult(RepeatingAsyncEnumerable(1, TimeSpan.FromMilliseconds(10), cancellationToken)),
        TimeSpan.FromMilliseconds(750));
Member

Consider how slow CI VMs can be. I would think seconds, not milliseconds.

/// we re-enable the watch.
/// </summary>
/// <typeparam name="T">The inner enumerated type</typeparam>
internal sealed class PeriodicRestartAsyncEnumerable<T> : IAsyncEnumerable<T>
Member

Is there any way to implement this as a single loop in a function instead? A generator I mean.
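For reference, one possible shape of such a single-loop generator is sketched below. It is not the PR's implementation: `PeriodicRestart` and `WithPeriodicRestart` are hypothetical names, the factory signature is assumed from the constructor call in the test above, and re-creating the stream when the inner sequence ends by itself is a choice made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

internal static class PeriodicRestart
{
    // Hypothetical helper: re-creates the inner stream every restartInterval
    // until the caller's token is cancelled.
    public static async IAsyncEnumerable<T> WithPeriodicRestart<T>(
        Func<CancellationToken, Task<IAsyncEnumerable<T>>> factory,
        TimeSpan restartInterval,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        while (!cancellationToken.IsCancellationRequested)
        {
            // Linked token: fires on caller cancellation or when the restart timer elapses.
            using var restartCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
            restartCts.CancelAfter(restartInterval);

            IAsyncEnumerator<T> enumerator;
            try
            {
                var inner = await factory(restartCts.Token).ConfigureAwait(false);
                enumerator = inner.GetAsyncEnumerator(restartCts.Token);
            }
            catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
            {
                continue; // the timer fired while the stream was still being created
            }

            await using (enumerator.ConfigureAwait(false))
            {
                while (true)
                {
                    T item;
                    try
                    {
                        if (!await enumerator.MoveNextAsync().ConfigureAwait(false))
                        {
                            break; // inner stream ended; this sketch starts a new one
                        }
                        item = enumerator.Current;
                    }
                    catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
                    {
                        break; // restart timer fired: drop this stream and open a new one
                    }

                    // Kept outside the try/catch because C# disallows yield inside a try that has a catch.
                    yield return item;
                }
            }
        }
    }
}
```

Cancellation of the caller's token is not swallowed: the exception filter only matches when the restart timer, rather than the caller, triggered the cancellation, so a consumer's `await foreach` still observes `OperationCanceledException` if the inner stream throws it.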

3 participants