feat: Implement scale-to-zero support for BudAIScaler #2
Add ability to scale workloads down to 0 replicas when there is no demand, and scale back up when demand is detected via external metrics.

Changes:
- Add ScaleToZeroConfig to CRD with enabled, activationScale, and gracePeriod fields
- Add ZeroDemandSince status field for grace period tracking
- Extend ScalingContext and ScalingContextProvider interfaces
- Update metrics collector to handle zero-pod scenarios (external sources work without pods)
- Update BudScaler and KPA algorithms to respect scale-to-zero config
- Add grace period logic to prevent premature scale-to-zero
- Add activation scale logic when waking from zero
- Track zero demand state in controller

Co-Authored-By: Claude Opus 4.5 <[email protected]>
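As a rough illustration of the grace period idea listed above, the following sketch models the three config fields and a check that only allows scale-to-zero once demand has been absent for the full grace period. The Go field types, the `GracePeriodElapsed` helper, and the example values are assumptions made for this sketch and are not taken from the PR's code.

```go
package main

import (
	"fmt"
	"time"
)

// ScaleToZeroConfig mirrors the three fields named in the commit message.
// The Go types chosen here are assumptions; the real CRD types may differ.
type ScaleToZeroConfig struct {
	Enabled         bool
	ActivationScale int32
	GracePeriod     time.Duration
}

// Illustrative values only, not defaults from the project.
var (
	exampleConfig = ScaleToZeroConfig{Enabled: true, ActivationScale: 2, GracePeriod: 5 * time.Minute}
	exampleStart  = time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC)
)

// GracePeriodElapsed (hypothetical helper) reports whether zero demand has
// lasted at least cfg.GracePeriod. It returns false when scale-to-zero is
// disabled or when no zero-demand timestamp has been recorded yet.
func GracePeriodElapsed(cfg ScaleToZeroConfig, zeroDemandSince *time.Time, now time.Time) bool {
	if !cfg.Enabled || zeroDemandSince == nil {
		return false
	}
	return now.Sub(*zeroDemandSince) >= cfg.GracePeriod
}

func main() {
	since := exampleStart
	// One minute into a five-minute grace period: keep the replicas.
	fmt.Println(GracePeriodElapsed(exampleConfig, &since, since.Add(1*time.Minute)))
	// Six minutes in: scaling to zero is now allowed.
	fmt.Println(GracePeriodElapsed(exampleConfig, &since, since.Add(6*time.Minute)))
}
```

Passing `now` as a parameter rather than reading the clock inside the helper keeps the check deterministic and easy to unit test.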
Summary of Changes

Hello @dittops, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the BudAIScaler by introducing comprehensive support for scaling services down to zero replicas. It extends the Custom Resource Definition (CRD) to provide configurable options for this behavior, such as a grace period before scaling down and an activation scale for efficient wake-up. The core scaling algorithms have been adapted to respect these new configurations, ensuring that services can dynamically adjust their replica count, including complete shutdown during periods of inactivity, and then scale back up effectively when demand reappears. The changes also refine metric collection to handle zero-pod states, making the autoscaler more robust and resource-efficient.
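To make the "activation scale" idea concrete, here is a minimal sketch of how a wake-up from zero might be raised to a minimum replica count. The function name `WakeUpReplicas` and the explicit "currently at zero" condition are assumptions for illustration; the PR's algorithms may gate this differently.

```go
package main

import "fmt"

// WakeUpReplicas (hypothetical helper) returns the replica count to use.
// When the workload is at zero and demand has returned, the result is raised
// to at least activationScale; otherwise the desired count is left unchanged.
func WakeUpReplicas(currentReplicas, desiredReplicas, activationScale int32) int32 {
	wakingFromZero := currentReplicas == 0 && desiredReplicas > 0
	if wakingFromZero && activationScale > desiredReplicas {
		return activationScale
	}
	return desiredReplicas
}

func main() {
	fmt.Println(WakeUpReplicas(0, 1, 2)) // waking from zero, raised to the activation scale
	fmt.Println(WakeUpReplicas(0, 4, 2)) // demand already exceeds the activation scale
	fmt.Println(WakeUpReplicas(3, 1, 2)) // not at zero, activation scale does not apply
}
```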
Code Review
This pull request introduces comprehensive scale-to-zero functionality for the BudAIScaler, allowing workloads to scale down to zero replicas and efficiently scale back up when demand returns. The changes include adding ScaleToZeroConfig to the CRD, tracking ZeroDemandSince in the status, and updating both BudScaler and KPA algorithms to incorporate grace periods and activation scales. The metrics collector has also been enhanced to handle zero-pod scenarios gracefully. The implementation is consistent across the codebase and includes appropriate CRD definitions, deepcopy logic, and context propagation. The test cases for scaling policies have been updated to reflect the new scale-to-zero capability, which is a good practice.
    activationScale := sctx.GetActivationScale()
    if activationScale > rec.DesiredReplicas {
        rec.DesiredReplicas = activationScale
        rec.Reason += fmt.Sprintf(" (activation scale: %d)", activationScale)
The reason string for activation scale is appended to the existing reason. While functional, consider if it would be clearer to construct a more descriptive reason that fully explains the decision, rather than just appending. For example, if the original reason was "Zero demand detected, starting grace period", appending " (activation scale: 2)" might be slightly less clear than a new reason like "Scaling from zero with activation scale: 2". This is a minor readability suggestion.
Suggested change:

    - rec.Reason += fmt.Sprintf(" (activation scale: %d)", activationScale)
    + rec.Reason = fmt.Sprintf("Scaling from zero with activation scale: %d (original reason: %s)", activationScale, rec.Reason)
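To compare the two reason formats side by side, the sketch below applies both format strings from the suggestion to the example reason quoted in the review comment. The helper names `appendedReason` and `rewrittenReason` are made up for this illustration.

```go
package main

import "fmt"

// appendedReason reproduces the original behavior: the activation scale is
// appended to whatever reason was already set.
func appendedReason(reason string, activationScale int32) string {
	return reason + fmt.Sprintf(" (activation scale: %d)", activationScale)
}

// rewrittenReason reproduces the suggested behavior: the activation scale
// leads the message and the original reason is kept in parentheses.
func rewrittenReason(reason string, activationScale int32) string {
	return fmt.Sprintf("Scaling from zero with activation scale: %d (original reason: %s)", activationScale, reason)
}

func main() {
	original := "Zero demand detected, starting grace period"
	fmt.Println(appendedReason(original, 2))
	fmt.Println(rewrittenReason(original, 2))
}
```

The rewritten form puts the deciding factor first, which tends to be easier to scan in events and logs, while still preserving the original reason for debugging.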
        rec.DesiredReplicas = activationScale
    }
Similar to the BudScaler algorithm, consider adding a more explicit reason for applying the activation scale. While the current implementation correctly sets the desired replicas, a clearer reason would improve observability and debugging for users trying to understand why their workload scaled to a specific number from zero.
Suggested change:

    - rec.DesiredReplicas = activationScale
    - }
    + rec.Reason = fmt.Sprintf("Scaling from zero with activation scale: %d", activationScale)
    if result.Recommendation != nil {
        if result.Recommendation.DesiredReplicas == 0 && result.CurrentReplicas > 0 {
            // Entering zero demand state
            if scaler.Status.ZeroDemandSince == nil {
                scaler.Status.ZeroDemandSince = &now
            }
        } else if result.Recommendation.DesiredReplicas > 0 {
            // Exiting zero demand state
            scaler.Status.ZeroDemandSince = nil
        }
The logic for updating `ZeroDemandSince` is correct. However, `now` is defined once at the beginning of the `reconcileCustomScaler` function. If there's a significant delay between the start of reconciliation and this specific status update, `now` might not accurately reflect the exact moment the zero-demand state was detected. For higher precision, consider moving the `now := metav1.Now()` assignment directly before `scaler.Status.ZeroDemandSince = &now` when setting the timestamp.
Suggested change:

    - if result.Recommendation != nil {
    -     if result.Recommendation.DesiredReplicas == 0 && result.CurrentReplicas > 0 {
    -         // Entering zero demand state
    -         if scaler.Status.ZeroDemandSince == nil {
    -             scaler.Status.ZeroDemandSince = &now
    -         }
    -     } else if result.Recommendation.DesiredReplicas > 0 {
    -         // Exiting zero demand state
    -         scaler.Status.ZeroDemandSince = nil
    -     }
    + if result.Recommendation != nil {
    +     if result.Recommendation.DesiredReplicas == 0 && result.CurrentReplicas > 0 {
    +         // Entering zero demand state
    +         if scaler.Status.ZeroDemandSince == nil {
    +             currentTimestamp := metav1.Now()
    +             scaler.Status.ZeroDemandSince = &currentTimestamp
    +         }
    +     } else if result.Recommendation.DesiredReplicas > 0 {
    +         // Exiting zero demand state
    +         scaler.Status.ZeroDemandSince = nil
    +     }
    + }
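The state transitions in the suggestion can be exercised in isolation with a small stand-alone sketch. It substitutes the standard library's `time.Time` for `metav1.Time` so that it runs without Kubernetes dependencies; the `Status` type and the `TrackZeroDemand` function are hypothetical stand-ins, not the controller's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// Status is a hypothetical stand-in for the scaler status. The PR stores
// this field as a metav1.Time pointer; time.Time is used here instead.
type Status struct {
	ZeroDemandSince *time.Time
}

// TrackZeroDemand follows the suggested logic: record a fresh timestamp the
// first time zero demand is seen while replicas are still running, keep the
// existing timestamp on later zero-demand passes, and clear it once demand
// returns.
func TrackZeroDemand(status *Status, desiredReplicas, currentReplicas int32) {
	if desiredReplicas == 0 && currentReplicas > 0 {
		// Entering zero demand state
		if status.ZeroDemandSince == nil {
			currentTimestamp := time.Now()
			status.ZeroDemandSince = &currentTimestamp
		}
	} else if desiredReplicas > 0 {
		// Exiting zero demand state
		status.ZeroDemandSince = nil
	}
}

func main() {
	status := &Status{}
	TrackZeroDemand(status, 0, 3)
	fmt.Println("timestamp set:", status.ZeroDemandSince != nil)
	TrackZeroDemand(status, 2, 3)
	fmt.Println("timestamp cleared:", status.ZeroDemandSince == nil)
}
```

Note that when both the desired and current replica counts are zero, neither branch runs and the existing value is left untouched, which matches the structure of the snippet under review.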
- Improve activation scale reason strings in budscaler and kpa algorithms
- Use fresh timestamp for ZeroDemandSince instead of reusing `now` variable

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Summary
- Add `ScaleToZeroConfig` to CRD with `enabled`, `activationScale`, and `gracePeriod` fields
- Add `ZeroDemandSince` status field for grace period tracking

Usage Example

Test plan
- `go test ./...`
- `make generate && make manifests`
- `go build ./...`

🤖 Generated with Claude Code