-
Couldn't load subscription status.
- Fork 118
Add time-based cluster scheduling feature #781
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla |
Reviewer's GuideThis PR introduces an optional time-based scheduling module that parses UNIX cron expressions at startup, uses a ScheduledExecutorService to periodically evaluate each cluster’s schedule against the current time (with configurable timezones), and invokes the GatewayBackendManager to activate or deactivate clusters based on cron matches and an Sequence diagram for periodic cluster activation/deactivationsequenceDiagram
participant Scheduler as ClusterScheduler
participant BackendManager as GatewayBackendManager
participant Config as ScheduleConfiguration
participant Cluster as ProxyBackendConfiguration
Scheduler->>Config: Get schedules and timezone
Scheduler->>Scheduler: Parse cron expressions
Scheduler->>Scheduler: Start periodic check (ScheduledExecutorService)
loop Every checkInterval
Scheduler->>Config: For each ClusterSchedule
Scheduler->>Scheduler: Evaluate cron match for current time
Scheduler->>BackendManager: Get cluster by name
alt Cluster found
Scheduler->>Cluster: Check current active state
alt State needs change
Scheduler->>BackendManager: activateBackend()/deactivateBackend()
else State unchanged
Scheduler->>Scheduler: No action
end
else Cluster not found
Scheduler->>Scheduler: Log warning
end
end
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `gateway-ha/src/main/java/io/trino/gateway/ha/config/ScheduleConfiguration.java:74-77` </location>
<code_context>
+ }
+
+ @JsonProperty
+ public void setSchedules(List<ClusterSchedule> schedules)
+ {
+ this.schedules = schedules;
</code_context>
<issue_to_address>
**suggestion:** Consider defensive copying of the schedules list in the setter.
Assigning the list directly exposes internal state. Creating a copy (e.g., new ArrayList<>(schedules)) protects against external changes.
```suggestion
public void setSchedules(List<ClusterSchedule> schedules)
{
this.schedules = (schedules == null) ? null : new ArrayList<>(schedules);
}
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
gateway-ha/src/main/java/io/trino/gateway/ha/config/ScheduleConfiguration.java
Show resolved
Hide resolved
|
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla |
de5cb2b to
ffbc046
Compare
|
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla |
|
ps. Check on theses docs for |
|
dang, the sequence diagram from sourcery.ai is 💯 |
|
To answer some of your motivations behind this change: We used to let Trino Gateway handle activation/deactivation of the clusters but then we found some issues
Therefore, we've decided to disable automatic activation/deactivation from Trino Gateway completely since it's a very important switch that only admins should decide when to make the decision. Even if in the ideal world we fix these issues, it would still be easier and more straightforward to just shut down the clusters instead of "deactivating" from Trino Gateway because at the end of the day Trino Gateway is just routing requests to the clusters. If you want to optimize cost, you should just turn off the clusters. Blinding Trino Gateway from the cluster isn't going to optimize your cost We talked about dynamically bringing up Trino clusters during peak hours and killing them during non-peak hours, but this feature was not implemented. It may not even be a good feature in Trino Gateway. IMO, the best approach is to have some sort of monitoring system that monitors your traffic, brings up more Trino clusters if needed, and registers the Trino clusters to Trino Gateway. For some context/background knowledge, current Trino Gateway has a health check that checks Trino clusters' health. It can be tl;dr, if you shut down the Trino cluster, Trino Gateway will not deactivate the cluster, but will mark the cluster as |
|
Thanks for the detailed context, helps clarify the rationale behind disabling automatic activation/deactivation in Gateway. To clarify, this feature does not start or stop Trino clusters. It simply provides time-based routing control, letting Gateway temporarily include or exclude clusters from routing based on configurable schedules. This aligns Gateway behavior with autoscaling systems that already bring clusters up/down during predictable windows, reducing unnecessary routing attempts or failures. From testing cluster lifecycle management relying solely on HEALTHY/UNHEALTHY states, I observed a few challenges:
We are already using this mechanism and helped mitigate both by coordinating routing availability with known cluster schedules. |
Description
This PR introduces time-based cluster scheduling to Trino Gateway, allowing administrators to automatically activate and deactivate clusters based on cron schedules. This feature is particularly useful for:
This feature is implemented as a separate, optional module that is not included by default — it can be added on-demand for deployments that require time-based scheduling capabilities.
Key features:
activeDuringCronflagExample configuration:
Additional context and related issues
The scheduler works by:
cronutilslibraryactiveDuringCronflag (determines if cluster should be active during or outside the schedule)Implementation details:
ScheduledExecutorServicefor reliable schedulingConcurrentHashMapfor execution timesRelease notes
( ) This is not user-visible or is docs only, and no release notes are required.
(x) Release notes are required, with the following suggested text:
Summary by Sourcery
Implement an optional cron-based cluster scheduler that parses schedules from configuration, evaluates them at fixed intervals, and uses the backend manager to toggle routing for clusters according to defined time windows
New Features:
Enhancements:
Build:
Documentation:
Tests: