-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adding predicate ordering design proposal #1152
adding predicate ordering design proposal #1152
Conversation
7fccc1b
to
1a4e1a0
Compare
|
||
|Position | Predicate | comments (notes, justifications...) | | ||
----------------- | ---------------------------- | ------------------ | ||
| 1 | `CheckNodeConditionPredicate` | we really don’t wanna check predicates against unschedulable nodes. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/wanna/want to/
## End-user ordering | ||
|
||
using scheduling policy file, the cluster admin can override the default static ordering. This gives administrator the maximum flexibility regarding scheduler behaviour and enables scheduler to adapt to cluster usage. as an example the following is scheduler policy file using an end-user ordering: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is order
a positive integer?
Could you please also add that when the provided order
is the same for two or more predicates, scheduler will determine the order and there is no guarantee that the order remains the same over time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bsalamat - indeed you're right
@@ -0,0 +1,90 @@ | |||
# predicates ordering | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This proposal should go in the scheduling
directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bsalamat - indeed
"kind" : "Policy", | ||
"apiVersion" : "v1", | ||
"predicates" : [ | ||
{"name" : "PodFitsHostPorts", "order": 2}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just follow the order in configuration? so "order" is not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@k82cn - I'd prefer having a clear ordering here, as using an implicit one might confuse users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also like the explicit order, but we should be backward compatible. So, absence of "order" should be treated as order: 0
. We should mention this in the design doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bsalamat - also maybe add that if there's a non-logical ordering (e.g. order:1, order:2, order:16) default to static ordering with a warning ?
Also I agree that absence of order should be treated as order:0 (i.e. executed at the end & randomly right after the ranked predicates)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is wrong with 1, 2, 16? It shows a clear ordering and should be treated a valid ordering.
Yes for the second part. Order:0 means executed at no particular order and after the ordered predicates. If none of the predicates has an order, then should run with the default order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meant -16
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, order must be a positive integer. In case of an invalid value (such as a negative number or a string) we should probably fail to start. Users may miss the warning and not notice the issue in their config. Failing to start is in-line with how scheduler reacts to other config issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
|
||
status: proposal | ||
|
||
Author: Yassine TIJANI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please also share your github handle :).
|
||
|
||
|
||
status: proposal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/status/Status
|
||
## Abstract | ||
|
||
This document describes how and why reordering predicates helps to achieve performance for the kubernetes scheduler. We will expose the motivations behind this proposal, The two steps/solution we see to tackle this problem and the timeline decided to implement these. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's general suggestion that: build paragraph multiple lines, so reviewers can comment on each line separately.
|
||
|Position | Predicate | comments (notes, justifications...) | | ||
----------------- | ---------------------------- | ------------------ | ||
| 1 | `CheckNodeConditionPredicate` | we really don’t wanna check predicates against unschedulable nodes. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yastij , as we talked on Slack, prefer to also add something like O(n), n= # of node
. The user can evaluate the complexity of predicates based on their environment; so the lowest complexity with most restrictive come first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@k82cn - I thought that time complexity might be more relevant in the user docs, as most of end-user will refer to the documentation when trying to check time complexity, you think we should add it here instead ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking a bit more about time complexity, it seems that giving an estimation using the number of nodes might be not very useful, since the vast majority of predicates are computed against every node using the parallelized workqueue with a checkNode function.
@bsalamat @k82cn - to your knowledge is there any common input to predicates that can gives us insights on time complexity ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is any way to find runtime complexity of predicates other than examining their algorithms, but the autoscaler team recently ran some tests to find execution times of various predicates. You can find the results here: kubernetes/autoscaler#257. It gives you some idea about the runtime complexity of the predicates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bsalamat - thank you, indeed it should be enough
contributors/design-proposals/predicates-ordering.md, line 40 at r1 (raw file): Previously, yastij (Yassine TIJANI) wrote…
I'd suggest to add it; so when anyone draft the user doc, just copy it. And the time complexity will also help us to work out default order of predicates :). Comments from Reviewable |
Review status: 0 of 1 files reviewed at latest revision, 7 unresolved discussions. contributors/design-proposals/predicates-ordering.md, line 67 at r1 (raw file): Previously, yastij (Yassine TIJANI) wrote…
I'm ok with explicit order; please add enough validation for it, e.g. two predicates with same order. Comments from Reviewable |
|
||
## Timeline | ||
|
||
both types of ordering will be delivered as an alpha version, as for the timeline: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/b/B/
|
||
## End-user ordering | ||
|
||
using scheduling policy file, the cluster admin can override the default static ordering. This gives administrator the maximum flexibility regarding scheduler behaviour and enables scheduler to adapt to cluster usage. as an example the following is scheduler policy file using an end-user ordering: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/u/U/
|
||
both types of ordering will be delivered as an alpha version, as for the timeline: | ||
|
||
* static ordering: available in 1.9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Static ordering is fine to do straight away, as it doesn't change how users use clusters, but I think we probably should discuss putting dynamic ordering initially as beta. I'm not sure if it's worth it - we need to discuss it with SIG PM (whatever their handle is).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gmarek - I thought that we'd wait for metrics from the community to start working on the dynamic ordering ? are you talking about end-user ordering instead ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is a good idea to add static ordering first and wait for feedback from users before adding dynamic ordering. If users asked for dynamic ordering, we will add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was referring to configurable ordering by end-users as "dynamic ordering". Do we already know customers who need the configurable ordering?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from my personal experience and what I heared yes. The only way to do it is to write a custom sched. and change the ordering there.
I think we should have it, for the dynamic one (i.e. ordering automatically predicates depending on cluster usage), we should wait that's why it's not part from this design proposal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still would like to hear who are the customers who needed it and had to write their own custom scheduler only to reorder predicates and why a smarter ordering (that we are proposing in static ordering of this proposal) is not enough for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are using very specific set of labels, executing checkNodeLabelPresence first is better.
@gmarek - I'll try to join the next SIG PM meeting or talk to someone from it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry, if the 'static' means 'at startup' and dynamic means 'changes according to the cluster state' then you're right, I understood it differently. It's probably not worth joining the meeting, as they most likely have full agenda, email should work better.
@apsinha @timothysc - do you know if we should use normal alpha-beta-GA (most likely skipping alpha) lifecycle for things like this (allowing user to change the order of predicate computation in scheduler - setting good one improves scheduler's performance, setting terrible one pretty much kills it). It's small enough change from the user perspective, that maybe we can skip this?
We discussed the configurable ordering of predicates in the SIG scheduling meeting yesterday. We believe that the configurable ordering shouldn't be implemented at this point. We can go ahead and build the static ordering and leave the configurable ordering util we get more feedback from our customers that they need it. |
@bsalamat - SGTM |
both types of ordering will be delivered as an alpha version, as for the timeline: | ||
|
||
* static ordering: available in 1.9 | ||
* dynamic ordering: available in 1.10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, could you please change this from 1.10 to "TBD based on customer feedback"?
@gmarek @bsalamat @k82cn - sorry guys i was off, I just came back, I updated the design proposal. Also waiting for feedback from @timothysc and @apsinha on wether static ordering should be delivered as alpha or directly available |
@@ -84,7 +88,7 @@ using scheduling policy file, the cluster admin can override the default static | |||
|
|||
## Timeline | |||
|
|||
both types of ordering will be delivered as an alpha version, as for the timeline: | |||
Both types of ordering will be delivered as an alpha version, as for the timeline: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Static ordering of predicates shouldn't have noticeable user impact and we don't expect change of behavior. It is more of an optimization. So, I don't think we need to make it alpha. It can go to production directly.
LGTM |
|
||
|
||
|
||
##Abstract |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Abstract
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ech... it does md in comments as well. Add space between # and A.
We will expose the motivations behind this proposal, The two steps/solution we see to tackle this problem and the timeline decided to implement these. | ||
|
||
|
||
##Motivation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add space.
|
||
|
||
|
||
##Static ordering |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space.
418ea9e
to
08ceb2f
Compare
## End-user ordering | ||
|
||
Using scheduling policy file, the cluster admin can override the default static ordering. This gives administrator the maximum flexibility regarding scheduler behaviour and enables scheduler to adapt to cluster usage. | ||
Please note that the order must be a positive integer, also, when providing equal ordering for many predicates, scheduler will determine the order and won't guarantee that the order will remain the same between them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it's clear for everyone so we should mention that updating the Policy requires scheduler restart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is worth noting that point again in this doc, just in case someone is not fully aware of that point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
295ddea
to
5a67cb5
Compare
5a67cb5
to
a370117
Compare
a370117
to
65e5d29
Compare
/lgtm |
Automatic merge from submit-queue. |
@bgrant0607 - Indeed forgot to do it, I'll move it asap, thanks ! |
/cc @gmarek @k82cn @bsalamat