etcdserver: add ability to auto-promote learners to voters #10887
Thanks for implementing this feature. The original plan was not to include this in the 3.4 release (it will likely be in 3.5), but I am open to discussion on this. FYI, the feature cut date for the 3.4 release is July 31, according to the latest community meeting.
Is this still WIP? Is it ready to be reviewed?
Hey @jingyih it's WIP. I still need to write tests, update documentation and some APIs (e.g. |
Thanks @maxenglander. I will take a closer look asap.
@maxenglander I went through the code change, and overall it looks good. Can we keep the autoPromote-related code in the etcdserver?
@WIZARD-CXY are you asking if I can undo changes to all packages except |
We need to decide whether autoPromote is an etcd feature or a Raft feature. I don't think changes to the Raft package are necessary for implementing this feature. I prefer to keep the Raft package limited to core functionality and implement autoPromote only in the etcd server.
clientv3/cluster.go
MemberAddAsAutoPromotingNode(ctx context.Context, peerAddrs []string) (*MemberAddResponse, error)

// MemberAddAsNode adds a new member as a node into the cluster.
MemberAddAsNode(ctx context.Context, peerAddrs []string) (*MemberAddResponse, error)
I think we should keep MemberAdd() unchanged for better backward compatibility.
Good point, done in 29fe958
clientv3/integration/cluster_test.go
for _, member := range memberList.Members {
	if member.ID == autoPromotingLearnerID {
		if member.IsLearner == false {
			break success
For better readability, maybe use a flag to record if this member is promoted, so that you can just do 'break' after this for loop.
Good idea, done in 29fe958
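The flag-based loop suggested in this review thread could look roughly like the sketch below. It is not the code from 29fe958. `Member` is a hypothetical stand-in for the entries of the member list response, modeling only the two fields the test snippet touches, and `isPromoted` is an illustrative helper name.

```go
package main

import "fmt"

// Member is a hypothetical stand-in for a member list entry;
// only the fields used by the test snippet are modeled.
type Member struct {
	ID        uint64
	IsLearner bool
}

// isPromoted reports whether the member with the given ID is present
// and is no longer a learner. A plain flag replaces the labeled break.
func isPromoted(members []Member, id uint64) bool {
	promoted := false
	for _, m := range members {
		if m.ID == id {
			promoted = !m.IsLearner
			break // plain break: leaves only this for loop
		}
	}
	return promoted
}

func main() {
	members := []Member{{ID: 1}, {ID: 2, IsLearner: true}, {ID: 3}}
	fmt.Println(isPromoted(members, 2)) // member 2 is still a learner

	members[1].IsLearner = false
	fmt.Println(isPromoted(members, 2)) // member 2 has been promoted
}
```

In a polling test, the surrounding retry loop would then check the returned flag instead of jumping to a label.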
Hey @jingyih I can make that change, but could benefit from some input on how. I can see two ways of implementing this without changes to Raft package.
The drawback of 1. is that it is a breaking API change. Users might expect learners to remain in that state until manually promoted. The drawback of 2. is that if the leader dies and leadership transfers to another member, a learner that is marked for auto-promotion will lose its auto-promotion designation, and thus be "demoted" to a regular learner, unless there is some (non-Raft) mechanism for having leaders share auto-promote designations with peers. Can you provide input on whether 1. or 2. is preferable? Or perhaps some 3rd option I'm not thinking of? I'd also be interested in understanding what criteria you use to decide that the ability to add a learner qualifies as a Raft feature, whereas the ability to designate a learner for auto-promotion does not.
The first step we should take is probably to decide the behavior of the member add API in v3.5. My own opinion is that, in the very long run, 'member add' may by default simply add any new member as a learner, which will be auto-promoted once it has caught up. If users want to add a learner member that will not be auto-promoted, they need to explicitly set a flag. But for the near future (v3.5), the API is not clearly defined yet. (We might also need to implement a proper feature gate and put auto promote behind it.) I am happy to start a draft on this so we can start the discussion.
Sorry my previous comment is inaccurate. What I meant was, auto promote's implementation might not need changes to Raft package and current API. etcdserver has cluster member information in |
@jingyih thanks for the comments. I'm happy to wait for you and other maintainers to firm up API plans after which I'm happy to make adjustments to code based on that. FWIW, here was what I was envisioning for API changes from v3.4 and onwards.
I agree, I think this makes a lot of sense. My thinking was that, in v3.4, users would have the option of specifying In >= v3.5, I was thinking Eventually (e.g. in v3.6 or later),
I was thinking that
Let me know if I'm understanding you correctly. I think you're saying that Thanks for all the discussion so far. As I said above, will wait to see what you and other members decide on in terms of v3.5 API, and make code adjustments accordingly. |
@maxenglander I agree with most of what you envision. Let's first finalize the API. Maybe just open a PR against the design doc [1] to add a new section 'Learner Implementation in v3.5'? At the same time, I would like to explore the possibility of adding a proper feature gate in etcd. AFAIK, experimental features in etcd are currently enabled by special server-side flags [2]. If we are not careful, servers in the same cluster could end up with different settings for these. If we can improve this situation, we can choose to (or not to) put auto promote behind a feature gate, in addition to the '--auto-promote' flag in member add. Another nice thing about having a feature gate is that we can always disable a new feature when it is added, and enable it in the next minor release. This is one step further towards officially supporting 1 minor version downgrade / rollback. [1] https://github.com/etcd-io/etcd/blob/master/docs/server-learner.rst
That sounds good @jingyih I will do that. Thank you for the guidance.
Codecov Report
@@ Coverage Diff @@
## master #10887 +/- ##
===========================================
- Coverage 69.74% 47.12% -22.62%
===========================================
Files 407 398 -9
Lines 32585 32614 +29
===========================================
- Hits 22726 15370 -7356
- Misses 7895 15220 +7325
- Partials 1964 2024 +60
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
df82df7 to d5bc5de
ed38329 to b16b74b
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.
/remove stale
/remove-stale
baf146a to fdb1712
Hi @ptabor thanks for reviewing etcd-io/website#107. Since you reviewed that, I thought I would ping you here to see if you can help move this PR forward. There are probably some additional changes I need to make in this PR, but I think it's at a point where it can be reviewed.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions. |
/remove-stale |
@maxenglander I reopened, do you mind rebasing with the main branch, thanks! |
1e29cb3 to 065a53e
Enable new learners to be marked for automatic promotion to voters upon catching up with the leader. Adds the ability to supply one or more promotion rules when adding a new learner. Promotion rules govern if and when a learner may be promoted to a new role. Currently only promotion to voter is supported, but the APIs added in this commit are flexible and allow for backwards-compatible introduction of additional promotion targets (e.g. reader).
065a53e to bd0cfdb
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions. |
Auto-promote learners
See #10537 for original mention of auto promoting learners.
See etcd-io/website#107 for changes to learner design doc.
Design is motivated by comments in original design PR:
...and also motivated by other planned 3.5 learner features:
Enables newly added learners to be automatically promoted to voting members.
With this PR, operators can supply "auto" and "promote rules" to the member add API. When --learner and --auto are supplied to member add, the learner is automatically promoted to a voting member when any of its promotion rules is satisfied. If no promotion rules are supplied, the "default" promotion rule is used (90% caught up to leader progress).
If --learner is supplied but not --auto, the learner is not automatically promoted. Instead, the promotion rules determine whether an operator request to promote the learner is accepted. This implementation supplants the current --learner behavior while maintaining compatibility with it.
Operators can also supply --delay, which determines how long a promotion rule must be satisfied before the learner is promoted. The default value is 0, meaning learners may be promoted immediately when a promotion rule is satisfied.
With this PR, learners may only be promoted to voters. However, the API changes are flexible enough to accommodate other promotion targets (e.g. "strong reader" or "weak reader"). Likewise, with this PR the only promotion criterion is "--min-progress". The API can easily accommodate additional promotion criteria such as --min-healthy-voters (or --min-cluster-size), which would promote a member as soon as the number of healthy voters (or number of members) in the cluster drops below a threshold.