What happened?

It seems that in certain rare cases the ClusterId is missing from the reply when going through the gRPC proxy.
I managed to track it down (I THINK!) to https://github.com/etcd-io/etcd/blob/main/server/proxy/grpcproxy/watch_broadcast.go#L113 with

// todo: fill in ClusterId
I don't know what the correct solution is, but some software (namely Cilium and kvstoremesh) relies on checking this ID on each reply and gets stuck in a reconnect loop.
I can try doing a dirty patch on either etcd's or Cilium's side, whichever is easier for us, but if somebody can send me the proper solution to this, I am all ears.
My only idea is to grab the ClusterId/MemberId/... from the previous response header, like this:
Header: &pb.ResponseHeader{
    ClusterId: w.lastHeader.ClusterId,
    MemberId:  w.lastHeader.MemberId,
    Revision:  w.nextrev,
    // todo: fill in RaftTerm - don't know if that one should be copied, but it is not necessarily required in this case
},
but that looks dirty (though it should work, since the header is copied on every reply, including this one).
Cheers and thanks for looking into it! It has taken me 3 work days to figure it out :-)
Ashley
What did you expect to happen?
ResponseHeader.ClusterId should be returned on every reply, but it isn't when going through the gRPC proxy.
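To make the expectation concrete, here is a minimal sketch (not Cilium's actual code; the proxy endpoint 127.0.0.1:23790 and the key "foo" are just placeholders) of a client that watches through the proxy and logs the header identifiers it expects on every reply:

package main

import (
    "context"
    "log"

    clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
    // Placeholder address of the gRPC proxy; adjust to your setup.
    cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:23790"}})
    if err != nil {
        log.Fatal(err)
    }
    defer cli.Close()

    // Every WatchResponse carries a ResponseHeader; the expectation is that
    // ClusterId (and MemberId) are set on every reply, also via the proxy.
    for resp := range cli.Watch(context.Background(), "foo", clientv3.WithRev(1)) {
        log.Printf("ClusterId=%x MemberId=%x Revision=%d",
            resp.Header.ClusterId, resp.Header.MemberId, resp.Header.Revision)
    }
}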
How can we reproduce it (as minimally and precisely as possible)?
Start the gRPC proxy and "spam" several instances watching the same key from the same revision.
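A sketch of such a reproducer (placeholder proxy address; the proxy itself is assumed to be started separately, e.g. with etcd grpc-proxy start --endpoints=127.0.0.1:2379 --listen-addr=127.0.0.1:23790) opens several concurrent watchers on the same key and start revision, so the proxy coalesces them onto one broadcast, and flags replies whose header has no ClusterId:

package main

import (
    "context"
    "log"
    "sync"

    clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
    const proxyEndpoint = "127.0.0.1:23790" // placeholder address

    var wg sync.WaitGroup
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            cli, err := clientv3.New(clientv3.Config{Endpoints: []string{proxyEndpoint}})
            if err != nil {
                log.Fatal(err)
            }
            defer cli.Close()
            // All watchers use the same key and revision, so the proxy can
            // coalesce them onto a single broadcast stream.
            for resp := range cli.Watch(context.Background(), "foo", clientv3.WithRev(1)) {
                if resp.Header.ClusterId == 0 {
                    log.Printf("watcher %d: reply without ClusterId at revision %d",
                        id, resp.Header.Revision)
                }
            }
        }(i)
    }
    wg.Wait()
}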
Anything else we need to know?
No response
Etcd version (please run commands below)
$ etcd --version
v3.5.15
$ etcdctl version
v3.5.15
Etcd configuration (command line flags or environment variables)
# paste your configuration here
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
$ etcdctl member list -w table
# paste output here
$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here
Relevant log output
No, that patch wouldn't actually work, as lastHeader is not always saved when the response is forwarded :)
I have a PoC where I "stole" the clusterID from checkPermissionForWatch, since it is called when the watch is created anyway, but that doesn't look nice as an upstream solution :-). I am at an "I need an adult" moment right now, because I have not touched the etcd codebase before and don't know what the best solution here is.
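For what it's worth, the shape of that PoC could look roughly like this; a sketch only, with a stand-in type and hypothetical clusterID/memberID fields rather than the real watch_broadcast.go internals. The idea is to capture the identifiers once, when the watch broadcast is created (around the point where checkPermissionForWatch runs), and stamp them into every header the proxy builds:

package grpcproxy

import pb "go.etcd.io/etcd/api/v3/etcdserverpb"

// Sketch only: this watchBroadcast is a stand-in for the proxy's own type;
// the clusterID/memberID fields are hypothetical additions, filled in once
// when the broadcast is created.
type watchBroadcast struct {
    clusterID uint64
    memberID  uint64
    nextrev   int64
}

// responseHeader builds the header for each forwarded reply with the cluster
// and member identifiers filled in instead of left at their zero values.
func (w *watchBroadcast) responseHeader() *pb.ResponseHeader {
    return &pb.ResponseHeader{
        ClusterId: w.clusterID,
        MemberId:  w.memberID,
        Revision:  w.nextrev,
    }
}

Whether those IDs should come from the member the proxy talks to or from the first response on the broadcast stream is exactly the open design question above.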