configurable timeout for ovsdb connection #2790
Currently the ovsdb connection timeout is hardcoded to 20s, which might not be sufficient when running large-scale tests. This commit makes the timeout value configurable at startup via the `--ovsdb-connect-timeout` argument, e.g. `--ovsdb-connect-timeout 15s`.

Scripts and spec templates were also updated so that the timeout value can be set through `daemonset.sh`, e.g. `./daemonset.sh --ovsdb-connect-timeout=25s`.

Signed-off-by: xqu <[email protected]>
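As a rough illustration of the behaviour described above, here is a minimal sketch of parsing such a flag as a duration with a 20s default. It uses only Go's standard `flag` package and is not the actual ovn-kubernetes config code; `parseConnectTimeout` and `defaultOVSDBTimeout` are hypothetical names chosen for this example, and rejecting non-positive values is an assumption rather than something the PR states.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// defaultOVSDBTimeout mirrors the previously hardcoded 20s mentioned in the
// PR description. The constant name is hypothetical.
const defaultOVSDBTimeout = 20 * time.Second

// parseConnectTimeout is a hypothetical helper: it reads
// --ovsdb-connect-timeout from args and falls back to the default when the
// flag is absent.
func parseConnectTimeout(args []string) (time.Duration, error) {
	fs := flag.NewFlagSet("ovsdb-timeout-sketch", flag.ContinueOnError)
	timeout := fs.Duration("ovsdb-connect-timeout", defaultOVSDBTimeout,
		"timeout for connecting to the OVSDB server, e.g. 15s")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	// Assumption: a zero or negative timeout is treated as a config error.
	if *timeout <= 0 {
		return 0, fmt.Errorf("ovsdb-connect-timeout must be positive, got %s", *timeout)
	}
	return *timeout, nil
}

func main() {
	// Both "--flag value" and "--flag=value" forms are accepted by the
	// standard flag package.
	for _, args := range [][]string{
		{"--ovsdb-connect-timeout", "15s"},
		{"--ovsdb-connect-timeout=25s"},
		{},
	} {
		t, err := parseConnectTimeout(args)
		fmt.Println(args, t, err)
	}
}
```

Keeping the old value as the default means existing deployments behave the same unless the flag is set explicitly.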
MonitorAll: true,
LFlowCacheEnable: true,
RawClusterSubnets: "10.128.0.0/14/23",
OVSDBConnectTimeout: types.OVSDBTimeout,
It looks like some spacing and ordering were changed. I would avoid changing anything other than what the patch intends to do; it helps the review.
Thanks @dougsland. Only some spaces were added by gofmt: because the new field OVSDBConnectTimeout is longer than the others, gofmt re-aligned all the fields.
I agree with you in general, but in this case, to avoid the whitespace change I would need to either shorten the field name, making it less meaningful, or break gofmt convention. Adding spaces seems more acceptable to me.
to me it's fine. I just noticed a bunch of spaces added :) lgtm
@jxiaobin do you see this on the initial connection, or on reconnections? The reason I ask is that we know reconnections are sub-optimal because libovsdb doesn't yet use the LastTransactionID feature of MonitorCondSince. That's being worked on in ovn-org/libovsdb#283 and will very likely solve the problem for reconnect. When I did scale tests, the majority of the time in reconnects was taken by parsing and processing the JSON received from the DB, and that will be greatly reduced by the libovsdb PR above. So hopefully when that PR lands, we won't need a configurable timeout?
@dcbw it was seen mostly during reconnection, but it is also possible that the initial connection could take a long time, right? I agree with you that MonitorCondSince would make reconnection fast in most cases, but there are still exceptions, e.g. when the transaction history on the server happens to be discarded before reconnecting (if the transaction rate is very high), in which case it would fall back to downloading the whole database. So I think it may still be good to have it configurable instead of hardcoded, and of course to make sure the default value is good enough for average use cases. What do you think?
@jxiaobin @hzhou8 I am confused by this PR.
LastTransactionID support was added to libovsdb in ovn-org/libovsdb#292 and should have been in ovn-kube for a couple of months now.
Tim asks whether we should split the timeout apart, so that ConnectTimeout really means "I can open a TCP connection to the DB", separate from the actual reading of a large database from the DB server after connecting. E.g., if we're still receiving data and doing work, should that really count as a timeout?
I went through both the ovn-kubernetes and libovsdb code. As far as I can see, in the reconnect case the timeout context is shared by both connect and monitor (downloading data). @dcbw could you please advise?
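To make the proposed split concrete, here is a minimal sketch of the idea using only Go's standard library. It is not libovsdb code and does not reflect how libovsdb structures its contexts; `dialThenRead`, `serveSlowly`, and `demo` are hypothetical names invented for this example. The connect timeout bounds only the TCP dial, while the download is governed by an idle deadline that is reset on every read, so a slow but progressing transfer is not cut off.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// dialThenRead (hypothetical) applies connectTimeout only to the TCP dial.
// Reading is bounded by idleTimeout, which restarts whenever data arrives.
func dialThenRead(ctx context.Context, addr string, connectTimeout, idleTimeout time.Duration) ([]byte, error) {
	dialCtx, cancel := context.WithTimeout(ctx, connectTimeout)
	defer cancel()
	var d net.Dialer
	conn, err := d.DialContext(dialCtx, "tcp", addr)
	if err != nil {
		return nil, fmt.Errorf("connect: %w", err)
	}
	defer conn.Close()

	var data []byte
	buf := make([]byte, 4096)
	for {
		// Push the deadline forward before each read: progress keeps us alive.
		if err := conn.SetReadDeadline(time.Now().Add(idleTimeout)); err != nil {
			return data, err
		}
		n, err := conn.Read(buf)
		data = append(data, buf[:n]...)
		if errors.Is(err, io.EOF) {
			return data, nil // server finished sending
		}
		if err != nil {
			return data, fmt.Errorf("read stalled or failed: %w", err)
		}
	}
}

// serveSlowly (hypothetical) starts a loopback server that sends the given
// number of 5-byte chunks to one client, pausing gap before each chunk.
func serveSlowly(chunks int, gap time.Duration) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	go func() {
		defer ln.Close()
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		for i := 0; i < chunks; i++ {
			time.Sleep(gap)
			if _, err := conn.Write([]byte("chunk")); err != nil {
				return
			}
		}
	}()
	return ln.Addr().String(), nil
}

// demo runs a transfer whose total duration (about 360ms) exceeds the
// connect timeout (200ms) while each gap (60ms) stays under the idle timeout.
func demo() (int, error) {
	addr, err := serveSlowly(6, 60*time.Millisecond)
	if err != nil {
		return 0, err
	}
	data, err := dialThenRead(context.Background(), addr, 200*time.Millisecond, 300*time.Millisecond)
	return len(data), err
}

func main() {
	n, err := demo()
	fmt.Println("bytes received:", n, "error:", err)
}
```

With a single shared context, the same transfer would be aborted once the connect timeout elapsed even though data was still arriving, which is the situation the question above is about.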
@dcbw created a libovsdb PR ovn-org/libovsdb#313, PTAL.