-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Failover support #20
Failover support #20
Conversation
Configuration AzureAppConfigurationKeyValueOptions `json:"configuration,omitempty"` | ||
Secret *AzureKeyVaultReference `json:"secret,omitempty"` | ||
FeatureFlag *AzureAppConfigurationFeatureFlagOptions `json:"featureFlag,omitempty"` | ||
Endpoint *string `json:"endpoint,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation?
if err != nil { | ||
return nil, err | ||
} | ||
manager.secret, err = parseConnectionString(manager.ConnectionString, SecretSection) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why we keep both secret/id and connectionString? looks like one of them is enough.
return manager, nil | ||
} | ||
|
||
func SrvTargetHostQuery(host string) ([]string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put the verb in the beginning sounds more natural as a method name.
return nil, err | ||
} | ||
|
||
originHost := strings.TrimSuffix(originRecords[0].Target, ".") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just want to make sure, the originRecords will never be an empty array. right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each replica has _origin._tcp.*
SRV record been created to indicate the origin store of the replica. If there's no replica of user's appconfig store, the originRecoreds will be empty.
} | ||
} | ||
|
||
altHost := strings.TrimSuffix(altRecords[0].Target, ".") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why only altRecords[0]
?
currentTime := metav1.Now() | ||
if currentTime.After(manager.lastFallbackClientsAttempt.Time.Add(MinimalClientRefreshInterval)) && | ||
(manager.DynamicClients == nil || | ||
currentTime.After(manager.lastFallbackRefresh.Time.Add(FallbackClientRefreshExpireInterval))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is lastFallbackClientsAttempt
time and lastFallbackRefresh
time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lastFallbackRefresh -> lastFallbackClientRefresh
lastFallbackClientsRefreshAttempt -> lastFallbackClientRefreshAttempt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lastFallbackClientsAttempt
time is the moment provider try to query srv hosts. It may fail during query process, if the query succeed, lastFallbackRefresh
time will be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use them to ensure two things:
- Not perform the lookup too frequently, if we see the last attempt is within a certain time period (30s), we will not perform a new lookup
- To ensure discovered clients could be refreshed periodically (every 1 hour), if the last success time exceeds 1 hour, we perform a new lookup
} | ||
|
||
func (manager *ConfigurationClientManager) GetClients() []*ConfigurationClientWrapper { | ||
if manager.ReplicaDiscoveryEnabled != true { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!manager.ReplicaDiscoveryEnabled
dd527b4
to
93775e6
Compare
Endpoint *string `json:"endpoint,omitempty"` | ||
ConnectionStringReference *string `json:"connectionStringReference,omitempty"` | ||
// +kubebuilder:default=true | ||
ReplicaDiscovery bool `json:"replicaDiscovery,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put it right after the Endpoint field
@@ -198,18 +199,28 @@ func (reconciler *AzureAppConfigurationProviderReconciler) Reconcile(ctx context | |||
} | |||
} | |||
|
|||
if reconciler.ProvidersReconcileState[req.NamespacedName].ClientManager == nil || | |||
reconciler.ProvidersReconcileState[req.NamespacedName].Generation != provider.Generation { | |||
manager, err := loader.NewConfigurationClientManager(ctx, *provider) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
manager -> clientManager
/* Create ConfigurationSettingLoader to get the key-value settings from Azure AppConfiguration. */ | ||
configProvider, err := loader.NewConfigurationSettingLoader(ctx, *provider, nil) | ||
manager := reconciler.ProvidersReconcileState[req.NamespacedName].ClientManager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put this line before line 202
In line 202, just judge whether manager == nil
if reconciler.Retriever == nil { | ||
retriever = configProvider | ||
} else { | ||
if reconciler.Retriever != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary change
@@ -180,6 +181,7 @@ var _ = Describe("AppConfiguationProvider controller", func() { | |||
}, | |||
} | |||
|
|||
// mockConfigClientManager.EXPECT().ExecuteFailoverPolicy(gomock.Any(), gomock.Any(), gomock.Any()).Return(nil, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Useless
mockResolveSecretReference *MockResolveSecretReference | ||
mockCtrl *gomock.Controller | ||
mockCongiurationClientManager *MockClientManager | ||
EndpointName string = "https://fake-endpoint" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All variables are using camel-style, please update it to be consistent
@@ -39,7 +35,7 @@ import ( | |||
type ConfigurationSettingLoader struct { | |||
acpv1.AzureAppConfigurationProvider | |||
getSettingsFunc GetSettingsFunc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems getSettingsFunc
need to be Pascal style, since all struct fields are Pascal Style
|
||
type ConfigurationClientManager struct { | ||
ReplicaDiscoveryEnabled bool | ||
SpecifiedClient *azappconfig.Client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about call it StaticClientWrappers
and make it to be an array []*ConfigurationClientWrapper
? To potentially support customer can pass more than one static clients.
DynamicClients []*ConfigurationClientWrapper | ||
ValidDomain string | ||
Endpoint string | ||
ClientsFallbackStatus map[string]*ConfigurationClientWrapper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need this? Could use DynamicClients and StaticClients directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean we don't need to update dynamicClients fallback status when autofailover enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not saying we don't need to update fallback of dynamicClients, I mean why we don't update it in DynamicClientwrappers
directly, not need to use another object which is only for managing the fallbacks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
type ConfigurationClientManager struct { | ||
ReplicaDiscoveryEnabled bool | ||
SpecifiedClient *azappconfig.Client | ||
DynamicClients []*ConfigurationClientWrapper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DynamicClientWrappers
GetClients() []*ConfigurationClientWrapper | ||
RefreshClients() | ||
UpdateClientBackoffStatus(endpoint string, successful bool) | ||
ExecuteFailoverPolicy(ctx context.Context, filters []acpv1.Selector, getSettingsFunc GetSettingsFunc) ([]azappconfig.Setting, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't expect the clientManager has the method ExecuteFailoverPolicy()
, the clientManger just take the responsibility for managing the clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, will move it to configLoader.
5b94884
to
da91ced
Compare
|
||
func parseConnectionString(connectionString string, token string) (string, error) { | ||
if connectionString == "" { | ||
return "", NewArgumentError("connectionString", fmt.Errorf("connectionString cannot be empty")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should not return as ArugmentError in this method, should handle all the errors as runtime error.
|
||
func Jitter(duration time.Duration) time.Duration { | ||
if JitterRatio < 0 || JitterRatio > 1 { | ||
panic("jitterRatio must be between 0 and 1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why need to use panic here?
currentTime.After(manager.lastFallbackClientAttempt.Time.Add(MinimalClientRefreshInterval)) { | ||
manager.lastFallbackClientAttempt = currentTime | ||
url, _ := url.Parse(manager.endpoint) | ||
_ = manager.DiscoverFallbackClients(url.Host) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should not ignore the error here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to log the error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DiscoverFallbackClients may return error in line 188, we should not ignore this error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logging warning is fine in line 174
if _, ok := err.(*net.DNSError); ok { | ||
break | ||
} else { | ||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should not return nil here, must be an empty array
_, altRecords, err := net.LookupSRV(currentAlt, TCP, originHost) | ||
if err != nil { | ||
// If the host does not have SRV records => no more replicas | ||
if _, ok := err.(*net.DNSError); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what situation the error is not *net.DNSError
? We should differentiate the not existing
error from other errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DNSError has IsNotFound
field, we should use that, when IsNotFound
is true, we should not treat it as error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may return net error or something else. When there's no more alt records it will return dns error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the error is net.DNSError
and IsNotFound
is true, just break the loop. Throw for any other errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When there's no alt recoreds, IsNotFound
filed in net.DNSError
is false.
results := make([]string, 0) | ||
|
||
_, originRecords, err := net.LookupSRV(Origin, TCP, host) | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error handling should same as line 231
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean check the error's type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes , judge type net.DNSError
and IsNotFound
return results, err | ||
} | ||
|
||
originHost := strings.TrimSuffix(originRecords[0].Target, ".") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The originHost
should also be added to the return results.
|
||
newDynamicClients := make([]*ConfigurationClientWrapper, 0) | ||
for _, host := range srvTargetHosts { | ||
if isValidEndpoint(host, manager.validDomain) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should judge whether the host is already in the staticClients, should not add it to dynamicClients if it is already a static client
|
||
// Failed to execute failover policy | ||
csl.ClientManager.RefreshClients() | ||
return nil, fmt.Errorf("failed to execute failover policy: all clients failed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all app configuration clients failed to get settings
Endpoint *string `json:"endpoint,omitempty"` | ||
Endpoint *string `json:"endpoint,omitempty"` | ||
// +kubebuilder:default=true | ||
ReplicaDiscovery bool `json:"replicaDiscovery,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to ReplicaDiscoveryEnabled
to keep consistent with other provdivers
} | ||
} | ||
|
||
originHost := strings.TrimSuffix(originRecords[0].Target, ".") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't assume the originRecords has at least one item in it. we should judge its length, if length is 0
, return empty result
func QuerySrvTargetHost(host string) ([]string, error) { | ||
results := make([]string, 0) | ||
|
||
_, originRecords, err := net.LookupSRV(Origin, TCP, host) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to use this method LookupSRV(ctx context.Context, service, proto, name string) (string, []*SRV, error)
passing the context, which means all the calling chain should add context in the parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reset to request changes
@@ -30,7 +30,9 @@ import ( | |||
type AzureAppConfigurationProviderSpec struct { | |||
// The endpoint url of AppConfiguration which should sync the configuration key-values from. | |||
// +kubebuilder:validation:Format=uri | |||
Endpoint *string `json:"endpoint,omitempty"` | |||
Endpoint *string `json:"endpoint,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's kubebuilder
notation between lines which influences go fmt
result and the indention here.
@@ -37,6 +37,9 @@ spec: | |||
description: AzureAppConfigurationProviderSpec defines the desired state | |||
of AzureAppConfigurationProvider | |||
properties: | |||
ReplicaDiscoveryEnabled: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this customer-facting? should be camel case
Endpoint *string `json:"endpoint,omitempty"` | ||
Endpoint *string `json:"endpoint,omitempty"` | ||
// +kubebuilder:default=true | ||
ReplicaDiscoveryEnabled bool `json:"ReplicaDiscoveryEnabled,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
json:"replicaDiscoveryEnabled,omitempty"
, it should be camel casing
//go:generate mockgen -destination=mocks/mock_configuration_client_manager.go -package mocks . ClientManager | ||
|
||
type ConfigurationClientManager struct { | ||
ReplicaDiscoveryEnabledEnabled bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why two Enabled in name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
ReplicaDiscoveryEnabledEnabled bool | ||
StaticClientWrappers []*ConfigurationClientWrapper | ||
DynamicClientWrappers []*ConfigurationClientWrapper | ||
validDomain string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the guidance of naming? inconsistent here: some fields are camel case, others are pascal case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type starting with an uppercase letter are exported to other packages. Those starting with a lowercase letter can be used only inside the package.
srvTargetHosts, err := QuerySrvTargetHost(ctx, host) | ||
if err != nil { | ||
klog.Warningf("Fail to build fall back clients %s", err.Error()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not return or throw in case of error? It continues to execute below codes when something wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should return, updated.
return connectionString[startIndex+len(parseToken) : endIndex], nil | ||
} | ||
|
||
func verfityEndpointFromConnectionString(endpoint string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
verfity -> verify
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method seems not being used.
* failover support * modify client manager * resolve conflicts * add net error to failoverable error * resolve comments * resove comments * fix nil pointer * resolve comments * add failover test * update getclients func * resolve comments * resolve comments * upgrage go version * resolve comments * resolve comments * fix typo * use context with timeout
No description provided.