OTA-209: Add CVOConfiguration controller #1163

Open
wants to merge 4 commits into
base: main

Conversation

DavidHurta
Contributor

@DavidHurta DavidHurta commented Feb 24, 2025

The controller has an empty reconciliation logic as of now.
The logic will be implemented in a follow-up PR. The goal of this PR is to introduce a new CVO controller, which can optionally (depending on the state of the CVOConfiguration feature gate) create a new informer.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 24, 2025
@openshift-ci-robot
Contributor

openshift-ci-robot commented Feb 24, 2025

@DavidHurta: This pull request references OTA-209 which is a valid jira issue.

In response to this:

The controller has an empty reconciliation logic as of now.
The logic will be implemented in a follow-up PR. The goal of this PR is to introduce a new CVO controller, which can optionally (depending on the state of the CVOConfiguration feature gate) create a new informer.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Contributor

openshift-ci bot commented Feb 24, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DavidHurta

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 24, 2025
@DavidHurta
Contributor Author

This PR currently also contains the manifests, but only for testing purposes; the manifests themselves will be merged in #1161. I will drop those commits once #1161 merges (or, I suppose, GitHub will start ignoring them).

@DavidHurta
Contributor Author

DavidHurta commented Feb 24, 2025

Testing with a local CVO against an AWS cluster launched via `launch 4.19,https://github.com/openshift/cluster-version-operator/pull/1161 aws`.

Default feature set:

Everything as expected.

No errors:

$ grep -iE "^E.*$" tmp.txt

Manifests are excluded and the goroutine is not run:

$ grep -iE "(clusterversionoperator|configuration.go)" tmp.txt 
I0224 16:18:27.051584  166205 start.go:20] ClusterVersionOperator v1.0.0-1331-ga002d379
I0224 16:18:28.288215  166205 payload.go:208] excluding Filename: "0000_00_cluster-version-operator_01_clusterversionoperators-CustomNoUpgrade.crd.yaml" Group: "apiextensions.k8s.io" Kind: "CustomResourceDefinition" Name: "clusterversionoperators.operator.openshift.io": "Default" is required, and release.openshift.io/feature-set=CustomNoUpgrade
I0224 16:18:28.288459  166205 payload.go:208] excluding Filename: "0000_00_cluster-version-operator_01_clusterversionoperators-DevPreviewNoUpgrade.crd.yaml" Group: "apiextensions.k8s.io" Kind: "CustomResourceDefinition" Name: "clusterversionoperators.operator.openshift.io": "Default" is required, and release.openshift.io/feature-set=DevPreviewNoUpgrade
I0224 16:18:28.294803  166205 payload.go:208] excluding Filename: "0000_00_cluster-version-operator_02_configuration-DevPreviewNoUpgrade.yaml" Group: "operator.openshift.io" Kind: "ClusterVersionOperator" Name: "cluster": "Default" is required, and release.openshift.io/feature-set=DevPreviewNoUpgrade
I0224 16:18:28.889835  166205 cvo.go:431] Starting ClusterVersionOperator with minimum reconcile period 3m46.499327831s
I0224 16:18:28.889876  166205 cvo.go:481] The ClusterVersionOperatorConfiguration feature gate is disabled or HyperShift is detected; the configuration sync routine will not run.
I0224 16:18:28.892714  166205 payload.go:208] excluding Filename: "0000_00_cluster-version-operator_01_clusterversionoperators-CustomNoUpgrade.crd.yaml" Group: "apiextensions.k8s.io" Kind: "CustomResourceDefinition" Name: "clusterversionoperators.operator.openshift.io": "Default" is required, and release.openshift.io/feature-set=CustomNoUpgrade
I0224 16:18:28.892929  166205 payload.go:208] excluding Filename: "0000_00_cluster-version-operator_01_clusterversionoperators-DevPreviewNoUpgrade.crd.yaml" Group: "apiextensions.k8s.io" Kind: "CustomResourceDefinition" Name: "clusterversionoperators.operator.openshift.io": "Default" is required, and release.openshift.io/feature-set=DevPreviewNoUpgrade
I0224 16:18:28.899985  166205 payload.go:208] excluding Filename: "0000_00_cluster-version-operator_02_configuration-DevPreviewNoUpgrade.yaml" Group: "operator.openshift.io" Kind: "ClusterVersionOperator" Name: "cluster": "Default" is required, and release.openshift.io/feature-set=DevPreviewNoUpgrade
I0224 16:24:51.593983  166205 cvo.go:557] Shutting down ClusterVersionOperator
I0224 16:24:51.725957  166205 start.go:25] Graceful shutdown complete for ClusterVersionOperator v1.0.0-1331-ga002d379.

DevPreviewNoUpgrade feature set:

Errors are detected:

$ grep -iE "^E.*$" tmp-after-featureset.txt 
E0224 16:25:07.387929  166921 reflector.go:166] "Unhandled Error" err="github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: Failed to watch *v1alpha1.ClusterVersionOperator: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" logger="UnhandledError"
E0224 16:25:08.612690  166921 reflector.go:166] "Unhandled Error" err="github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: Failed to watch *v1alpha1.ClusterVersionOperator: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" logger="UnhandledError"
E0224 16:25:11.104165  166921 reflector.go:166] "Unhandled Error" err="github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: Failed to watch *v1alpha1.ClusterVersionOperator: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" logger="UnhandledError"
E0224 16:25:11.151110  166921 task.go:122] "Unhandled Error" err="error running apply for customresourcedefinition \"dnsnameresolvers.network.openshift.io\" (762 of 928): CustomResourceDefinition dnsnameresolvers.network.openshift.io does not declare an Established status condition: []" logger="UnhandledError"
E0224 16:25:14.902256  166921 reflector.go:166] "Unhandled Error" err="github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: Failed to watch *v1alpha1.ClusterVersionOperator: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" logger="UnhandledError"
E0224 16:25:17.768946  166921 task.go:122] "Unhandled Error" err="error running apply for customresourcedefinition \"etcdbackups.operator.openshift.io\" (87 of 928): CustomResourceDefinition etcdbackups.operator.openshift.io does not declare an Established status condition: []" logger="UnhandledError"
E0224 16:25:23.919275  166921 reflector.go:166] "Unhandled Error" err="github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: Failed to watch *v1alpha1.ClusterVersionOperator: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" logger="UnhandledError"
E0224 16:25:28.895993  166921 task.go:122] "Unhandled Error" err="error running apply for customresourcedefinition \"backups.config.openshift.io\" (53 of 928): CustomResourceDefinition backups.config.openshift.io does not declare an Established status condition: []" logger="UnhandledError"
E0224 16:25:39.398828  166921 task.go:122] "Unhandled Error" err="error running apply for customresourcedefinition \"clusterimagepolicies.config.openshift.io\" (54 of 928): CustomResourceDefinition clusterimagepolicies.config.openshift.io does not declare an Established status condition: []" logger="UnhandledError"
E0224 16:25:40.333321  166921 reflector.go:166] "Unhandled Error" err="github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: Failed to watch *v1alpha1.ClusterVersionOperator: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" logger="UnhandledError"
E0224 16:25:49.772063  166921 task.go:122] "Unhandled Error" err="error running apply for customresourcedefinition \"clustermonitoring.config.openshift.io\" (55 of 928): CustomResourceDefinition clustermonitoring.config.openshift.io does not declare an Established status condition: []" logger="UnhandledError"
E0224 16:26:01.212247  166921 task.go:122] "Unhandled Error" err="error running apply for customresourcedefinition \"imagepolicies.config.openshift.io\" (63 of 928): CustomResourceDefinition imagepolicies.config.openshift.io does not declare an Established status condition: []" logger="UnhandledError"
E0224 16:26:20.157770  166921 task.go:122] "Unhandled Error" err="error running apply for customresourcedefinition \"clusterversionoperators.operator.openshift.io\" (5 of 928): CustomResourceDefinition clusterversionoperators.operator.openshift.io does not declare an Established status condition: []" logger="UnhandledError"
E0224 16:26:23.358257  166921 task.go:122] "Unhandled Error" err="error running apply for customresourcedefinition \"machineconfignodes.machineconfiguration.openshift.io\" (778 of 928): CustomResourceDefinition machineconfignodes.machineconfiguration.openshift.io does not declare an Established status condition: []" logger="UnhandledError"
E0224 16:26:34.337435  166921 task.go:122] "Unhandled Error" err="error running apply for customresourcedefinition \"machineosbuilds.machineconfiguration.openshift.io\" (782 of 928): CustomResourceDefinition machineosbuilds.machineconfiguration.openshift.io does not declare an Established status condition: []" logger="UnhandledError"
E0224 16:26:44.729480  166921 task.go:122] "Unhandled Error" err="error running apply for customresourcedefinition \"machineosconfigs.machineconfiguration.openshift.io\" (783 of 928): CustomResourceDefinition machineosconfigs.machineconfiguration.openshift.io does not declare an Established status condition: []" logger="UnhandledError"
E0224 16:26:55.343459  166921 task.go:122] "Unhandled Error" err="error running apply for customresourcedefinition \"pinnedimagesets.machineconfiguration.openshift.io\" (784 of 928): CustomResourceDefinition pinnedimagesets.machineconfiguration.openshift.io does not declare an Established status condition: []" logger="UnhandledError"
E0224 16:31:15.784552  166921 task.go:122] "Unhandled Error" err="error running apply for deployment \"openshift-cluster-version/cluster-version-operator\" (9 of 928): deployment openshift-cluster-version/cluster-version-operator is not available MinimumReplicasUnavailable (Deployment does not have minimum availability.) or progressing ProgressDeadlineExceeded (ReplicaSet \"cluster-version-operator-976c6c48f\" has timed out progressing.)" logger="UnhandledError"
E0224 16:31:41.645023  166921 sync_worker.go:686] "Unhandled Error" err="unable to synchronize image (waiting 24.512459908s): deployment openshift-cluster-version/cluster-version-operator is not available MinimumReplicasUnavailable (Deployment does not have minimum availability.) or progressing ProgressDeadlineExceeded (ReplicaSet \"cluster-version-operator-976c6c48f\" has timed out progressing.)" logger="UnhandledError"
E0224 16:32:33.051370  166921 task.go:122] "Unhandled Error" err="error running apply for configmap \"openshift-cluster-machine-approver/kube-rbac-proxy\" (417 of 928): Get \"https://api.ci-ln-ifnpj1b-76ef8.aws-2.ci.openshift.org:6443/api/v1/namespaces/openshift-cluster-machine-approver/configmaps/kube-rbac-proxy\": context canceled" logger="UnhandledError"
E0224 16:32:33.051412  166921 task.go:122] "Unhandled Error" err="error running apply for clusterrole \"cluster-node-tuning-operator\" (466 of 928): Get \"https://api.ci-ln-ifnpj1b-76ef8.aws-2.ci.openshift.org:6443/apis/rbac.authorization.k8s.io/v1/clusterroles/cluster-node-tuning-operator\": context canceled" logger="UnhandledError"

Most of the errors are expected. The "does not declare an Established status condition" errors disappear after a while (for more info, see link; they are not caused by this PR). Some of the errors could be fixed as part of this PR. Errors such as "the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" also disappear after a while, once the manifests are applied. However, instead of logging errors, we could check and wait for the expected ClusterVersionOperator resource to be created (for example, by using a discovery client). Feedback and opinions are welcome.

Otherwise, the controller is running as expected. Syncs are run every resyncPeriod or on manual resource updates.

$ grep -iE "(clusterversionoperator|configuration.go)" tmp-after-featureset.txt
I0224 16:25:05.410931  166921 start.go:20] ClusterVersionOperator v1.0.0-1331-ga002d379
I0224 16:25:06.559967  166921 payload.go:208] excluding Filename: "0000_00_cluster-version-operator_01_clusterversionoperators-CustomNoUpgrade.crd.yaml" Group: "apiextensions.k8s.io" Kind: "CustomResourceDefinition" Name: "clusterversionoperators.operator.openshift.io": "DevPreviewNoUpgrade" is required, and release.openshift.io/feature-set=CustomNoUpgrade
I0224 16:25:07.180586  166921 cvo.go:431] Starting ClusterVersionOperator with minimum reconcile period 3m16.099679264s
I0224 16:25:07.180812  166921 reflector.go:313] Starting reflector *v1alpha1.ClusterVersionOperator (2m0s) from github.com/openshift/client-go/operator/informers/externalversions/factory.go:125
I0224 16:25:07.180989  166921 reflector.go:349] Listing and watching *v1alpha1.ClusterVersionOperator from github.com/openshift/client-go/operator/informers/externalversions/factory.go:125
I0224 16:25:07.183832  166921 payload.go:208] excluding Filename: "0000_00_cluster-version-operator_01_clusterversionoperators-CustomNoUpgrade.crd.yaml" Group: "apiextensions.k8s.io" Kind: "CustomResourceDefinition" Name: "clusterversionoperators.operator.openshift.io": "DevPreviewNoUpgrade" is required, and release.openshift.io/feature-set=CustomNoUpgrade
W0224 16:25:07.387845  166921 reflector.go:569] github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)
E0224 16:25:07.387929  166921 reflector.go:166] "Unhandled Error" err="github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: Failed to watch *v1alpha1.ClusterVersionOperator: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" logger="UnhandledError"
I0224 16:25:08.493618  166921 reflector.go:349] Listing and watching *v1alpha1.ClusterVersionOperator from github.com/openshift/client-go/operator/informers/externalversions/factory.go:125
W0224 16:25:08.612624  166921 reflector.go:569] github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)
E0224 16:25:08.612690  166921 reflector.go:166] "Unhandled Error" err="github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: Failed to watch *v1alpha1.ClusterVersionOperator: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" logger="UnhandledError"
I0224 16:25:10.985044  166921 reflector.go:349] Listing and watching *v1alpha1.ClusterVersionOperator from github.com/openshift/client-go/operator/informers/externalversions/factory.go:125
W0224 16:25:11.104110  166921 reflector.go:569] github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)
E0224 16:25:11.104165  166921 reflector.go:166] "Unhandled Error" err="github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: Failed to watch *v1alpha1.ClusterVersionOperator: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" logger="UnhandledError"
I0224 16:25:14.783817  166921 reflector.go:349] Listing and watching *v1alpha1.ClusterVersionOperator from github.com/openshift/client-go/operator/informers/externalversions/factory.go:125
W0224 16:25:14.902202  166921 reflector.go:569] github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)
E0224 16:25:14.902256  166921 reflector.go:166] "Unhandled Error" err="github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: Failed to watch *v1alpha1.ClusterVersionOperator: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" logger="UnhandledError"
I0224 16:25:23.800231  166921 reflector.go:349] Listing and watching *v1alpha1.ClusterVersionOperator from github.com/openshift/client-go/operator/informers/externalversions/factory.go:125
W0224 16:25:23.919193  166921 reflector.go:569] github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)
E0224 16:25:23.919275  166921 reflector.go:166] "Unhandled Error" err="github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: Failed to watch *v1alpha1.ClusterVersionOperator: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" logger="UnhandledError"
I0224 16:25:40.210983  166921 reflector.go:349] Listing and watching *v1alpha1.ClusterVersionOperator from github.com/openshift/client-go/operator/informers/externalversions/factory.go:125
W0224 16:25:40.333258  166921 reflector.go:569] github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)
E0224 16:25:40.333321  166921 reflector.go:166] "Unhandled Error" err="github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: Failed to watch *v1alpha1.ClusterVersionOperator: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" logger="UnhandledError"
I0224 16:26:19.908557  166921 sync_worker.go:1036] Running sync for customresourcedefinition "clusterversionoperators.operator.openshift.io" (5 of 928)
I0224 16:26:20.029257  166921 apiext.go:19] CRD clusterversionoperators.operator.openshift.io not found, creating
E0224 16:26:20.157770  166921 task.go:122] "Unhandled Error" err="error running apply for customresourcedefinition \"clusterversionoperators.operator.openshift.io\" (5 of 928): CustomResourceDefinition clusterversionoperators.operator.openshift.io does not declare an Established status condition: []" logger="UnhandledError"
I0224 16:26:30.281986  166921 sync_worker.go:1051] Done syncing for customresourcedefinition "clusterversionoperators.operator.openshift.io" (5 of 928)
I0224 16:26:30.646520  166921 reflector.go:349] Listing and watching *v1alpha1.ClusterVersionOperator from github.com/openshift/client-go/operator/informers/externalversions/factory.go:125
I0224 16:26:30.807908  166921 reflector.go:376] Caches populated for *v1alpha1.ClusterVersionOperator from github.com/openshift/client-go/operator/informers/externalversions/factory.go:125
I0224 16:26:30.810584  166921 sync_worker.go:1036] Running sync for clusterversionoperator "cluster" (7 of 928)
I0224 16:26:32.279377  166921 generic.go:67] ClusterVersionOperator /cluster not found, creating
I0224 16:26:32.403578  166921 sync_worker.go:1051] Done syncing for clusterversionoperator "cluster" (7 of 928)
I0224 16:26:32.403644  166921 configuration.go:93] Started syncing CVO configuration "ClusterVersionOperator/cluster"
I0224 16:26:32.403680  166921 configuration.go:109] ClusterVersionOperator configuration has been synced
I0224 16:26:32.403691  166921 configuration.go:95] Finished syncing CVO configuration (51.155µs)
I0224 16:28:30.808617  166921 configuration.go:93] Started syncing CVO configuration "ClusterVersionOperator/cluster"
I0224 16:28:30.808644  166921 configuration.go:109] ClusterVersionOperator configuration has been synced
I0224 16:28:30.808654  166921 configuration.go:95] Finished syncing CVO configuration (45.365µs)
I0224 16:30:30.808981  166921 configuration.go:93] Started syncing CVO configuration "ClusterVersionOperator/cluster"
I0224 16:30:30.808993  166921 configuration.go:109] ClusterVersionOperator configuration has been synced
I0224 16:30:30.809000  166921 configuration.go:95] Finished syncing CVO configuration (22.642µs)
I0224 16:30:46.061720  166921 configuration.go:93] Started syncing CVO configuration "ClusterVersionOperator/cluster"
I0224 16:30:46.061742  166921 configuration.go:109] ClusterVersionOperator configuration has been synced
I0224 16:30:46.061750  166921 configuration.go:95] Finished syncing CVO configuration (36.529µs)
I0224 16:31:15.073419  166921 sync_worker.go:1036] Running sync for customresourcedefinition "clusterversionoperators.operator.openshift.io" (5 of 928)
I0224 16:31:15.191280  166921 sync_worker.go:1051] Done syncing for customresourcedefinition "clusterversionoperators.operator.openshift.io" (5 of 928)
I0224 16:31:15.314868  166921 sync_worker.go:1036] Running sync for clusterversionoperator "cluster" (7 of 928)
I0224 16:31:15.434243  166921 sync_worker.go:1051] Done syncing for clusterversionoperator "cluster" (7 of 928)
I0224 16:31:27.423058  166921 configuration.go:93] Started syncing CVO configuration "ClusterVersionOperator/cluster"
I0224 16:31:27.423094  166921 configuration.go:109] ClusterVersionOperator configuration has been synced
I0224 16:31:27.423101  166921 configuration.go:95] Finished syncing CVO configuration (47.689µs)
I0224 16:32:30.809925  166921 configuration.go:93] Started syncing CVO configuration "ClusterVersionOperator/cluster"
I0224 16:32:30.809939  166921 configuration.go:109] ClusterVersionOperator configuration has been synced
I0224 16:32:30.809946  166921 configuration.go:95] Finished syncing CVO configuration (25.197µs)
I0224 16:32:33.051360  166921 cvo.go:528] Collected cvo configuration goroutine.
I0224 16:32:33.051482  166921 reflector.go:319] Stopping reflector *v1alpha1.ClusterVersionOperator (2m0s) from github.com/openshift/client-go/operator/informers/externalversions/factory.go:125
I0224 16:32:33.285635  166921 cvo.go:557] Shutting down ClusterVersionOperator
I0224 16:32:33.404721  166921 start.go:25] Graceful shutdown complete for ClusterVersionOperator v1.0.0-1331-ga002d379.

@DavidHurta
Contributor Author

/retest-required

Make the CVO aware of a new feature gate called
ClusterVersionOperatorConfiguration, which was introduced in [1].

[1]: openshift/api#2044

The controller has an empty reconcile logic as of now.
The logic will be implemented later. The goal of this commit is to introduce
a new CVO controller, which can optionally (depending on the state of the
CVOConfiguration feature gate) create a new informer.

The purpose of the `Start` method is to make the creation of a ClusterVersionOperator informer
optional, and to make the creation possible later in the CVO logic by restarting the operator
informer factory. To decide whether to create the informer, feature gates must be known,
which happens normally later in the run when all the other informer factories are already
created and inaccessible to the main CVO controller.

Related enhancement: [1]

[1]: https://github.com/openshift/enhancements/blob/2890cccf20ebcb94fce901f7afb170ca680aa2d9/enhancements/update/cvo-log-level-api.md

HyperShift will not create the ClusterVersionOperator CRD and CR.
Thus, the relevant logic does not have to run there. Later, in HyperShift,
the CVO will be configured via a configuration file instead.

@DavidHurta DavidHurta force-pushed the cvo-configuration-controller branch from a002d37 to d09c388 Compare February 26, 2025 00:00
@DavidHurta
Contributor Author

/test e2e-agnostic-usc-devpreview

@DavidHurta
Contributor Author

/test e2e-agnostic-ovn

Comment on lines +89 to +91
if !config.started {
	panic("ClusterVersionOperatorConfiguration instance was not properly started before its synchronization.")
}
Member

So we recently had a case in USC where we used https://pkg.go.dev/sync#Once for a similar kind of initialization - first call to sync performs an initialization through sync.Once and then you do not need to do a check like this. Could we use a similar pattern here?
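
For readers unfamiliar with the pattern, here is a minimal sketch of `sync.Once`-based lazy initialization. The type and method names (`configController`, `initialize`, `syncConfig`) are hypothetical and are not the names used in this PR or in USC; the point is only that the first sync performs the setup, so no `started` flag or panic is needed.

```go
package main

import (
	"fmt"
	"sync"
)

// configController is a hypothetical stand-in for the controller in this PR.
type configController struct {
	initOnce  sync.Once
	initCount int // counts how often initialize ran, for illustration only
}

// initialize holds the one-time setup that would otherwise live in Start.
func (c *configController) initialize() {
	c.initCount++
}

// syncConfig performs the setup on its first call and skips it afterwards.
func (c *configController) syncConfig(key string) string {
	c.initOnce.Do(c.initialize)
	return fmt.Sprintf("synced %s (initialized %d time(s))", key, c.initCount)
}

func main() {
	c := &configController{}
	fmt.Println(c.syncConfig("ClusterVersionOperator/cluster"))
	fmt.Println(c.syncConfig("ClusterVersionOperator/cluster"))
}
```

One trade-off to keep in mind: `Once.Do` does not retry, so if the setup can fail, the failure has to be recorded and checked by later syncs.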

Contributor Author

Thanks, I'll have a look.

@@ -220,6 +229,7 @@ func (o *Options) run(ctx context.Context, controllerCtx *Context, lock resource
controllerCtx.OpenshiftConfigInformerFactory.Start(informersDone)
controllerCtx.OpenshiftConfigManagedInformerFactory.Start(informersDone)
controllerCtx.InformerFactory.Start(informersDone)
controllerCtx.OperatorInformerFactory.Start(informersDone)
Member

Will this work on cluster without the CRD? 🤔

Contributor Author

@DavidHurta DavidHurta Feb 26, 2025

Start does not block and does not return anything. So the operation should be a no-op in such a cluster (as the code does not create any informers in such clusters).

If informers are created but no CRD for the informer is present on the cluster:

  • Calling WaitForCacheSync on the factory will result in a permanently blocking operation as the caches are not able to sync.
  • Informers will start logging errors such as (see my previous comment)
E0224 16:25:40.333321  166921 reflector.go:166] "Unhandled Error" err="github.com/openshift/client-go/operator/informers/externalversions/factory.go:125: Failed to watch *v1alpha1.ClusterVersionOperator: failed to list *v1alpha1.ClusterVersionOperator: the server could not find the requested resource (get clusterversionoperators.operator.openshift.io)" logger="UnhandledError"

It is a good question, though, as I am not sure how the code behaves when the operator API group itself does not exist yet (during an install).
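
To illustrate why calling Start is harmless when no informer was requested, and why it can be called again later, here is a toy model. It is a simplified, hypothetical stand-in (`toyFactory`) and not the client-go implementation; in particular, the real Start returns nothing, and the returned count below exists only to make the behavior visible.

```go
package main

import (
	"fmt"
	"sync"
)

// toyFactory is a simplified, hypothetical model of a shared informer factory.
type toyFactory struct {
	mu        sync.Mutex
	informers map[string]func(stop <-chan struct{})
	started   map[string]bool
}

func newToyFactory() *toyFactory {
	return &toyFactory{
		informers: map[string]func(stop <-chan struct{}){},
		started:   map[string]bool{},
	}
}

// Register records an informer; it does not run until Start is called.
func (f *toyFactory) Register(name string, run func(stop <-chan struct{})) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.informers[name] = run
}

// Start launches every registered informer that is not running yet and
// returns immediately. The returned count is for illustration only.
func (f *toyFactory) Start(stop <-chan struct{}) int {
	f.mu.Lock()
	defer f.mu.Unlock()
	launched := 0
	for name, run := range f.informers {
		if f.started[name] {
			continue
		}
		f.started[name] = true
		launched++
		go run(stop)
	}
	return launched
}

func main() {
	stop := make(chan struct{})
	defer close(stop)

	f := newToyFactory()
	fmt.Println("no informers registered:", f.Start(stop))

	// Later, once the feature gate is known to be enabled, register and start again.
	f.Register("ClusterVersionOperator", func(stop <-chan struct{}) { <-stop })
	fmt.Println("after registering:", f.Start(stop))
	fmt.Println("repeated Start:", f.Start(stop))
}
```

In this model the first Start launches nothing, the second launches the newly registered informer, and a third is again a no-op, which matches the "create the informer later and restart the factory" approach described in the commit message.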

Contributor

openshift-ci bot commented Feb 26, 2025

@DavidHurta: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-agnostic-ovn | d09c388 | link | true | /test e2e-agnostic-ovn |
| ci/prow/okd-scos-e2e-aws-ovn | d09c388 | link | false | /test okd-scos-e2e-aws-ovn |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

3 participants