-
Notifications
You must be signed in to change notification settings - Fork 72
OCPBUGS-58259: Reduce Frequency of Update Requests for Copied CSVs (#3597) [release-4.19] #1030
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-58259: Reduce Frequency of Update Requests for Copied CSVs (#3597) [release-4.19] #1030
Conversation
…(#3597) * (bugfix): reduce frequency of update requests for CSVs by adding annotations to copied CSVs that are populated with hashes of the non-status fields and the status fields. This seems to be how this was intended to work, but was not actually working this way because the annotations never actually existed on the copied CSV. This resulted in a hot loop of update requests being made on all copied CSVs. Signed-off-by: everettraven <[email protected]> * update unit tests Signed-off-by: everettraven <[email protected]> * updates to test so far Signed-off-by: everettraven <[email protected]> * Small changes Signed-off-by: Brett Tofel <[email protected]> * Add metadata drift guard to copyToNamespace Since we switched to a PartialObjectMetadata cache to save memory, we lost visibility into copied CSV spec and status fields, and the reintroduced nonStatusCopyHash/statusCopyHash annotations only partially solved the problem. Manual edits to a copied CSV could still go undetected, causing drift without reconciliation. This commit adds two new annotations: olm.operatorframework.io/observedGeneration and olm.operatorframework.io/observedResourceVersion. It implements a mechanism to guard agains metadata drift at the top of the existing-copy path in copyToNamespace. If a stored observedGeneration or observedResourceVersion no longer matches the live object, the operator now: • Updates the spec and hash annotations • Updates the status subresource • Records the new generation and resourceVersion in the guard annotations Because the guard only fires when its annotations are already present, all existing unit tests pass unchanged. We preserve the memory benefits of the metadata‐only informer, avoid extra GETs, and eliminate unnecessary API churn. Future work may explore a WithTransform informer to regain full object visibility with minimal memory impact. Signed-off-by: Brett Tofel <[email protected]> * Tests for metadata guard Verifies that exactly three updates (spec, status, guard) are issued when the observedGeneration doesn’t match. Signed-off-by: Brett Tofel <[email protected]> * Persist observed annotations on all status updates Signed-off-by: Brett Tofel <[email protected]> * GCI the file Signed-off-by: Brett Tofel <[email protected]> * Use TransformFunc Unit tests not updated Signed-off-by: Todd Short <[email protected]> * Update operatorgroup tests to compile Signed-off-by: Todd Short <[email protected]> * Restore operatorgroup_test from master Remove metadatalister Signed-off-by: Todd Short <[email protected]> * Remove more PartialObjectMetadata Signed-off-by: Todd Short <[email protected]> * Remove hashes from operator_test Signed-off-by: Todd Short <[email protected]> * Fix error messages for static-analysis Signed-off-by: Todd Short <[email protected]> * Update test annotations and test client Signed-off-by: Todd Short <[email protected]> * Rename pruning to listerwatcher Signed-off-by: Todd Short <[email protected]> * Set resync to 6h Signed-off-by: Todd Short <[email protected]> * Add CSV copy revert syncer Signed-off-by: Todd Short <[email protected]> * Log tweaks Signed-off-by: Todd Short <[email protected]> * Consolidate revert and gc syncers Signed-off-by: Todd Short <[email protected]> * Add logging and reduce the amount of metadata in the TransformFunc Signed-off-by: Todd Short <[email protected]> * Handle the copy CSV revert via a requeue of the primary CSV Signed-off-by: Todd Short <[email protected]> * Revert "Set resync to 6h" This reverts commit 855f940a2199bd4071c51f14ef44728550bf13cf. Signed-off-by: Todd Short <[email protected]> * Add delete handler for copied csv Signed-off-by: Todd Short <[email protected]> * Revert whitespace change Signed-off-by: Todd Short <[email protected]> * Rename function, fix comment Signed-off-by: Todd Short <[email protected]> --------- Signed-off-by: everettraven <[email protected]> Signed-off-by: Brett Tofel <[email protected]> Signed-off-by: Todd Short <[email protected]> Co-authored-by: everettraven <[email protected]> Co-authored-by: Brett Tofel <[email protected]> Upstream-repository: operator-lifecycle-manager Upstream-commit: d055f28750cf62f966f566d36990fff5285c7a71 (cherry picked from commit bc111a9)
@tmshort: This pull request references Jira Issue OCPBUGS-58259, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/jira refresh |
@tmshort: This pull request references Jira Issue OCPBUGS-58259, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/jira refresh |
@tmshort: This pull request references Jira Issue OCPBUGS-58259, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: oceanc80, tmshort The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/label backport-risk-assessed |
/hold thanks |
pre-verification pass |
/label qe-approved |
@tmshort: This pull request references Jira Issue OCPBUGS-58259, which is valid. 7 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/test e2e-gcp-olm |
/retest-required |
1 similar comment
/retest-required |
/retest-required
However, this PR doesn't do anything with CatSrc's... |
/test e2e-gcp-olm |
/retest-required |
/test e2e-gcp-olm |
This keeps failing, and seems unrelated....
|
@tmshort: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
0e4e456
into
openshift:release-4.19
@tmshort: Jira Issue OCPBUGS-58259: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-58259 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/cherry-pick release-4.18 |
@tmshort: new pull request created: #1034 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
[ART PR BUILD NOTIFIER] Distgit: operator-registry |
[ART PR BUILD NOTIFIER] Distgit: ose-operator-framework-tools |
[ART PR BUILD NOTIFIER] Distgit: operator-lifecycle-manager |
Fix included in accepted release 4.19.0-0.nightly-2025-07-09-044836 |
Use a TransformFunc to generate hashes of the copied CSVs to reduced storage needs, and to minimize API access.
The following is the complete unsquashed commit listing
by adding annotations to copied CSVs that are populated with hashes of the non-status fields and the status fields.
This seems to be how this was intended to work, but was not actually working this way because the annotations never actually existed on the copied CSV. This resulted in a hot loop of update requests being made on all copied CSVs.
update unit tests
updates to test so far
Small changes
Add metadata drift guard to copyToNamespace
Since we switched to a PartialObjectMetadata cache to save memory, we lost visibility into copied CSV spec and status fields, and the reintroduced nonStatusCopyHash/statusCopyHash annotations only partially solved the problem. Manual edits to a copied CSV could still go undetected, causing drift without reconciliation.
This commit adds two new annotations: olm.operatorframework.io/observedGeneration and olm.operatorframework.io/observedResourceVersion. It implements a mechanism to guard agains metadata drift at the top of the existing-copy path in copyToNamespace. If a stored observedGeneration or observedResourceVersion no longer matches the live object, the operator now:
Because the guard only fires when its annotations are already present, all existing unit tests pass unchanged. We preserve the memory benefits of the metadata‐only informer, avoid extra GETs, and eliminate unnecessary API churn.
Future work may explore a WithTransform informer to regain full object visibility with minimal memory impact.
Verifies that exactly three updates (spec, status, guard) are issued when the observedGeneration doesn’t match.
Persist observed annotations on all status updates
GCI the file
Use TransformFunc
Unit tests not updated
Update operatorgroup tests to compile
Restore operatorgroup_test from master
Remove metadatalister
Remove more PartialObjectMetadata
Remove hashes from operator_test
Fix error messages for static-analysis
Update test annotations and test client
Rename pruning to listerwatcher
Set resync to 6h
Add CSV copy revert syncer
Log tweaks
Consolidate revert and gc syncers
Add logging and reduce the amount of metadata in the TransformFunc
Handle the copy CSV revert via a requeue of the primary CSV
Revert "Set resync to 6h"
This reverts commit 855f940a2199bd4071c51f14ef44728550bf13cf.
Add delete handler for copied csv
Revert whitespace change
Rename function, fix comment
Upstream-repository: operator-lifecycle-manager
Upstream-commit: d055f28750cf62f966f566d36990fff5285c7a71 (cherry picked from commit bc111a9)