@perdasilva perdasilva commented Dec 2, 2025

OLMv1 ClusterExtensionRevision API Review

⚠️ DO NOT MERGE
This PR is only to facilitate API review.

ClusterExtensionRevision API

ClusterExtensionRevision objects are created by the ClusterExtension controller for each install, upgrade, or reconfiguration operation, and are owned by a ClusterExtension. A ClusterExtensionRevision carries the objects of the revision (i.e. the resources of a package at a given version plus any user-supplied configuration). We don't expect users to create ClusterExtensionRevision objects. We only expect users to interact with this API when something goes wrong or, in the future, in approval flows (i.e. a revision is kept from rolling out until there is some manual intervention by a user).

Lifecycle

  • Active: actively reconciling
  • Archived: inert (only for historical / auditing purposes)

Revision Rollout Strategy

  1. The objects are grouped into phases.
  2. The objects of each phase are applied together and in no particular order.
  3. Rollout only progresses to the next phase once all objects of the current phase pass their probes.
  4. Probes are attached to kinds and currently come in two flavours:
    1. status condition probes: e.g. Available=True
    2. field equality: e.g. .status.replicas == .status.updatedReplicas
  5. The rollout is complete once all phases are complete
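
To make the phase-by-phase behaviour concrete, here is a minimal Go sketch of the strategy described above. It is an illustration under stated assumptions, not the controller's implementation: `Object`, `Probe`, `Phase`, `Rollout`, and the `apply` callback are hypothetical names, and status is flattened into a string map for brevity. A real controller would most likely requeue rather than return an error when a probe is not yet passing.

```go
package main

import (
	"errors"
	"fmt"
)

// Object is a hypothetical, flattened stand-in for a resource in a revision.
type Object struct {
	Kind, Name string
	Status     map[string]string // conditions and status fields, flattened
}

// Probe reports whether an object is considered ready.
type Probe func(o Object) bool

// ConditionProbe passes when a status condition has the wanted value,
// e.g. Available=True.
func ConditionProbe(condition, want string) Probe {
	return func(o Object) bool { return o.Status[condition] == want }
}

// FieldsEqualProbe passes when two status fields are both set and equal,
// e.g. .status.replicas == .status.updatedReplicas.
func FieldsEqualProbe(a, b string) Probe {
	return func(o Object) bool {
		va, okA := o.Status[a]
		vb, okB := o.Status[b]
		return okA && okB && va == vb
	}
}

// Phase groups objects that are applied together, in no particular order.
type Phase struct {
	Name    string
	Objects []Object
}

// ErrPhaseNotReady signals that a probe in the current phase did not pass.
var ErrPhaseNotReady = errors.New("phase not ready")

// Rollout applies phases in order and stops at the first phase whose probes
// do not all pass. It returns the number of completed phases.
func Rollout(phases []Phase, probes map[string][]Probe, apply func(Object) Object) (int, error) {
	for i, ph := range phases {
		applied := make([]Object, 0, len(ph.Objects))
		for _, o := range ph.Objects {
			applied = append(applied, apply(o))
		}
		for _, o := range applied {
			for _, p := range probes[o.Kind] { // probes are attached to kinds
				if !p(o) {
					return i, fmt.Errorf("%w: %s/%s in phase %q", ErrPhaseNotReady, o.Kind, o.Name, ph.Name)
				}
			}
		}
	}
	return len(phases), nil
}

func main() {
	probes := map[string][]Probe{
		"CustomResourceDefinition": {ConditionProbe("Established", "True")},
		"Deployment": {
			ConditionProbe("Available", "True"),
			FieldsEqualProbe("replicas", "updatedReplicas"),
		},
	}
	phases := []Phase{
		{Name: "crds", Objects: []Object{{Kind: "CustomResourceDefinition", Name: "widgets.example.com"}}},
		{Name: "deploy", Objects: []Object{{Kind: "Deployment", Name: "widget-controller"}}},
	}
	// Fake apply: the CRD is established, the Deployment is still rolling out.
	apply := func(o Object) Object {
		switch o.Kind {
		case "CustomResourceDefinition":
			o.Status = map[string]string{"Established": "True"}
		case "Deployment":
			o.Status = map[string]string{"Available": "True", "replicas": "3", "updatedReplicas": "2"}
		}
		return o
	}
	done, err := Rollout(phases, probes, apply)
	fmt.Println(done, err)
}
```

In this example the first phase completes and the rollout stalls in the second phase, because the field equality probe on the Deployment does not pass yet.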

Object ownership is transferred from previous to subsequent revisions.

Once a revision completes, it sets all previous revisions to Archived. Once Archived, a revision cleans up any objects it still manages.
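
The archival step can be sketched roughly as follows. This is a minimal Go illustration, assuming a hypothetical `Revision` type with a monotonically increasing `Number`; neither the type nor the helper names come from the API under review. It also encodes the rule that an Archived revision cannot be un-archived.

```go
package main

import "fmt"

// LifecycleState mirrors the two lifecycle states described above.
type LifecycleState string

const (
	Active   LifecycleState = "Active"
	Archived LifecycleState = "Archived"
)

// Revision is a hypothetical, trimmed-down view of a ClusterExtensionRevision.
// Number is an assumed, monotonically increasing revision number.
type Revision struct {
	Number int
	State  LifecycleState
}

// ValidateTransition rejects any attempt to leave the Archived state.
func ValidateTransition(from, to LifecycleState) error {
	if from == Archived && to != Archived {
		return fmt.Errorf("cannot transition from %s to %s: archived revisions stay archived", from, to)
	}
	return nil
}

// ArchivePrevious marks every revision older than the completed one as
// Archived and returns the numbers of the revisions it changed.
func ArchivePrevious(revs []Revision, completed int) []int {
	var changed []int
	for i := range revs {
		if revs[i].Number < completed && revs[i].State != Archived {
			revs[i].State = Archived
			changed = append(changed, revs[i].Number)
		}
	}
	return changed
}

func main() {
	revs := []Revision{
		{Number: 1, State: Archived},
		{Number: 2, State: Active},
		{Number: 3, State: Active},
	}
	fmt.Println(ArchivePrevious(revs, 3))
	fmt.Println(ValidateTransition(Archived, Active))
}
```

Object cleanup by the archived revision is deliberately left out of the sketch.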

Collision Control

Phase objects can be configured to error on existing objects, adopt orphaned resources, or force adopt resources.
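
As a rough illustration of the three behaviours, the following Go sketch maps them onto the `Prevent`, `IfNoController`, and `None` values of the `collisionProtection` enum. The mapping (error on existing, adopt orphans, force adopt) is my reading of the description rather than a confirmed contract, and `Existing`, `CheckCollision`, and the UID handling are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// CollisionProtection mirrors the enum values of the collisionProtection field.
type CollisionProtection string

const (
	Prevent        CollisionProtection = "Prevent"        // assumed: error on existing objects
	IfNoController CollisionProtection = "IfNoController" // assumed: adopt only orphaned objects
	None           CollisionProtection = "None"           // assumed: force adopt
)

// Existing is a hypothetical summary of what was found on the cluster.
type Existing struct {
	Found         bool
	ControllerUID string // empty when the object has no controller owner
}

// ErrCollision is returned when the object must not be adopted.
var ErrCollision = errors.New("object collision")

// CheckCollision decides whether an owner (identified by ownerUID) may apply
// an object given what already exists on the cluster.
func CheckCollision(mode CollisionProtection, existing Existing, ownerUID string) error {
	if !existing.Found {
		return nil // nothing to collide with
	}
	if ownerUID != "" && existing.ControllerUID == ownerUID {
		return nil // already ours
	}
	switch mode {
	case None:
		return nil
	case IfNoController:
		if existing.ControllerUID == "" {
			return nil
		}
		return fmt.Errorf("%w: object is controlled by %q", ErrCollision, existing.ControllerUID)
	case Prevent:
		return fmt.Errorf("%w: object already exists", ErrCollision)
	default:
		return fmt.Errorf("unknown collisionProtection %q", mode)
	}
}

func main() {
	orphan := Existing{Found: true}
	owned := Existing{Found: true, ControllerUID: "other-uid"}
	for _, mode := range []CollisionProtection{Prevent, IfNoController, None} {
		fmt.Println(mode, CheckCollision(mode, orphan, "my-uid"), CheckCollision(mode, owned, "my-uid"))
	}
}
```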

Bundle Rollout Strategy Definition

registry+v1 bundles will have their rollout strategy defined by OLM in both phases and probes. If we directly support the Helm format, OLM will likely also define the rollout strategy.

Currently the registry+v1 bundle strategy is defined by sorting the bundle manifests into different phases: namespaces, policies, rbac, crds, etc. with default probes, e.g. CRD has Established=True condition, Deployment has Available=True, etc.
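
A minimal Go sketch of this kind of sorting is shown below. The phase names come from the list above, but the exact phase order, the kind-to-phase mapping, the `deploy` catch-all phase, and all identifiers are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"sort"
)

// Manifest is a hypothetical, minimal view of a bundle manifest.
type Manifest struct{ Kind, Name string }

// Phase is an ordered group of manifests.
type Phase struct {
	Name      string
	Manifests []Manifest
}

// phaseOrder is an assumed ordering; "deploy" is a hypothetical catch-all.
var phaseOrder = []string{"namespaces", "policies", "rbac", "crds", "deploy"}

// kindToPhase is an assumed mapping from kind to phase.
var kindToPhase = map[string]string{
	"Namespace":                "namespaces",
	"NetworkPolicy":            "policies",
	"ServiceAccount":           "rbac",
	"Role":                     "rbac",
	"RoleBinding":              "rbac",
	"ClusterRole":              "rbac",
	"ClusterRoleBinding":       "rbac",
	"CustomResourceDefinition": "crds",
}

// defaultProbes lists the default condition probes named in the description.
var defaultProbes = map[string]string{
	"CustomResourceDefinition": "Established=True",
	"Deployment":               "Available=True",
}

// GroupIntoPhases sorts manifests into phases and returns the non-empty
// phases in rollout order. Manifests within a phase are sorted only to keep
// the output deterministic; their order carries no meaning.
func GroupIntoPhases(manifests []Manifest) []Phase {
	byPhase := map[string][]Manifest{}
	for _, m := range manifests {
		name, ok := kindToPhase[m.Kind]
		if !ok {
			name = "deploy"
		}
		byPhase[name] = append(byPhase[name], m)
	}
	var phases []Phase
	for _, name := range phaseOrder {
		ms := byPhase[name]
		if len(ms) == 0 {
			continue
		}
		sort.Slice(ms, func(i, j int) bool {
			if ms[i].Kind != ms[j].Kind {
				return ms[i].Kind < ms[j].Kind
			}
			return ms[i].Name < ms[j].Name
		})
		phases = append(phases, Phase{Name: name, Manifests: ms})
	}
	return phases
}

func main() {
	phases := GroupIntoPhases([]Manifest{
		{Kind: "Deployment", Name: "controller"},
		{Kind: "CustomResourceDefinition", Name: "widgets.example.com"},
		{Kind: "ClusterRole", Name: "controller"},
		{Kind: "Namespace", Name: "widgets"},
	})
	for _, p := range phases {
		for _, m := range p.Manifests {
			fmt.Printf("%s: %s/%s probe=%q\n", p.Name, m.Kind, m.Name, defaultProbes[m.Kind])
		}
	}
}
```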

Known Upcoming Changes

  • Removal of the Paused lifecycle state from docs and code
  • Removal of the "Migrated" reason on the "Available" condition
  • Moving collisionProtection out of the object and up to the top level

Known Limitations

The objects are currently defined inline in the CRD. In the (not so distant) future we will shard the resources across some kind of container (e.g. ConfigMap or Secret).
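
To give a feel for what sharding could look like, here is a purely hypothetical Go sketch that greedily packs serialized objects into size-bounded shards. The container type, the size limit, and the packing strategy are all undecided; `ShardObjects` and its parameters are illustrative names, not a proposal.

```go
package main

import "fmt"

// ShardObjects greedily packs serialized objects into shards whose total
// payload does not exceed maxBytes. Input order is preserved. An object
// larger than maxBytes cannot be placed and results in an error.
func ShardObjects(objects [][]byte, maxBytes int) ([][][]byte, error) {
	var shards [][][]byte
	var current [][]byte
	size := 0
	for i, obj := range objects {
		if len(obj) > maxBytes {
			return nil, fmt.Errorf("object %d is %d bytes, larger than the %d byte shard limit", i, len(obj), maxBytes)
		}
		// Start a new shard when the current one cannot take this object.
		if size+len(obj) > maxBytes && len(current) > 0 {
			shards = append(shards, current)
			current, size = nil, 0
		}
		current = append(current, obj)
		size += len(obj)
	}
	if len(current) > 0 {
		shards = append(shards, current)
	}
	return shards, nil
}

func main() {
	objects := [][]byte{
		[]byte(`{"kind":"Namespace"}`),
		[]byte(`{"kind":"ServiceAccount"}`),
		[]byte(`{"kind":"Deployment"}`),
	}
	shards, err := ShardObjects(objects, 48)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, s := range shards {
		fmt.Printf("shard %d holds %d object(s)\n", i, len(s))
	}
}
```

Each shard would then be written to its own container object (e.g. a ConfigMap or Secret) and referenced from the revision.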

Future Work (beyond GA)

  • Revision approval
  • Rollback strategies
  • Paused lifecycle
  • Bundle author provided rollout definition

Open Questions

  • Can we release this API as TechPreview without resource sharding while calling out the known limitation? This would require subsequent (possibly breaking) API changes.
  • Should the rollout definition be exposed through the API, or should it be a ClusterExtensionRevision controller concern that is opaque to users? Put differently, should we add the probe definitions to the API or, rather than expose phases, just have a "bag of objects"?

@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

coderabbitai bot commented Dec 2, 2025

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Excluded labels (none allowed) (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 2, 2025
Contributor

openshift-ci bot commented Dec 2, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Contributor

openshift-ci bot commented Dec 2, 2025

Hello @perdasilva! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@openshift-ci openshift-ci bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Dec 2, 2025
Contributor

openshift-ci bot commented Dec 2, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign joelspeed for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@everettraven
Contributor

Following up here - this is in my queue, but I'm working through reading the associated RFE first to ensure I've got full context before reviewing.

I should have an initial review within the next day or so.

Contributor

@everettraven everettraven left a comment

First pass, a handful of comments.

It might be helpful to run the latest version of the linter against this API to catch some of the smaller things our linter is good at catching :).

//
// Once a revision is set to "Archived", it cannot be un-archived.
//
// +kubebuilder:default="Active"
Contributor

Should this be a required input instead of defaulting to Active?

What does it mean for multiple ClusterExtensionRevision objects for the same ClusterExtension to be Active?

Comment on lines +99 to +101
// ClusterExtensionRevisionLifecycleStatePaused / "Paused" disables reconciliation of the ClusterExtensionRevision.
// Object changes will not be reconciled. However, status updates will be propagated.
ClusterExtensionRevisionLifecycleStatePaused ClusterExtensionRevisionLifecycleState = "Paused"
Contributor

Not used in the API as an allowed value?

// [RFC 1123]: https://tools.ietf.org/html/rfc1123
//
// +kubebuilder:validation:MaxLength=63
// +kubebuilder:validation:Pattern=`^[a-z]([-a-z0-9]*[a-z0-9])?$`
Contributor

We discourage the usage of the pattern marker because of the lack of helpful error messages that are returned to end-users (a regex string instead of a sentence explaining the constraints).

Instead, use a CEL expression that enforces this constraint and returns a helpful message like:

// +kubebuilder:validation:XValidation:rule=`!format.dns1123Label().validate(self).hasValue()`,message="the value must consist of only lowercase alphanumeric characters and hyphens, and must start with an alphabetic character and end with an alphanumeric character."
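
To illustrate the difference for end users, here is a minimal Go sketch that applies the same constraints as the markers in the diff (maximum length 63 and the pattern shown) but reports failures as sentences rather than as a regex. `ValidateName` is a hypothetical helper for illustration only; in the API itself the check would live in the CEL rule, not in Go code.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// namePattern is the pattern from the Pattern marker under review.
var namePattern = regexp.MustCompile(`^[a-z]([-a-z0-9]*[a-z0-9])?$`)

// maxNameLength is the limit from the MaxLength marker under review.
const maxNameLength = 63

// ValidateName returns a human-readable error instead of echoing the regex.
// Note that the empty string is rejected by the pattern as well.
func ValidateName(name string) error {
	if len(name) > maxNameLength {
		return fmt.Errorf("the value must be no more than %d characters long, got %d", maxNameLength, len(name))
	}
	if !namePattern.MatchString(name) {
		return errors.New("the value must consist of only lowercase alphanumeric characters and hyphens, " +
			"and must start with an alphabetic character and end with an alphanumeric character")
	}
	return nil
}

func main() {
	for _, name := range []string{"deploy", "Deploy", "crds-", ""} {
		fmt.Printf("%q: %v\n", name, ValidateName(name))
	}
}
```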

//
// +kubebuilder:validation:MaxLength=63
// +kubebuilder:validation:Pattern=`^[a-z]([-a-z0-9]*[a-z0-9])?$`
Name string `json:"name"`
Contributor

Explicitly mark the field as required.

Suggested change
- Name string `json:"name"`
+ // +required
+ Name string `json:"name"`

Contributor

Is the empty string a valid value?

Comment on lines +87 to +88
// +listType=map
// +listMapKey=name
Contributor

Document the uniqueness constraint keyed on the phase name in the GoDoc.


// collisionProtection controls whether the operator can adopt and modify objects
// that already exist on the cluster.
//
Contributor

Nit: We normally try to follow a pattern where we explicitly introduce the allowed values as well with a sentence like:

Allowed values are Prevent, IfNoController, ...

before going into the When ... sections.

//
// +kubebuilder:default="Prevent"
// +kubebuilder:validation:Enum=Prevent;IfNoController;None
// +optional
Contributor

For context, could you elaborate on why you want this field to be optional?

// Use this setting with extreme caution as it may cause multiple controllers to fight over
// the same resource, resulting in increased load on the API server and etcd.
//
// +kubebuilder:default="Prevent"
Contributor

Do you ever intend to modify the default behavior? Done this way, the defaulting logic becomes a part of your API contract.

//
// +kubebuilder:validation:EmbeddedResource
// +kubebuilder:pruning:PreserveUnknownFields
Object unstructured.Unstructured `json:"object"`
Contributor

I could be wrong here, but IIRC most embedded resource types will use https://pkg.go.dev/k8s.io/kubernetes/pkg/runtime#RawExtension as the type there.

I don't have a strong opinion here though as I've not reviewed many APIs that are embedding another resource blob within them. If this is working, 👍

- jsonPath: .metadata.creationTimestamp
  name: Age
  type: date
name: v1
Contributor

v1? Have you shipped this API already? Typically, APIs for experimental type features will start as v1alpha1. In OpenShift, it is expected that TP APIs are v1alphaN until ready to be promoted.
