-
Notifications
You must be signed in to change notification settings - Fork 574
[WIP] AGENT-1330: machineconfiguration/v1alpha1: add InternalReleaseImage #2510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[WIP] AGENT-1330: machineconfiguration/v1alpha1: add InternalReleaseImage #2510
Conversation
@andfasano: This pull request references AGENT-1330 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Hello @andfasano! Some important instructions when contributing to openshift/api: |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
9ea943a
to
9eca225
Compare
0d80867
to
7ea433e
Compare
/retest-required |
I'm currently experimenting with AI for API review, hopefully some of the content it is generating is helpful for you to improve your API The following code blocks were generated by Claude
I think the linting issues are actually not the response I'd like. Instead lets try and make the zero values not valid (required fields or For the comments, it has highlighted that you haven't explained what happens when the optional fields are omitted, though I'm not sure its suggestions are super helpful, please review and think about what you'd actually like to put in. If it has identified something where you can't think of a reason why it wouldn't be present, then maybe the field should actually be required Quizzing specifically about whether all validations were documented, some further output
|
7b60a33
to
ab6545c
Compare
/api-review |
cf83224
to
c2f669c
Compare
/test verify |
f330e2a
to
6b6941e
Compare
|
||
// MachineConfigNodeStatusInternalReleaseImage holds information about the current, desidered and discovered release bundles for the observed machine | ||
// config node. | ||
type MachineConfigNodeStatusInternalReleaseImage InternalReleaseImageStatus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Alpha MCN type is no longer deployed anywhere, even in techpreview. These probably should just be +openshift:enable:FeatureGate=NoRegistryClusterOperations
'ed in the v1 API for MCN.
(although I noticed that #2531 was opened. I assume we plan to do that there? I don't see the MCN fields yet)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, after a discussion with @pablintino I realized that the current one wasn't the right approach, for the same reasons you mentioned. So I opened a new PR just to try out a new approach, if it make more sense from a structural/type organization point of view (but the plan was to keep working here).
I think also this issue could be somehow simplified after the type split discussed before
// +kubebuilder:validation:MinItems=1 | ||
// +kubebuilder:validation:MaxItems=256 | ||
// +optional | ||
Conditions []metav1.Condition `json:"conditions,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to understand the conditions a bit better: since this will show up in both the MCN object and the overall IRI object, would the actual conditions look the same?
More concretely, the MCN object would reflect each IRI status on each node (since it would have a list of status's) but we'd only define a singular IRI object with a list of releases. If say, we had 3 releases and 5 nodes, we end up with (number of nodes) * (number of releases defined for those nodes) = 15 lists of conditions reflected in the MCN, but only a singular list of conditions here. So would this just be reflecting if any of those nodes are available/progressing/failing? I'm having a hard time thinking of the user experience when I read this field.
If that's the case, I think I would lean doing two things here:
- separate the InternalReleaseImageStatus type into two types in MCN and IRI object:
a) per guidance of the API team, it's generally better to not inter-mingle object references
b) this would allow us to define different conditions per subtype, like we have in e.g. https://github.com/openshift/api/blob/master/machineconfiguration/v1/types_machineosbuild.go#L81-L90 - concretely define what we want to see in this field. There's a few options as I understand it: some vague overall state, per-release state (in that case, should the conditions be nested in the
AvailableReleases
subfield, so we can see what each release's status is?) or have this be more akin to what we do for PinnedImageSet and not have status conditions at all, and make it purely an MCN only thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intention was to avoid duplicating the same code for representing both the singleton IRI status and each MCN sub-IRI statuses. From this point of view, the singleton IRI is an aggregator, since it monitors all the MCNs and reports a summarized view for the end user.
But if you think that it will more clear, especially from an API point of view, to have different structs, then it's fine for me.
Regarding point 2, I was planning to add a more detailed enum for the Conditions later (like the MCN StateProgress), once the high-level types organization will be consolidated - but if it helps I can start adding them right now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good one, I must agree with the hint that 15 conditions are, in general, unmanegable from the user point of view to the point of, would it be useful? Do we really need to report that fine-grained level of details in the conditions? I have never seen such of a report strategy in conditions, that are usually a bit higher level scoped. I do like the idea of re-using the availableReleases
field (or a new one if it doesn't fit) to report fine-grain details of the release and report a single, Ready
condition that the user can watch when he/she adds a new release and wants to wait for it to be available and propagated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if you think that it will more clear, especially from an API point of view, to have different structs, then it's fine for me.
While that's the general preference as I understand it, I think there's an argument here for either method, depending on how it would actually look.
When you say aggregator, what would I be expecting if, say, I have 2 releases, 1 installed for all nodes, and 1 failing for 2 nodes and waiting for pull for the other 3? Would i expect to see:
- 5 installed conditions, 2 failing conditions, 3 ready conditions (1 per status)
- 3 progressing conditions and 2 failing conditions (1 per node)
- 1 failing condition, 1 available condition, 1 ready condition (1 per type with a list of nodes in each state)
|
||
// InternalReleaseImageSpec defines the desired state of a InternalReleaseImage. | ||
type InternalReleaseImageSpec struct { | ||
// releases is a list of release bundle identifiers that the user wants to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(suggestion from claude): we should document the constraints (maximum # you can specify)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't sufficient to specify the +kubebuilder:validation:MaxItems
value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that's the validation, it's asking for the godoc to explicitly state for the user what the maximum is, something like This field can contain between x and y release bundles
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok
// +kubebuilder:validation:MinItems=1 | ||
// +kubebuilder:validation:MaxItems=256 | ||
// +optional | ||
Conditions []metav1.Condition `json:"conditions,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good one, I must agree with the hint that 15 conditions are, in general, unmanegable from the user point of view to the point of, would it be useful? Do we really need to report that fine-grained level of details in the conditions? I have never seen such of a report strategy in conditions, that are usually a bit higher level scoped. I do like the idea of re-using the availableReleases
field (or a new one if it doesn't fit) to report fine-grain details of the release and report a single, Ready
condition that the user can watch when he/she adds a new release and wants to wait for it to be available and propagated.
a9c080e
to
5bb2da8
Compare
/retest |
5bb2da8
to
ee91dd4
Compare
/retest |
@andfasano: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the structure makes more sense to me. A few questions/comments on the new MCN fields (and the previous question on the conditions)
// +optional | ||
IrreconcilableChanges []IrreconcilableChangeDiff `json:"irreconcilableChanges,omitempty"` | ||
// internalReleaseImage describes the status of the release payloads stored in the node. | ||
// +optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this field also needs the +openshift:enable:FeatureGate=NoRegistryClusterOperations
?
Also may be good to add a few more lines of description, something like:
When specified, an internalReleaseImage custom resource exists on the cluster, and the specified images will be made available on the control plane nodes. This field will reflect the actual on-disk state of those release images.
InternalReleaseImage *MachineConfigNodeStatusInternalReleaseImage `json:"internalReleaseImage,omitempty"` | ||
} | ||
|
||
// MachineConfigNodeStatusInternalReleaseImage holds information about the current, desidered and discovered release bundles for the observed machine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
desidered
-> desired
Also this reads to me like there would be three types: "current, desired and discovered". I think based on the actual fields, it's just "current" (installedReleases) and "discovered"(availableReleases)? The actual "desired" we have to specify via the IRI object?
Maybe it's worth it to have a third field? So the workflow would be:
- the user mounts the release image
- the user sees that currently the MCN object for that node has it under "discovered"
- the user updates the IRI object to add that release
- the MCD ack's that change to the IRI object, so now "discovered" and "desired" both have that field
- the MCD finishes setting up the registry and makes it available, thereby having it in "discovered" "desired" and "current"
I'm not sure if that's necessarily helpful. It's a bit weird since the inputs for "discovered" and "desired" and "current" are not all the same (a mounted disk, the user specified IRI object, the MCD's actions), so if we want to keep it simpler maybe we just have the two fields we have now and skip the "desired" part. WDYT @pablintino @JoelSpeed ?
// +listType=map | ||
// +listMapKey=type | ||
// +kubebuilder:validation:MinItems=1 | ||
// +kubebuilder:validation:MaxItems=256 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question for #2510 (comment) kinda applies here as well, but maybe we're ok with just having a few general status's for the node to indicate if anything is failing. In that case, I think 256 is a bit overkill. Maybe we can shorten this and then similar to https://github.com/openshift/api/blob/master/machineconfiguration/v1/types_machineosbuild.go#L81-L90 have a list of what we'd expect.
// +listType=map | ||
// +listMapKey=name | ||
// +kubebuilder:validation:MaxItems=5 | ||
// +optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just thinking out loud, maybe we should have this be required (if the IRI status exists) so at it would always reflect what the daemon is currently detecting? And if we don't detect anything we just have an empty list?
// +listMapKey=name | ||
// +kubebuilder:validation:MaxItems=5 | ||
// +optional | ||
AvailableReleases []InternalReleaseImageRef `json:"availableReleases,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have the split, what get's reflected here? If e.g. each control plane node doesn't currently have the same available (mounted) releases, would this reflect it as available if any control plane node is hosting that image? Or only when all control plane nodes have it mounted and reporting?
// +kubebuilder:validation:MinItems=1 | ||
// +kubebuilder:validation:MaxItems=5 | ||
// +optional | ||
InstalledReleases []InternalReleaseImageDetailedRef `json:"installedReleases,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as above
This patch adds the new
InternalReleaseImage
CRD. See openshift/enhancements#1821 for additional details