@sbueringer commented Oct 23, 2025

Signed-off-by: Stefan Büringer [email protected]

What this PR does / why we need it:

Today we only create BootstrapConfigs/InfraMachines; we never update them.
To update BootstrapConfigs/InfraMachines correctly (e.g. during in-place updates),
we have to use Server-Side Apply (SSA), e.g. for proper handling of co-ownership of fields.

Additionally we have to use two fieldManagers because we want to be able to:

  • sync labels & annotations continuously
  • only update the rest of BootstrapConfigs/InfraMachines when we trigger an in-place update

This would not be possible with one fieldManager, as the labels & annotations sync would unset all other fields.
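
To illustrate why a single fieldManager is not enough, here is a minimal sketch of the SSA ownership rule in plain Go. It is a toy model, not the API server's implementation: the `object` type, its `apply` method, and the field paths are hypothetical, each field has at most one owner, and conflicts are not modeled.

```go
package main

import (
	"fmt"
	"sort"
)

// object is a toy stand-in for a Kubernetes object: it maps a field path to
// its value and records which field manager owns each path.
type object struct {
	values map[string]string
	owners map[string]string
}

func newObject() *object {
	return &object{values: map[string]string{}, owners: map[string]string{}}
}

// apply mimics the core SSA rule: the manager takes ownership of every field
// in intent, and any field it owned before but omits now is removed.
// Conflicts and shared ownership are deliberately not modeled.
func (o *object) apply(manager string, intent map[string]string) {
	for path, owner := range o.owners {
		if _, stillWanted := intent[path]; owner == manager && !stillWanted {
			delete(o.values, path)
			delete(o.owners, path)
		}
	}
	for path, value := range intent {
		o.values[path] = value
		o.owners[path] = manager
	}
}

// paths returns the object's field paths in sorted order.
func (o *object) paths() []string {
	out := make([]string, 0, len(o.values))
	for path := range o.values {
		out = append(out, path)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical field paths, chosen only for illustration.
	const specField, labelField = "spec.instanceType", "metadata.labels.env"

	// One field manager: the metadata-only sync drops the spec field.
	single := newObject()
	single.apply("capi-machineset", map[string]string{specField: "large", labelField: "prod"})
	single.apply("capi-machineset", map[string]string{labelField: "prod"})
	fmt.Println("one manager: ", single.paths())

	// Two field managers: the metadata sync leaves the spec field alone.
	split := newObject()
	split.apply("capi-machineset", map[string]string{specField: "large"})
	split.apply("capi-machineset-metadata", map[string]string{labelField: "prod"})
	split.apply("capi-machineset-metadata", map[string]string{labelField: "prod"})
	fmt.Println("two managers:", split.paths())
}
```

In this model the first object ends up with only the label, while the second keeps both the label and the spec field, which is the behavior the two-fieldManager split is meant to guarantee.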

This PR tackles this problem by:

  • creating new BootstrapConfigs/InfraMachines with SSA
  • migrating BootstrapConfigs/InfraMachines created with CAPI <= v1.11 to the new managedField structure

Current managedField structure (CAPI <= v1.11)

  • Machine
    • spec+labels+annotations (capi-kubeadmcontrolplane/capi-machineset:Apply)
  • BootstrapConfig/InfraMachine
    • labels+annotations (capi-kubeadmcontrolplane/capi-machineset:Apply)
    • spec (manager:Update)

New managedField structure (CAPI >= v1.12)

  • Machine (same as current)
    • spec+labels+annotations (capi-kubeadmcontrolplane/capi-machineset:Apply)
  • BootstrapConfig/InfraMachine
    • labels+annotations (capi-kubeadmcontrolplane-metadata/capi-machineset-metadata:Apply)
    • spec (capi-kubeadmcontrolplane/capi-machineset:Apply)

Machines will behave exactly the same, so there are no changes to Machine creation and no migration is necessary.

Everything below describes the new behavior with CAPI v1.12.

BootstrapConfig/InfraMachine creation

  • Apply BootstrapConfig/InfraMachine (manager: capi-kubeadmcontrolplane/capi-machineset:Apply)
  • Remove managedFields for labels+annotations
  • Resulting managedFields:
    • labels+annotations (orphaned)
    • spec (capi-kubeadmcontrolplane/capi-machineset:Apply)

Directly afterward syncMachines will be called

  • Apply BootstrapConfig/InfraMachine labels+annotations (capi-kubeadmcontrolplane-metadata/capi-machineset-metadata:Apply)
  • Resulting managedFields:
    • labels+annotations (capi-kubeadmcontrolplane-metadata/capi-machineset-metadata:Apply)
    • spec (capi-kubeadmcontrolplane/capi-machineset:Apply)

After this, BootstrapConfigs/InfraMachines have the desired managedField structure and are ready for continuous
syncMachines calls to sync labels+annotations and also for triggering in-place updates.

When we trigger in-place updates the following happens

  • Apply BootstrapConfig/InfraMachine with spec + in-progress/cloned-from annotations (capi-kubeadmcontrolplane/capi-machineset:Apply)
  • Resulting managedFields:
    • labels+annotations (capi-kubeadmcontrolplane-metadata/capi-machineset-metadata:Apply)
    • spec (capi-kubeadmcontrolplane/capi-machineset:Apply)
    • in-progress/cloned-from annotations (capi-kubeadmcontrolplane/capi-machineset:Apply)

Migration from managedFields v1.11 => v1.12

The only missing piece is how we migrate objects created with CAPI <= v1.11 to the new managedField structure of CAPI >= v1.12.

  • Initial managedFields:
    • labels+annotations (capi-kubeadmcontrolplane/capi-machineset:Apply)
    • spec (manager:Update)
  • Migration logic:
    • Executed once if metadata.labels["cluster.x-k8s.io/cluster-name"] is owned by capi-kubeadmcontrolplane/capi-machineset:Apply
    • Delete all managedFields with: manager:Update, capi-kubeadmcontrolplane/capi-machineset:Apply (subresource == "")
  • Resulting managedFields:
    • labels+annotations (orphaned)
    • spec (orphaned)
  • managedFields after the next syncMachines:
    • labels+annotations (capi-kubeadmcontrolplane-metadata/capi-machineset-metadata:Apply)
    • spec (orphaned)
  • managedFields after next in-place update:
    • labels+annotations (capi-kubeadmcontrolplane-metadata/capi-machineset-metadata:Apply)
    • spec (capi-kubeadmcontrolplane/capi-machineset:Apply)
    • in-progress/cloned-from annotations (capi-kubeadmcontrolplane/capi-machineset:Apply)
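
The migration logic above can be sketched roughly as follows. This is a hedged illustration, not the code from this PR: `managedFieldsEntry` is a hypothetical local type mirroring a subset of `metav1.ManagedFieldsEntry`, `ownsPath` and `migrate` are made-up helper names, and the sketch assumes the `subresource == ""` condition applies to both kinds of deleted entries. The `fieldsV1` JSON layout (`f:`-prefixed keys) is the standard managedFields encoding.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// managedFieldsEntry mirrors the subset of metav1.ManagedFieldsEntry needed here.
type managedFieldsEntry struct {
	Manager     string
	Operation   string // "Apply" or "Update"
	Subresource string
	FieldsV1    string // raw fieldsV1 JSON
}

const clusterNameLabel = "cluster.x-k8s.io/cluster-name"

// ownsPath reports whether the fieldsV1 JSON contains the given nested keys.
func ownsPath(fieldsV1 string, path ...string) bool {
	var node map[string]any
	if err := json.Unmarshal([]byte(fieldsV1), &node); err != nil {
		return false
	}
	for _, key := range path {
		child, ok := node[key].(map[string]any)
		if !ok {
			return false
		}
		node = child
	}
	return true
}

// migrate drops the old entries if, and only if, applyManager still owns the
// cluster-name label through an Apply entry on the main resource.
// It returns the resulting entries and whether a migration happened.
func migrate(entries []managedFieldsEntry, applyManager string) ([]managedFieldsEntry, bool) {
	needed := false
	for _, e := range entries {
		if e.Manager == applyManager && e.Operation == "Apply" && e.Subresource == "" &&
			ownsPath(e.FieldsV1, "f:metadata", "f:labels", "f:"+clusterNameLabel) {
			needed = true
		}
	}
	if !needed {
		return entries, false
	}
	kept := make([]managedFieldsEntry, 0, len(entries))
	for _, e := range entries {
		oldUpdate := e.Manager == "manager" && e.Operation == "Update"
		oldApply := e.Manager == applyManager && e.Operation == "Apply"
		if e.Subresource == "" && (oldUpdate || oldApply) {
			continue // orphan the fields owned by this entry
		}
		kept = append(kept, e)
	}
	return kept, true
}

func main() {
	entries := []managedFieldsEntry{
		{Manager: "capi-machineset", Operation: "Apply",
			FieldsV1: `{"f:metadata":{"f:labels":{"f:cluster.x-k8s.io/cluster-name":{}}}}`},
		{Manager: "manager", Operation: "Update", FieldsV1: `{"f:spec":{}}`},
		// Hypothetical status entry, kept because its subresource is not empty.
		{Manager: "manager", Operation: "Update", Subresource: "status", FieldsV1: `{"f:status":{}}`},
	}
	migrated, changed := migrate(entries, "capi-machineset")
	fmt.Println(changed, len(migrated))

	// A second call is a no-op: the label is no longer owned by the old manager.
	_, changedAgain := migrate(migrated, "capi-machineset")
	fmt.Println(changedAgain)
}
```

Using ownership of the cluster-name label as the trigger is what makes the migration run only once in this sketch: after the next syncMachines the label belongs to the `-metadata` manager, so the condition no longer matches.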

So after the first in-place update, the managedFields of CAPI <= v1.11 objects will be identical to the managedFields of CAPI >= v1.12.0 objects.
Because the fields of CAPI <= v1.11 objects are orphaned before that, the first in-place update won't be able to unset fields.

Accordingly, for Clusters that have been created with CAPI <= v1.11, it will only be possible to unset fields during in-place updates after:

  • a regular rollout that replaces all Machines
  • an in-place update that updates all Machines
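
The limitation can be illustrated with a small toy model of SSA ownership in Go. This is only a sketch under simplifying assumptions: the `object` type, its methods, and the `spec.a`/`spec.b`/`spec.c` paths are hypothetical, and a path missing from `owners` stands for an orphaned field.

```go
package main

import "fmt"

// object is a toy model of an object under SSA. A path that is present in
// values but absent from owners is orphaned (no field manager owns it).
type object struct {
	values map[string]string
	owners map[string]string
}

// apply removes fields the manager owned but no longer sends, then sets and
// takes ownership of every field in intent. Orphaned fields are never removed.
func (o *object) apply(manager string, intent map[string]string) {
	for path, owner := range o.owners {
		if _, stillWanted := intent[path]; owner == manager && !stillWanted {
			delete(o.values, path)
			delete(o.owners, path)
		}
	}
	for path, value := range intent {
		o.values[path] = value
		o.owners[path] = manager
	}
}

// has reports whether the object currently carries the given field.
func (o *object) has(path string) bool {
	_, ok := o.values[path]
	return ok
}

func main() {
	// A CAPI <= v1.11 object after migration: spec fields exist, nobody owns them.
	old := &object{
		values: map[string]string{"spec.a": "1", "spec.b": "2"},
		owners: map[string]string{},
	}

	// First in-place update tries to unset spec.b by omitting it and adds spec.c.
	old.apply("capi-machineset", map[string]string{"spec.a": "1", "spec.c": "3"})
	fmt.Println("spec.b still set after first update:", old.has("spec.b"))

	// Second in-place update omits spec.c, which is now owned and gets removed.
	old.apply("capi-machineset", map[string]string{"spec.a": "1"})
	fmt.Println("spec.c still set after second update:", old.has("spec.c"))
}
```

In this model the orphaned `spec.b` survives the first update, while `spec.c`, which the first update took ownership of, can be unset by the second one.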

Which issue(s) this PR fixes:
Part of #12291

@k8s-ci-robot added the do-not-merge/work-in-progress, cncf-cla: yes, and do-not-merge/needs-area labels Oct 23, 2025
@k8s-ci-robot added the size/XXL label Oct 23, 2025
@sbueringer added the area/provider/control-plane-kubeadm and area/machineset labels Oct 23, 2025
@k8s-ci-robot removed the do-not-merge/needs-area label Oct 23, 2025
@sbueringer

/test pull-cluster-api-e2e-main

@sbueringer changed the title from "[WIP] ✨ KCP/MS: Refactor BootstrapConfig/InfraMachine managedFields for in-place" to "✨ KCP/MS: Refactor BootstrapConfig/InfraMachine managedFields for in-place" Oct 23, 2025
@k8s-ci-robot removed the do-not-merge/work-in-progress label Oct 23, 2025
@sbueringer force-pushed the pr-kcp-ms-managed-field-refactor branch from 6e0bb4f to 32f7d33 October 23, 2025 18:39
@sbueringer

/test pull-cluster-api-e2e-main

@sbueringer force-pushed the pr-kcp-ms-managed-field-refactor branch from 32f7d33 to 79c8a90 October 24, 2025 09:39
@sbueringer

/test pull-cluster-api-e2e-main

@fabriziopandini left a comment

Great job, this is the result of amazing research before the implementation.
A few nits from my side, but sgtm.

@sbueringer

/test pull-cluster-api-e2e-main

@sbueringer

@fabriziopandini Thx for the quick review, PTAL :)

@sbueringer force-pushed the pr-kcp-ms-managed-field-refactor branch from eec306b to 21d33af October 25, 2025 04:30
@sbueringer

/test pull-cluster-api-e2e-main

@fabriziopandini

We can iterate in the future on how managedFields are inlined in tests.
(Currently we have a mix of multiline strings and inline strings, which is inconsistent. Also, if the content of the managedFields is not relevant for a test, we might consider moving the string to a const somewhere else and calling out explicitly that the value is not relevant.)

/lgtm

@k8s-ci-robot added the lgtm label Oct 27, 2025
@fabriziopandini

/approve

@k8s-ci-robot

LGTM label has been added.

Git tree hash: b919c37355b934c9b113f896461f528ae1b26a14

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fabriziopandini

@k8s-ci-robot added the approved label Oct 27, 2025
@k8s-ci-robot merged commit 6d490b2 into kubernetes-sigs:main Oct 27, 2025
19 checks passed
@k8s-ci-robot added this to the v1.12 milestone Oct 27, 2025
@sbueringer deleted the pr-kcp-ms-managed-field-refactor branch October 27, 2025 11:20