feat(23570): Add controller for workspace backup #1530
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: Allda
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
42dd45c to dffd7e6
@Allda: Really appreciate you taking the time to contribute this in such a short time. 🎉 Could you please also fill out the “Is it tested? How?” section in the PR template? It’ll help reviewers and future contributors verify the change more easily. Thanks again for your effort! 🙌
I tested this PR and it seems to work.
    config:
      workspace:
        backupCronJob:
          enable: true
          schedule: "*/3 * * * *"
0bc74b1 to 8427ba5
/retest
    // A registry where backup images are stored. Images are stored
    // in {registry}/backup-${DEVWORKSPACE_NAMESPACE}-${DEVWORKSPACE_NAME}
    // +kubebuilder:validation:Optional
    Registry string `json:"registry,omitempty"`
What if the registry is not public and requires authentication and/or a certificate to access?
I am currently working on the second phase of this feature, where I cover all the use cases, including authentication. I wanted to submit a PR as early as possible to get initial feedback. The auth part should be ready soon.
The registry authorization was added to the controller. You can check the latest commits.
Thank you. I think it makes sense to move Registry and RegistryAuthSecret to a dedicated structure.
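One possible shape for such a dedicated structure is sketched below. This is only an illustration: the type name `BackupRegistryConfig`, its field names, and the helper `BackupImageRef` are hypothetical and not taken from this PR. The image name format follows the `{registry}/backup-${DEVWORKSPACE_NAMESPACE}-${DEVWORKSPACE_NAME}` comment on the `Registry` field.

```go
package main

import (
	"fmt"
	"strings"
)

// BackupRegistryConfig is a hypothetical grouping of the registry settings
// discussed in this thread. The names are assumptions, not the PR's API.
type BackupRegistryConfig struct {
	// Path is the registry (and optional organization) where backup images are stored.
	Path string `json:"path,omitempty"`
	// AuthSecret is the name of a secret holding registry credentials; empty means no auth.
	AuthSecret string `json:"authSecret,omitempty"`
}

// BackupImageRef builds the image reference for one workspace using the
// {registry}/backup-{namespace}-{name} convention from the field comment.
func BackupImageRef(cfg BackupRegistryConfig, namespace, name string) string {
	// Trim a trailing slash so "quay.io/example/" and "quay.io/example" behave the same.
	registry := strings.TrimSuffix(cfg.Path, "/")
	return fmt.Sprintf("%s/backup-%s-%s", registry, namespace, name)
}
```

With this sketch, a registry of `quay.io/example`, a namespace `user-ns`, and a workspace `my-ws` would give `quay.io/example/backup-user-ns-my-ws`.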
        return err
    }

    job := &batchv1.Job{
I believe we need a dedicated SA for this job, with only the required permissions delegated to it.
@dkwon17 ?
Yes, that makes sense to me. Could you please take a look, @Allda?
@dkwon17 Do you want to create a brand new SA for each namespace or for each job? Or is there any existing SA that I should use here? Also, what permissions should I delegate to it?
> Also, what permissions should I delegate to it?
Maybe we don't need any.
In the latest commit, I added a separate SA for the workspace namespace and use it for the Job definition.
    backUpConfig := dwOperatorConfig.Config.Workspace.BackupCronJob

    // Find a PVC with the name "claim-devworkspace" or based on the name from the operator config
    pvcName := "claim-devworkspace"
The PVC name will not always be claim-devworkspace.
There are two main storage strategies for DevWorkspaces: common (or per-user) and per-workspace.
Here are some more details about the storage strategies: https://eclipse.dev/che/docs/stable/administration-guide/configuring-the-storage-strategy/
For common, the default PVC name is claim-devworkspace, and for per-workspace, the PVC name is storage-<devworkspaceid>.
I suggest using, for example, GetProvisioner to help determine the storage strategy.
To determine the PVC name, the code currently does the following:
devworkspace-operator/pkg/provision/storage/commonStorage.go
Lines 58 to 65 in b61eaed
    usingAlternatePVC, pvcName, err := checkForAlternatePVC(workspace.Namespace, clusterAPI)
    if err != nil {
        return err
    }
    if pvcName == "" {
        pvcName = workspace.Config.Workspace.PVCName
    }
devworkspace-operator/pkg/provision/storage/perWorkspaceStorage.go
Lines 61 to 65 in b61eaed
    perWorkspacePVC, err := syncPerWorkspacePVC(workspace, clusterAPI)
    if err != nil {
        return err
    }
    pvcName := perWorkspacePVC.Name
I added dynamic PVC logic that selects the right PVC name based on the storage type in use. Please check again.
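The naming rules from this thread can be condensed into a small sketch. It is not the code from the PR: the function `backupPVCName` and its plain string parameters are hypothetical, and the alternate-PVC lookup shown in the quoted operator code (`checkForAlternatePVC`) is deliberately left out. Only the two strategies named above are handled, and anything else is reported as an error.

```go
package main

import "fmt"

// defaultCommonPVCName is the default PVC name for the common (per-user) strategy.
const defaultCommonPVCName = "claim-devworkspace"

// backupPVCName returns the PVC a backup job would mount, given the storage
// strategy, the PVC name configured in the operator config (may be empty),
// and the DevWorkspace ID. Hypothetical helper for illustration only.
func backupPVCName(strategy, configuredPVCName, workspaceID string) (string, error) {
	switch strategy {
	case "common", "per-user":
		// A custom name from the operator config wins over the default.
		if configuredPVCName != "" {
			return configuredPVCName, nil
		}
		return defaultCommonPVCName, nil
	case "per-workspace":
		if workspaceID == "" {
			return "", fmt.Errorf("per-workspace storage requires a workspace ID")
		}
		// Per-workspace PVCs are named storage-<devworkspaceid>.
		return "storage-" + workspaceID, nil
	default:
		return "", fmt.Errorf("unsupported storage strategy %q", strategy)
	}
}
```

Returning an error for unknown strategies, rather than falling back to claim-devworkspace, keeps a backup job from silently mounting the wrong volume.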
Codecov Report
❌ Patch coverage is
Additional details and impacted files

    @@            Coverage Diff             @@
    ##             main    #1530      +/-   ##
    ==========================================
    + Coverage   34.09%   35.30%   +1.21%
    ==========================================
      Files         160      161       +1
      Lines       13348    13802     +454
    ==========================================
    + Hits         4551     4873     +322
    - Misses       8487     8599     +112
    - Partials      310      330      +20
        Schedule string `json:"schedule,omitempty"`
    }

    type BackupCronJobConfig struct {
@dkwon17
Does it make sense to create a completely new API for backup instead of using DevWorkspaceOperatorConfig?
    }

    func (r *BackupCronJobReconciler) copySecret(workspace *dw.DevWorkspace, ctx context.Context, sourceSecret *corev1.Secret, logger logr.Logger) (namespaceSecret *corev1.Secret, err error) {
        log := logger.WithName("copySecret")
I think it makes sense to use a common logger name for a backup component, so it can be easily identified in the DevWorkspaceController logs.
Signed-off-by: Anatolii Bazko <[email protected]>
Signed-off-by: Anatolii Bazko <[email protected]>
Signed-off-by: Rohan Kumar <[email protected]>
…de to v0.22.1 Signed-off-by: Rohan Kumar <[email protected]>
A new backup controller orchestrates the backup process for the workspace PVC. A new configuration option is added to DevWorkspaceOperatorConfig that enables a regular cron job responsible for the backup mechanism. The job executes the following steps:
- Find workspaces
- Determine whether a workspace has been recently stopped
- Detect the workspace PVC
- Execute a job in the same namespace that performs the backup
The last step is currently not fully implemented, as it requires running Buildah inside the container, and it will be delivered as a separate feature.
Issue: eclipse-che/che#23570
Signed-off-by: Ales Raszka <[email protected]>
A backup of the workspace is done using Buildah by storing the content of the workspace PVC in a container image. The image is then stored in a registry and can be used to recover data. A prototype script was updated and stored under the project-backup directory and is built alongside the controller. The backup job calls the script and executes the following steps:
- mount a volume with workspace data
- build a container image using buildah
- push the image to the registry configured by the operator admin
Signed-off-by: Ales Raszka <[email protected]>
A new sub-object was added to the operator config that reflects the current status of the backup controller and stores the last time the backup was executed. This value is used to determine whether a backup of the workspace is needed or has already been executed. Signed-off-by: Ales Raszka <[email protected]>
The backup job uses the default PVC name, or the name from the config if the user configured a custom one. Signed-off-by: Ales Raszka <[email protected]>
The backup job can now push to registries that require an auth token. The token is provided as a secret in the operator namespace and added to the operator config. Signed-off-by: Ales Raszka <[email protected]>
Signed-off-by: Ales Raszka <[email protected]>
The backup job now determines the PVC name based on the storage type in use. It distinguishes between the storage types (common and per-workspace) and mounts the volume dynamically. Signed-off-by: Ales Raszka <[email protected]>
It turns out the capabilities from the prototype are not needed. Signed-off-by: Ales Raszka <[email protected]>
A new SA is created for the backup jobs to limit their permissions to just what is necessary. Signed-off-by: Ales Raszka <[email protected]>
Signed-off-by: Ales Raszka <[email protected]>
/retest
A new backup controller orchestrates the backup process for the workspace PVC. A new configuration option is added to DevWorkspaceOperatorConfig that enables a regular cron job responsible for the backup mechanism. The job executes the following steps:
- Find workspaces
- Determine whether a workspace has been recently stopped
- Detect the workspace PVC
- Execute a job in the same namespace that performs the backup
The last step is currently not fully implemented, as it requires running Buildah inside the container, and it will be delivered as a separate feature.
Issue: eclipse-che/che#23570
What does this PR do?
What issues does this PR fix or reference?
Is it tested? How?
The feature has been tested locally and with integration tests. The following configuration should be added to the config to enable this feature:
After the config is added, stop any workspace and wait until a backup job is created.
The job creates a backup and pushes the image to the registry.
PR Checklist
/test v8-devworkspace-operator-e2e, v8-che-happy-path to trigger)
- v8-devworkspace-operator-e2e: DevWorkspace e2e test
- v8-che-happy-path: Happy path for verification integration with Che