-
Notifications
You must be signed in to change notification settings - Fork 564
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rbd: use pre-existing volume group if content matches #5221
rbd: use pre-existing volume group if content matches #5221
Conversation
65987ab
to
12f7dce
Compare
Manually testing by creating a VolumeGroup with the
Then creating a VolumeGroupSnapshot with the same volumes fails:
The last line from the Ceph-CSI logs suggest that the volumes are attempted to be added to a wrong group:
|
27e9df7
to
6c7e7ed
Compare
The current state seems to work well. Only the Ceph main branch supports the |
6c7e7ed
to
0f6445e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a tracker to take care of cases where VolumeGroupReplication may modify the group during groupSnapshot create, delete or in between those requests in the future
1925c37
to
ebf62d7
Compare
See #5232. |
internal/rbd/group.go
Outdated
return librbd.GroupImageRemove(ioctx, id, rv.ioctx, rv.RbdImageName) | ||
} | ||
|
||
// GetVolumeGroupID returns the name of the VolumeGroup where this rbdVolume |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// GetVolumeGroupID returns the name of the VolumeGroup where this rbdVolume | |
// GetVolumeGroupID returns the ID of the VolumeGroup where this rbdVolume |
internal/rbd/group.go
Outdated
return fmt.Errorf("could not get id for volume group %q: %w", vg, err) | ||
} | ||
|
||
return librbd.GroupImageRemove(ioctx, id, rv.ioctx, rv.RbdImageName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GroupImageRemove accepts the group name and not id, so shouldn't we continue to fetch the name and pass it to the API? Or, am I missing something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming has been a bit confusing to me as well. The GetID()
function returns the generated unique name of the rbd-group (something like csi-group-<UUID>
), the GetName()
function returns the name that was requested by the caller with the CreateVolumeGroup
procedure (stored in the journal and pointing to the generated name/id of the real RBD-group).
Throughout Ceph-CSI there are many id
fields, and the naming has not always been consistent. This commit corrects an earlier mistake that I found when manually creating a volume-group with a name my-group
and trying to use that to create a VolumeGroupSnapshot of. ID/name was incorrect, with this change it looks way cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a better description in internal/rbd/types.journalledObject
for these functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the confusion is on my part. The GetID()
function returns the CSI-handle formatted unique ID, it can be parsed and used to obtain the details from the journal, which then point to the actual object in the backend (RBD-image or RBD-group name). Updated the documentation for internal/rbd/types.journalledObject
again.
if vol == nil { | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nixpanic we dont get to this state isnt it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not normally, but we can. The commit that adds this part has the following commit message:
When an incorrect volumeID is passed while creating a VolumeGroup, the
.Destroy()
function causes a panic. With this extra check to skip
nil
volumes, the panic does not happen anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can probably also be solved by creating an empty volumes
slice, and appending each volume after it has been resolved. If that is something you prefer, I can take that approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that makes sense, we could avoid this check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, please check again.
return nil, err | ||
// reuse the existing group, it contains the volumes already | ||
// check for pre-existing volume group name | ||
vgHandle, err := volumes[0].GetVolumeGroupID(ctx, mgr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
length of volumes can be 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yes, I think that might be possible. I'll add a check for that.
ebf62d7
to
24a843e
Compare
24a843e
to
a80af79
Compare
a80af79
to
3068c73
Compare
LGTM, |
3068c73
to
fb4e754
Compare
This is now based on #5184, making sure both PRs can be merged together nicely. |
@Mergifyio rebase |
When an incorrect volumeID is passed while creating a VolumeGroup, the `.Destroy()` function caused a panic. By appending each volume to the volumes slice, the slice won't contain any `nil` volumes anymore. Signed-off-by: Niels de Vos <[email protected]>
The `prefix` is passed to several functions, but it can easily be obtained with a small helper function. This makes calling the functions a little simpler. Signed-off-by: Niels de Vos <[email protected]>
The Manager.MakeVolumeGroupID() function can be used to build a CSI VolumeGroupID from the backend (pool and name of the RBD-group). This will be used when checking if an RBD-image belongs to a group already. It is also possible to resolve the VolumeGroup by passing the VolumeGroupID to the existing Manager.GetVolumeGroupByID() function. Signed-off-by: Niels de Vos <[email protected]>
CompareVolumesInGroup() verifies that all the volumes are part of the given VolumeGroup. It does so by obtaining the VolumeGroupID for each volume with GetVolumeGroupByID(). The helper VolumesInSameGroup() verifies that all volumes belong to the same (or no) VolumeGroup. It can be called by CSI(-Addons) procedures before acting on a VolumeGroup. Signed-off-by: Niels de Vos <[email protected]>
When a VolumeGroup has been created through the CSI-Addons API, the VolumeGroupSnapshot CSI API will now use the existing VolumeGroup. There are checks in place to validate that the Volumes in the VolumeGroup match the Volumes in the VolumeGroupSnapshot request. Signed-off-by: Niels de Vos <[email protected]>
The `GetID()` and `GetName()` functions can be confusing, as names and ID's are not always distinctive enough. The name is used to reference an object that exists in a pool. The ID the CSI-handle formatted and can be used to locate the entry for the object in the journal. Signed-off-by: Niels de Vos <[email protected]>
✅ Branch has been successfully rebased |
fb4e754
to
a771326
Compare
/test ci/centos/mini-e2e-helm/k8s-1.32 |
/test ci/centos/mini-e2e-helm/k8s-1.31 |
/test ci/centos/mini-e2e-helm/k8s-1.30 |
/test ci/centos/mini-e2e/k8s-1.32 |
/test ci/centos/mini-e2e/k8s-1.31 |
/test ci/centos/mini-e2e/k8s-1.30 |
/test ci/centos/k8s-e2e-external-storage/1.32 |
/test ci/centos/k8s-e2e-external-storage/1.31 |
/test ci/centos/k8s-e2e-external-storage/1.30 |
/test ci/centos/upgrade-tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
/retest ci/centos/mini-e2e/k8s-1.31 |
Failed while deploying minikube cluster logs |
CSI-Addons Volume Groups makes it possible to create a long-living volume group. This is incompatible with the VolumeGroupSnapshot feature, as that creates a temporary VolumeGroup before creating a snapshot of it. When a Volume is already part of a different Volume Group, things will fail.
In order to make both features compatible with each other, the following things are required:
Depends-on: #5184