
Storage: Add Pure Storage storage driver #14599

Merged
merged 51 commits on Feb 4, 2025

Conversation

MusicDin
Member

@MusicDin MusicDin commented Dec 6, 2024

This PR adds a `pure` storage driver for Pure Storage remote storage.
The driver supports communication over either iSCSI or NVMe/TCP.

@github-actions github-actions bot added the Documentation Documentation needs updating label Dec 6, 2024
@MusicDin MusicDin force-pushed the feat/pure-driver branch 2 times, most recently from c53014e to 69f6cae Compare December 6, 2024 16:27

Create a storage pool named `pool1` that uses NVMe/TCP by default:

lxc storage create pool1 pure pure.gateway=https://<purestorage-address> pure.api.token=<purestorage-api-token> pure.array.address=<nvme-ip>
Member

That gateway HTTPS verification presumably relies on the OS-provided trust store, so it would work if it's an "official" TLS cert, such as one signed by Let's Encrypt.

I must admit I don't know how to provide a different trusted CA if one doesn't have an official cert.

@MusicDin
Member Author

MusicDin commented Dec 9, 2024

@minaelee Thanks for the comments. I've addressed all of them, but will ask you for another review just before we merge the PR, as there may still be some changes to the docs. Thanks again :)

@MusicDin MusicDin marked this pull request as ready for review December 9, 2024 16:07
@MusicDin MusicDin requested a review from roosterfish December 9, 2024 16:07
Correctly initialize Pure Storage pool.

Signed-off-by: Din Music <[email protected]>
Currently, the Pure Storage driver does not support recovery, mainly because the storage volume
names are encoded, which would result in indistinguishable storage volume names after recovery.

Signed-off-by: Din Music <[email protected]>
Signed-off-by: Din Music <[email protected]>
Signed-off-by: Din Music <[email protected]>
return err
}

// Ensure temporary snapshot volume is removed in case of an error.
Member

not a blocker to merge, but typo here


// A Pure Storage snapshot cannot be mounted. To mount a snapshot, a new volume
// has to be created from the snapshot.
err = d.client().copyVolumeSnapshot(snapVol.pool, parentVolName, snapVolName, snapVol.pool, snapVolName)
Member

@MusicDin question for you:

What happens if I've got a snapshot mounted (so a temporary volume has been created), and then LXD gets stopped and the host restarted uncleanly, and I go to mount the snapshot again? Will this call fail because the temporary volume already exists?

Member Author

copyVolumeSnapshot creates a new volume, or overwrites an existing one if it already exists.

Mounting is handled by NVMe/iSCSI once the host is mapped to a given volume.
When we attempt to create a mapping, the error is ignored if the mapping already exists, so it cannot fail there.

The only potential issue I can see is that if LXD is abruptly shut down, it will not cleanup the temporary snapshot. For example, if the instance is then removed, this temporary snapshot could remain in Pure Storage.

Member Author

^ Until the pool itself is removed, then everything is wiped.

Member

or overwrites an existing one if it already exists.

OK cool, that was the bit I was mainly interested in.

For example, if the instance is then removed, this temporary snapshot could remain in Pure Storage.

Can we scan volumes on instance (or snapshot) delete to find any temporary volumes related and delete them too?

Member Author

Can we scan volumes on instance (or snapshot) delete to find any temporary volumes related and delete them too?

I can't foresee anything that would block us from doing that.
What is your suggestion for when this should happen? Should we run the cleanup on LXD boot / when the storage is loaded?

Member

when deleting the instance snapshot, and when deleting the instance (which by extension is deleting the snapshot)

Member Author

Ahh yeah, of course. So in practice this would mean that when we are deleting a specific snapshot, we check if a volume exists with that UUID.

For example:

  1. Delete snapshot v1/snap0
  2. Check if there is an existing volume with snap0-uuid (obviously correctly resolving volume prefixes and suffixes in the process)
  3. Delete such volume if exists
  4. Delete an actual snapshot snap0


Create a storage pool named `pool4` that uses NVMe/TCP to connect to Pure Storage array via specific target addresses:

lxc storage create pool4 pure pure.gateway=https://<pure-storage-address> pure.api.token=<pure-storage-api-token> pure.mode=iscsi pure.target=<target_address_1>,<target_address_2>
Member

uses NVMe/TCP to connect

pure.mode=iscsi

Copy-paste typo? Please can you follow up on these small fixes in a separate PR.

Member

@tomponline tomponline left a comment

LGTM, thanks! There are a couple of minor things to follow up on, but I'll merge this as-is.

@tomponline tomponline merged commit fb787af into canonical:main Feb 4, 2025
27 of 28 checks passed
@MusicDin MusicDin deleted the feat/pure-driver branch February 4, 2025 09:37
tomponline added a commit that referenced this pull request Feb 4, 2025
Pure Storage does not allow mounting snapshots directly. Instead, we
create a temporary volume with the snapshot's volume name. If LXD
is abruptly stopped while the temporary volume is in use, the volume
may remain in Pure Storage until the pool is removed.

To prevent this situation, when a snapshot is removed, also remove the
potentially existing temporary snapshot volume.

Addresses comment:
#14599 (comment)
tomponline added a commit that referenced this pull request Feb 12, 2025
Closes #14814

This aligns the PowerFlex driver with the implementation of Pure in
#14599 to use the NVMe/TCP
`Connect` and `Disconnect` functions instead of their
`(Connect|Disconnect)All` counterparts.

Furthermore a patch is introduced to remove the `powerflex.sdt` config
key from already existing storage pools.
Currently only a single address is stored in the key, and when using `nvme
connect-all`, all of the other targets (SDTs) are discovered anyway.
As the driver now queries for all the SDTs when the `powerflex.sdt`
config key is not set, the behavior is exactly the same.
In addition, `powerflex.sdt` now accepts multiple comma-separated
addresses, which allows overriding the addresses discovered via the
PowerFlex REST API.

Some additional minor tweaks have been put into place to make more
consistent use of the connector's `LoadModules` in case of SDC and to
clean up some error messages.
Labels
API Changes to the REST API Documentation Documentation needs updating
7 participants