Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Target replacement support for VCRs with multiple sub_volumes #1551

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

leftwo
Copy link
Contributor

@leftwo leftwo commented Nov 5, 2024

Updated the volume layer code to support replacement of a target in a VCR that has multiple sub_volumes in it.
Before this we only supported a single sub_volume.

Most of the changes here are for tests. I've added a loop around the VCR replacement tests to create multiple sub volumes and iterate through the test for different VCR configurations.

In addition, it is now required that the generation number change for all sub-volumes when any target is changed.

Some tests that were duplicates have been removed.

Updated the volume layer code to support replacement of a target in
a VCR that has multiple sub_volumes in it.  Before this we only
supported a single sub_volume.

Most of the changes here are for tests.  I've added a loop around vcr
replacement tests to create multiple sub volumes and iterate through
the test for different VCR configurations.

In addition, it is now required that the generation number change for
all sub-volumes when any target is changed.
@leftwo leftwo requested review from jmpesp and mkeeter November 5, 2024 21:42
upstairs/src/volume.rs Outdated Show resolved Hide resolved
upstairs/src/volume.rs Outdated Show resolved Hide resolved
for (o_sv, new_sv) in
o_sub_volumes.iter().zip(n_sub_volumes.iter())
{
let sv_res = Self::compare_vcr_for_replacement(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any worries about unbounded recursion blowing up the stack here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. But, I'm not sure what to do about it.
We currently don't have any limits in Nexus, so you could craft VCRs forever until something broke. I don't know what the limit is currently and what will break first.

upstairs/src/volume.rs Outdated Show resolved Hide resolved
upstairs/src/volume.rs Outdated Show resolved Hide resolved
upstairs/src/volume.rs Outdated Show resolved Hide resolved
upstairs/src/volume.rs Outdated Show resolved Hide resolved
upstairs/src/volume.rs Outdated Show resolved Hide resolved
upstairs/src/volume.rs Outdated Show resolved Hide resolved
upstairs/src/volume.rs Outdated Show resolved Hide resolved
upstairs/src/volume.rs Outdated Show resolved Hide resolved
Cleaned up some comments.
Created a function build_replacement_subvol that will create
Vec<VolumeConstructionRequest> of Regions based on input provided.

This should reduce a bunch of common code.

Converted a bunch of other replacement subvolume creation to be done
immutably.
Copy link
Contributor

@mkeeter mkeeter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

The only follow-up suggestion (optional) is to consolidate "test on every ReadOnlyParentMode", e.g.

impl ReadOnlyParentMode
    fn test_all<F: FnMut(Self)>(mut f: F) {
        f(ReadOnlyParentMode::OnlyOriginal);
        f(ReadOnlyParentMode::OnlyReplacement);
        f(ReadOnlyParentMode::Neither);
        f(ReadOnlyParentMode::Both);
    }
}

...which you could then use like

        for sv in 1..4 {
            for sv_changed in 0..sv {
                for cid in 0..3 {
                    ReadOnlyParentMode::test_all(|mode|
                        test_volume_replace_inner(
                            sv,
                            sv_changed,
                            cid,
                            mode,
                        )
                    );
                }
            }
        }

upstairs/src/volume.rs Outdated Show resolved Hide resolved
block_size,
blocks_per_extent,
extent_count,
opts: opts.clone(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The targets should be randomized here so that all sub volumes are not the same, using something like https://docs.rs/ipgen/1.0.2/ipgen/fn.ip.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm holding it wrong, somehow...

I get back:
Err(InvalidIpNetwork("[BUG] failed to parse the generated IP address a00:67 as IPv4"))

And, the source says:

    // This error should never happen but I'm not a fan of panicking in libraries
    .map_err(|_| Error::InvalidIpNetwork(format!("[BUG] failed to parse the generated IP address `{}` as IPv4", ipv6_addr.trim_start_matches(':'))))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it is a bug in the latest release 1.0.2.
Fixed in main, but they have not rolled a new release out that contains the fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found another library that could generate random I.P. addresses for me and used that.

block_size,
blocks_per_extent,
extent_count,
opts: replacement_opts.clone(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This too will have multiple subvolume targets that are the same

// A valid replacement VCR is provided with a larger generation
// number and only one target being different.
let vol_id = Uuid::new_v4();
let opts = generic_crucible_opts(vol_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe generic_crucible_opts should also randomize the addresses?

leftwo and others added 4 commits January 10, 2025 11:33
@leftwo
Copy link
Contributor Author

leftwo commented Jan 27, 2025

This is ready to go back, but we are holding it for the moment because this may require some changes on the Omicron side.
The changes would be that any change to a virtual disk target (a replacement) would require bumping the generation number on all Sub_Volumes that make up that virtual disk, including replacement of a read_only_parent target. We need to determine if Omicron already does this, or if it needs updates, or if we need to relax this constraint for read_only_parent targets.

Also, this work should come behind oxidecomputer/omicron#6172

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants