Skip to content

[reconfigurator-cli] RoT bootloader support #8620

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

karencfv
Copy link
Contributor

@karencfv karencfv commented Jul 17, 2025

Equivalent of #8273 for RoT bootloader

@karencfv karencfv marked this pull request as ready for review July 18, 2025 01:23
Copy link
Contributor

@sunshowers sunshowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks -- a few questions.

pub fn sled_stage0_version(
&self,
sled_id: SledUuid,
) -> anyhow::Result<Option<&str>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does Option mean here -- i.e. under what circumstances is it possible to return None? And can this return an &ArtifactVersion to match what sled_update_rot_bootloader_versions accepts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being honest, I was just copying what @davepacheco had done for the SP

pub fn sled_sp_active_version(
&self,
sled_id: SledUuid,
) -> anyhow::Result<Option<&str>> {
let sled = self.sleds.get(&sled_id).with_context(|| {
format!("attempted to access sled {} not found in system", sled_id)
})?;
Ok(sled.sp_active_caboose().map(|c| c.version.as_ref()))
}

That said, I think I have an idea as to why he did it this way. These methods are only used for the cmd_sled_show function. If I were to return ArtifactVersion, then I would have to parse the result back to a string within the cmd_sled_show function, otherwise the output would look like this:

RoT bootloader stage 0 version:   Some(ArtifactVersion("0.0.1"))

instead of:

SP active version: Some("0.0.1")
SP inactive version: None

As to why he used an option, I'm not entirely sure. I'd like to keep it like this for now, just so it matches what the SP does. When @davepacheco comes back, I'll ask him why it was an option in the first place, and happy to change it if it'd be better to not have the Option. WDYT?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summarizing for myself (tell me if this sounds wrong): this structure is a model of the state of a simulated system (though it may have been seeded from state from a real system). The only things it does with SP versions is (1) initialize them based on whatever the caller provides, (2) allow the caller to change them via a simulated SP firmware update, and (3) report the current version.

sled_sp_active_version() returns None if the underlying sp_active_caboose field is None. I believe the only way this can happen is if this simulated system was seeded from a reconfigurator state file using an inventory collection that's missing information about this SP. I think the only way you can get here today is if you load a state file with reconfigurator-cli and tell it to use an inventory collection that's missing this SP, though there's no reason another code path couldn't hit it. (This code path winds up going through sled_full().)

I don't remember if there's a particularly important reason to support this aside from the fact that inventory collections can always be missing stuff due to transient failures and it could be annoying if reconfigurator-cli couldn't do anything with such a collection. It may have been just that it seemed pretty easy to support, given how little we do with this field.

@@ -596,6 +633,43 @@ impl SystemDescription {
part_number: sp_state.model.clone(),
serial_number: sp_state.serial_number.clone(),
};
if let Some(stage0) = &s.stage0_caboose() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the & required here and in stage0_next_caboose below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aha! it is not 😄

Comment on lines +882 to +883
pub stage0: Option<Arc<nexus_types::inventory::Caboose>>,
pub stage0_next: Option<Arc<nexus_types::inventory::Caboose>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same q here I guess -- can we prefill these values to something reasonable? In production we always have these filled out, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, I'd like to ask @davepacheco why it was done this way with the SPs to begin with

Copy link
Contributor Author

@karencfv karencfv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to review!

pub fn sled_stage0_version(
&self,
sled_id: SledUuid,
) -> anyhow::Result<Option<&str>> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being honest, I was just copying what @davepacheco had done for the SP

pub fn sled_sp_active_version(
&self,
sled_id: SledUuid,
) -> anyhow::Result<Option<&str>> {
let sled = self.sleds.get(&sled_id).with_context(|| {
format!("attempted to access sled {} not found in system", sled_id)
})?;
Ok(sled.sp_active_caboose().map(|c| c.version.as_ref()))
}

That said, I think I have an idea as to why he did it this way. These methods are only used for the cmd_sled_show function. If I were to return ArtifactVersion, then I would have to parse the result back to a string within the cmd_sled_show function, otherwise the output would look like this:

RoT bootloader stage 0 version:   Some(ArtifactVersion("0.0.1"))

instead of:

SP active version: Some("0.0.1")
SP inactive version: None

As to why he used an option, I'm not entirely sure. I'd like to keep it like this for now, just so it matches what the SP does. When @davepacheco comes back, I'll ask him why it was an option in the first place, and happy to change it if it'd be better to not have the Option. WDYT?

@@ -596,6 +633,43 @@ impl SystemDescription {
part_number: sp_state.model.clone(),
serial_number: sp_state.serial_number.clone(),
};
if let Some(stage0) = &s.stage0_caboose() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aha! it is not 😄

Comment on lines +882 to +883
pub stage0: Option<Arc<nexus_types::inventory::Caboose>>,
pub stage0_next: Option<Arc<nexus_types::inventory::Caboose>>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, I'd like to ask @davepacheco why it was done this way with the SPs to begin with

@karencfv karencfv requested a review from sunshowers July 21, 2025 00:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants