Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add workload version subcommand Fixes #33912 #33996

Merged
merged 13 commits into from
Sep 22, 2023

Conversation

Forgind
Copy link
Member

@Forgind Forgind commented Jul 13, 2023

Fixes #33912

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Workloads untriaged Request triage from a team member labels Jul 13, 2023
@@ -153,4 +153,7 @@
<data name="WorkloadInfoDescription" xml:space="preserve">
<value>Display information about installed workloads.</value>
</data>
<data name="WorkloadVersionDescription" xml:space="preserve">
<value>Display version information about installed workloads.</value>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<value>Display version information about installed workloads.</value>
<value>Display the currently installed workload version.</value>

Comment on lines 47 to 52
var workloadManifestProvider =
new SdkDirectoryWorkloadManifestProvider(Path.GetDirectoryName(Environment.ProcessPath),
Cli.Utils.Product.Version,
CliFolderPathCalculator.DotnetUserProfileFolderPath, SdkDirectoryWorkloadManifestProvider.GetGlobalJsonPath(Environment.CurrentDirectory));

return workloadManifestProvider.GetWorkloadVersion();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than creating the SdkDirectoryWorkloadManifestProvider here, can we use WorkloadInfoHelper to do so and then call a method on that class to get the workload version?

return 0;
}

Reporter.Output.WriteLine($" Workload version: {workloadInfoHelper.ManifestProvider.GetWorkloadVersion()}");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should probably add this output to the ShowWorkloadsInfo method rather than changing the logic here.

Also, we should display the workload version even if no workloads are installed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it this way specifically because I thought we shouldn't display a workload version if no workloads are installed 🙂 I can undo that.

@@ -150,7 +175,7 @@ void ProbeDirectory(string manifestDirectory, string featureBand)
(string? id, string? finalManifestDirectory) = ResolveManifestDirectory(manifestDirectory);
if (id != null && finalManifestDirectory != null)
{
AddManifest(id, finalManifestDirectory, featureBand);
AddManifest(id, finalManifestDirectory, featureBand, Path.GetFileName(manifestDirectory));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the workload manifest is not side-by-side, then this won't pass the correct manifest version. In that case you would need to read the actual manifest file to get the version.

@Forgind Forgind marked this pull request as ready for review July 20, 2023 20:26
}
else if (File.Exists(Path.Combine(manifestDirectory, "WorkloadManifest.json")))
{
return (manifestId, manifestDirectory);
return (manifestId, manifestDirectory, null);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think returning null for the version here will cause problems when we try to compute the hash for the workload set version. So we probably need to actually read the manifest to get the version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think returning null for the version here will cause problems when we try to compute the hash for the workload set version. So we probably need to actually read the manifest to get the version.

This turned out to be a disaster when it came to tests. Tests are making invalid WorkloadManifest.json files, and when I try to parse them for a version, it crashes.

So the obvious good solution to that is to fix the tests, right? I should change the tests so they write valid WorkloadManifest.json files.

Except there are several problems. First, that's a horrible mess just from a "format a string in C#" perspective. That's made worse by the fact that this is in a number of tests, so I can't correct it in one place and then have everything else build off of that—I need to correct both the place that creates the WorkloadManifest.json file and the place that assumes it's present and correct. Second, the previous format assumed certain parameters would be present in the WorkloadManifest.json file—and that's no longer true. The format has changed, which means I can't include the same information that was previously there. Third, it occurred to me that that problem might be true going forward as well; that is, future developers may decide to change the WorkloadManifest.json format, in which case they'd have to change all these tests all over again.

I wasted several hours on this today. My eventual resolution was to throw up my hands and just put in a try/catch statement instead. I think that should work and should make the tests happy. It's not exactly ideal, since I think we should notice if there's a problem and throw an error, but I think it's a fairly good solution as far as solutions go, and much easier from an implementation perspective than what I was trying to do.

Let me know if you disagree or see a better path forward.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment explaining that the catch is there for tests that don't write valid manifests?

I'm not sure what you mean about the manifest format changing. I don't believe it has changed significantly since it shipped.

GetManifests().OrderBy(m => m.ManifestId).Select(m => $"{m.ManifestId}.{m.ManifestFeatureBand}.{m.ManifestVersion}").ToArray()
)));
StringBuilder sb = new();
for (int b = 0; b < 4 && b < bytes.Length; b++)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the < 4 indicate here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should probably add a comment.

This is intended to be part of a version, and we do want it to be unique, hence the hash, but we don't want the version to be too long, so this truncates it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, so if I'm reading this correctly, you take all the manifests, create strings of the version numbers via ManifestId, ManifestFeatureBand, and ManifestVersion. You join these strings via semi-colon, encode them as bytes, and compute a hash. Then, you truncate the hash of 256bits (64 bytes) down to only 4 bytes. It seems like a lot of work done to calculate this unique hash to then simply make it not unique by truncating it. Am I making any sense? 😅

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep! And that's a great question!

Before getting to any math, to reiterate our primary goals, we want the version to be short, almost certainly unique, and communicate information about what you have installed.

Hashes work great for precisely indicating what you have installed while being short. The version is included so you have some sense of what's going on. Hashes are great (in theory, at least) since they choose a uniformly random number that's always the same if you pass in the same input.

In particular, the first bits of the has is also uniformly random under that constraint. There's always some possibility of a hash collision, and this increases that possibility. It's very, very low if we were to take the same hash. Truncating it to just the first four bytes, the chance of any two versions colliding is 2^-32, which is almost negligible still. From the birthday paradox, you can infer that the chance that some pair will collide is much greater than you'd think, but since it also incorporates the major/minor version in plain text, there aren't that many versions that we're likely to be comparing that might collide. As an example, if there are 1000 versions that might collide, even with this truncated version, the probability that no pair will collide is:
https://www.wolframalpha.com/input?i=%28%28%282%5E32%29%21%29%2F%28%28%282%5E32%29-1000%29%21%29%29+%2F+%28%282%5E32%29%5E1000%29

So I can almost guarantee they still won't collide.

On the other hand, if we were to change the 4 to a 3, it would be:
https://www.wolframalpha.com/input?i=%28%28%282%5E24%29%21%29%2F%28%28%282%5E24-1000%29%21%29%29+%2F+%28%282%5E24%29%5E1000%29

That's still unlikely, but I'd feel a lot less confident saying they won't collide.

Does that help?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my past experience with Git commit hashes (which may or may not be applicable here), the standard for 'short' hashes is 7 characters. Even then, I'd have conflicts with the same short commit hash applying to more-than-one commit. I did some quick research, and found a hashing process that fits your intended goals. It is used to create similar IDs to what YouTube uses for videos. It is called Sqids: https://sqids.org/dotnet

If you don't want to do something like that, I'm fine with the solution here. Just pointing out other options and experiences.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, this is taking 4 bytes, which ends up as 8 hexadecimal characters. So the hash it's generating is 8 characters long.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, this is taking 4 bytes, which ends up as 8 hexadecimal characters. So the hash it's generating is 8 characters long.

Right. When I said '7 characters' for commit hashes, I meant 7 unique values. I believe each is in the alphanumeric range, so 26 x 2 (lower and upper case alphas) + 10 (zero to nine), which is 62. If 1 byte is represented as 2 character, that means 2 characters are holding 256 unique values (2^8 is a byte). But if you have 2 alphanumeric characters, that would be 62^2 which is 3844 unique combinations. This is not my area of expertise, but as long as the solution is vetted and makes sense for this scenario, I'm fine with it. Again, this conversation was because the fact that < 4 seemed suspicious to me. 😋

@Forgind Forgind changed the base branch from main to release/8.0.1xx September 12, 2023 21:29
@Forgind Forgind changed the base branch from release/8.0.1xx to main September 12, 2023 21:31
@Forgind Forgind changed the base branch from main to release/8.0.1xx September 12, 2023 21:41
Copy link
Member

@dsplaisted dsplaisted left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can go into 8.0.200. Since we don't have workload sets in 8.0.100, I don't think there's a whole lot of value to this until we add them.

}
else if (File.Exists(Path.Combine(manifestDirectory, "WorkloadManifest.json")))
{
return (manifestId, manifestDirectory);
return (manifestId, manifestDirectory, null);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment explaining that the catch is there for tests that don't write valid manifests?

I'm not sure what you mean about the manifest format changing. I don't believe it has changed significantly since it shipped.

Comment on lines +71 to +73
Directory.Delete(Path.Combine(_manifestRoot, "8.0.100"), recursive: true);
Directory.Delete(Path.Combine(_manifestRoot, "8.0.200"), recursive: true);
Directory.Delete(Path.Combine(_manifestRoot, "8.0.200-rc.2"), recursive: true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be necessary to delete anything here, each test should use a test-specific directory.

[InlineData(false)]
public void ItShouldReturnTheWorkloadVersion(bool useWorkloadSet)
{
Initialize();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Initialize();
Initialize(identifier: useWorkloadSet.ToString());


if (useWorkloadSet)
{
CreateMockWorkloadSet(_manifestRoot, "8.0.200", "8.0.200", @"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should change the workload set version to something that is different than the feature band version (such as 8.0.202), to help verify the right version is being returned.

@Forgind Forgind merged commit e5ed3d7 into dotnet:release/8.0.1xx Sep 22, 2023
16 checks passed
@Forgind Forgind deleted the workload-version branch September 22, 2023 18:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Workloads untriaged Request triage from a team member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add dotnet workload --version command, display workload version other "info" commands
3 participants