Skip to content

Merged FileTypePrefix into FilePrefix #5478

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 16, 2025

Conversation

dlmarion
Copy link
Contributor

Closes #5476

@dlmarion dlmarion added this to the 4.0.0 milestone Apr 15, 2025
@dlmarion dlmarion self-assigned this Apr 15, 2025
BULK_IMPORT("I"),
MAJOR_COMPACTION("C"),
MAJOR_COMPACTION_ALL_FILES("A"),
MERGING_MINOR_COMPACTION("M");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't we get rid of merging minor compactions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We did, but I think it's possible that we still run across these files when reading. The code does not create files starting with an 'M' anymore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably include an inline comment here, then, so we can just document it as a reserved due to its use in the past.

Comment on lines 58 to 73
Objects.requireNonNull(prefix, "prefix must be supplied");
Preconditions.checkArgument(!prefix.isBlank(), "Empty prefix supplied");
Preconditions.checkArgument(prefix.length() == 1, "Invalid prefix supplied: " + prefix);
switch (prefix.toUpperCase()) {
case "A":
return MAJOR_COMPACTION_ALL_FILES;
case "C":
return MAJOR_COMPACTION;
case "F":
return MINOR_COMPACTION;
case "I":
return BULK_IMPORT;
case "M":
return MERGING_MINOR_COMPACTION;
default:
throw new IllegalArgumentException("Unknown prefix type: " + prefix);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You changed this check so it's no longer case-sensitive. Why? If there's a typo in the prefix, I think we would want to cause an error, not allow it to go through silently... possibly returning a type that was not intended.

Also, the old code is much more future-proof, and we don't need to maintain a case statement each time the enum set changes. I don't think the fixed-size stream over the enum values array is performance-critical, and should be fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I restored the case sensitivity in 5a5905f. I also modified the method to loop over the enum values so that it should be more future-proof.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the fixed-size stream over the enum values array is performance-critical, and should be fine.

If I had my way, I would change all the occurrences of streams in the entire code base back to for-loops. While the streams api may be more intuitive for some, everyone knows what a for loop is. We are choosing to use an api that we know is slower. If we were developing a calendar app, I would agree that performance is not critical. For a database, I think performance is critical and should be a design goal. On a lightly loaded system the implementation of a non-critical section of code doesn't matter. On a highly loaded system, the implementation of a non-critical section of code is preventing other code from running.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind changing this instance to a for loop, but I do find the 12 new lines much more verbose and much less readable than the succinct 2 lines that were there before, and don't see this as an overall improvement. I think it would probably be more clearly an improvement if it didn't have the redundant checks for the special cases being checked before the loop, since those are already encapsulated by the check at the end of the loop. Those cases should be rare, so there's not an optimization to bypass the loop for them. The verbosity of the checks undermines any value the more specific error messages might have offered. I think the succinct check at the end is sufficient.

@dlmarion dlmarion merged commit 39ab610 into apache:main Apr 16, 2025
8 checks passed
@dlmarion dlmarion deleted the 5476-merge-file-prefix-types branch April 16, 2025 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Merge FileTypePrefix and FilePrefix
3 participants