Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor NullBufferBuilder to use an internal state machine #7082

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Jefffrey
Copy link
Contributor

@Jefffrey Jefffrey commented Feb 6, 2025

During the course of #7013 I realized how confusing the internals of NullBufferBuilder are, how some private fields are applicable only when the builder is in a certain state; this isn't enforced in the code, rather via comments (and knowledge of the struct).

Experimenting with refactoring it to use an internal state machine to explicitly delineate the different states it can have and therefore enforce at compile time the rules which were previously implicit before.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Feb 6, 2025
Copy link
Contributor Author

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving as draft for now as I haven't yet run the benchmarks (need to find the appropriate ones first, or create new ones) to see if there's impact on performance

Note to self: use critcmp

@@ -83,7 +83,7 @@ impl BooleanBufferBuilder {
self.len == 0
}

/// Returns the capacity of the buffer
/// Returns the capacity of the buffer (in bits)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and accompanying changes to how this is handled by callers) is technically unrelated to original intent; just something I picked up, can probably be extracted to a separate PR

Comment on lines +205 to +208
// TODO: Only usage of self.initial_capacity here. Should we be using it,
// or instead should use bitmap_builder.capacity()? Considering
// self.initial_capacity <= bitmap_builder.capacity()
capacity: self.initial_capacity,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to preserve the current behaviour of finish() where it resets to the original capacity provided. This seems wrong to me as ideally would want to reset to the current capacity, which may have increased since it was originally specified.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant