Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cleanup block indices for missing blocks #15040

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from
Open

Conversation

kasey
Copy link
Contributor

@kasey kasey commented Mar 12, 2025

What type of PR is this?

Bug fix

What does this PR do? Why is it needed?

For any existing nodes impacted by the block cleanup bug fixed recently, this is a hack to help those nodes self-heal and avoid a complete db wipe and resync.

**Notes for reviewers **

blockParentRootIndicesBucket maps from a block root to its children. When the db is in this state, we don't have the parent root of the missing block to clean up the index going the other direction. As I was wrapping this PR up I did think a couple ways to do it.

  1. Before we delete the slot -> root index, we can seek a db cursor to the key for the slot of the dangling block. We can then use Prev to look at the previous slot, which should contain the parent (there can be multiple block roots backed into the slot->root index). Check each of those keys and remove the deleted root if found.
  2. Since we're looping over slot/root pairs in blocksForSlotRange, we could keep track of roots in the previous slot so that when we add something to the badBlocks slice for cleanup we have parent roots to check for the missing root. If the block root is in the first slot of the range we won't have seen its possible parents and need to fall back to method 1, so this is really just an optimization for method 1 and might not be worth the complexity.

Acknowledgements

@kasey kasey requested a review from a team as a code owner March 12, 2025 21:19
@kasey kasey requested review from potuz, nisdas and james-prysm March 12, 2025 21:19
@kasey kasey force-pushed the repair-idx-on-read branch from c9fa7ad to c4aa1da Compare March 14, 2025 21:52
@kasey kasey changed the title WIP: cleanup block indices for missing blocks cleanup block indices for missing blocks Mar 15, 2025
@kasey kasey force-pushed the repair-idx-on-read branch from cb0e184 to 137d9b6 Compare March 17, 2025 21:46
// - Seeking before the beginning of the bucket (the `start` param is lower than the lowest slot).
// In both of these cases k,v will be nil and we can handle the same way using `edge` to
// - continue to the next iteration. If the following Prev() key/value is nil, Prev has gone past the beginning.
// - Otherwise, iterate as usual.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a unit test for these particular edge cases ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, TestStore_Blocks_FiltersCorrectly that touch the upper bounds of the blocks with SetStartSlot + SetEndSlot. LMK if that is sufficient, or if you think we need a separate test.

Comment on lines +139 to +152
func (q *QueryFilter) SimpleSlotRange() (primitives.Slot, primitives.Slot, bool) {
if len(q.queries) != 2 || q.queries[StartSlot] == nil || q.queries[EndSlot] == nil {
return 0, 0, false
}
start, ok := q.queries[StartSlot].(primitives.Slot)
if !ok {
return 0, 0, false
}
end, ok := q.queries[EndSlot].(primitives.Slot)
if !ok {
return 0, 0, false
}
return start, end, true
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good candidate for unit test

nisdas
nisdas previously approved these changes Mar 19, 2025
Copy link
Member

@nisdas nisdas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kasey kasey force-pushed the repair-idx-on-read branch 3 times, most recently from b9a428d to 360b869 Compare March 25, 2025 22:42
@kasey kasey changed the base branch from develop to stategen-test-refactor March 28, 2025 17:56
@kasey kasey force-pushed the repair-idx-on-read branch from 360b869 to 2c57ad0 Compare March 28, 2025 17:58
nisdas
nisdas previously approved these changes Mar 28, 2025
Base automatically changed from stategen-test-refactor to develop March 28, 2025 23:41
@kasey kasey dismissed nisdas’s stale review March 28, 2025 23:41

The base branch was changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants