
reduce time of lock holding by flush #2009

Open · wants to merge 3 commits into main
Conversation

@phiSgr (Collaborator) commented Jan 30, 2025

No description provided.

vercel bot commented Jan 30, 2025

All four Vercel deployments (framework-docs, framework-landing, moose-logs-frontend, moose-product-analytics) were ✅ Ready as of Jan 30, 2025, 8:32pm (UTC).

@@ -120,6 +121,59 @@ impl<C: ClickHouseClientTrait + 'static> Inserter<C> {
Ok(())
}

async fn get_front_batch(&self) -> Option<Batch> {
Collaborator commented:
I think this removes the at-least-once delivery, no? Can you explain why it doesn't?

The thought is that if another process claims the queue in parallel and that batch goes through, then if the next batch gets committed, the intermediary batch won't get replayed.

@phiSgr (Collaborator, Author) commented Jan 30, 2025

I think there's only one spawned task doing the flush. If that insert failed, the batch gets returned to the queue and consumed by that same flushing process.
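The pattern under discussion can be sketched roughly as follows. This is a minimal, hypothetical illustration of "pop the front batch under a short lock, insert outside the lock, requeue on failure" — the `Batch` type, queue field, and method names here are assumptions for illustration, not the actual Moose framework types, and the real code is async against ClickHouse rather than synchronous:

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

// Illustrative stand-in for the real batch type.
type Batch = Vec<String>;

struct Inserter {
    queue: Mutex<VecDeque<Batch>>,
}

impl Inserter {
    // Hold the lock only long enough to pop the front batch,
    // instead of holding it for the whole flush.
    fn get_front_batch(&self) -> Option<Batch> {
        self.queue.lock().unwrap().pop_front()
    }

    // On a failed insert, return the batch to the front of the queue
    // so the single flushing task retries it on the next pass. This is
    // what preserves at-least-once delivery despite releasing the lock.
    fn requeue_front(&self, batch: Batch) {
        self.queue.lock().unwrap().push_front(batch);
    }

    // One flush attempt; `insert` models the (possibly slow) ClickHouse
    // insert, which runs without the lock held. Returns true on success.
    fn flush_once(&self, insert: impl Fn(&Batch) -> Result<(), ()>) -> bool {
        let Some(batch) = self.get_front_batch() else {
            return false;
        };
        if insert(&batch).is_err() {
            self.requeue_front(batch);
            return false;
        }
        true
    }
}
```

Because only one flushing task ever pops from this queue, a failed batch is always retried before the next one is attempted, so no committed batch can "skip over" an unflushed one.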

Collaborator commented:

Feels like if we merge this we need to add a test that shows we don't drop data, since that's a guarantee the current implementation provides.

Collaborator commented:

Could be a manual test on your laptop where you send a bunch of things, kill the process, restart, and see that records don't get dropped.

@callicles (Collaborator) commented

What should we do about this?

@phiSgr (Collaborator, Author) commented Feb 18, 2025

I do think it's worth merging. I'll tweak the code to simulate failures to test.
