
[improve] support combine flush async #74


Merged
13 commits merged into apache:master on May 26, 2025

Conversation

@JNSimba (Member) commented May 14, 2025

When enable.combine.flush=true is set, stream load is currently still synchronous, which blocks consumption of Kafka data while writing. This PR changes the flush to asynchronous to improve throughput.
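For context, here is a minimal sketch of what moving the combined stream load off the Kafka consumer thread could look like. The class and member names (AsyncStreamLoader, doStreamLoad, flushException) are illustrative, not the connector's actual API; the point is that the put path hands the buffered batch to a background executor instead of blocking on the Stream Load HTTP call.

```java
// Illustrative sketch only; names are hypothetical, not the connector's real classes.
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

public class AsyncStreamLoader {
    // Single worker thread keeps loads ordered; the caller no longer blocks on the HTTP call.
    private final ExecutorService flushExecutor = Executors.newSingleThreadExecutor();
    // Remember the first failure so a later commit can surface it.
    private final AtomicReference<Exception> flushException = new AtomicReference<>();

    /** Called from the put path: submit the buffered batch and return immediately. */
    public void flushAsync(List<String> bufferedRows) {
        flushExecutor.submit(() -> {
            try {
                doStreamLoad(bufferedRows);            // blocking Stream Load request
            } catch (Exception e) {
                flushException.compareAndSet(null, e); // keep the first error only
            }
        });
    }

    public Exception getFlushException() {
        return flushException.get();
    }

    private void doStreamLoad(List<String> rows) {
        // issue the Stream Load HTTP request to Doris (omitted in this sketch)
    }
}
```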

@DongLiang-0 (Contributor) commented:

With this asynchronous data-flushing mechanism, if an exception occurs during the asynchronous import after the offset has already been returned to Kafka, will the data be lost?

@JNSimba (Member, Author) commented May 15, 2025

> With this asynchronous data-flushing mechanism, if an exception occurs during the asynchronous import after the offset has already been returned to Kafka, will the data be lost?

@DongLiang-0 Thanks for the reminder. I added logic for this: when the sink's commit is called, it now checks whether a flushException was recorded; if one exists, the error is thrown directly instead of committing the offsets.
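A minimal sketch of that safeguard, assuming the task follows the standard Kafka Connect SinkTask API: any exception recorded by the background flush is re-thrown before offsets are handed back to Kafka, so offsets for data that never reached Doris are not committed and the records will be re-consumed. The preCommit hook and the loader field are assumptions for illustration, not necessarily the exact place the check lives in this PR.

```java
// Illustrative sketch; the loader field refers to the AsyncStreamLoader sketch above.
import java.util.Map;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.errors.ConnectException;
import org.apache.kafka.connect.sink.SinkTask;

public abstract class SafeCommitSinkTask extends SinkTask {
    protected AsyncStreamLoader loader; // hypothetical field, see the previous sketch

    @Override
    public Map<TopicPartition, OffsetAndMetadata> preCommit(
            Map<TopicPartition, OffsetAndMetadata> currentOffsets) {
        Exception e = loader.getFlushException();
        if (e != null) {
            // Failing here stops Kafka Connect from committing offsets for data that
            // never reached Doris, so the affected records are re-delivered.
            throw new ConnectException("Async stream load failed", e);
        }
        return super.preCommit(currentOffsets);
    }
}
```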

@JNSimba merged commit 0813cb5 into apache:master on May 26, 2025
4 checks passed