fix(actions): make async_commit_enabled actually use async offset commits#16776
Open
max-datahub wants to merge 1 commit intomasterfrom
Open
fix(actions): make async_commit_enabled actually use async offset commits#16776max-datahub wants to merge 1 commit intomasterfrom
max-datahub wants to merge 1 commit intomasterfrom
Conversation
The `ack()` method in `KafkaEventSource` had a logic error where the condition `if processed or not self.source_config.async_commit_enabled` short-circuited to synchronous commits whenever `processed=True` — which is the case for every normally processed event. This made the `async_commit_enabled` config flag effectively dead code. The constructor correctly configured librdkafka for the "manual store + auto commit" pattern (enable.auto.offset.store=False, enable.auto.commit=True), but `ack()` never reached the `_store_offsets()` path for processed events. The fix changes the condition to branch solely on `async_commit_enabled`, so sync mode always does synchronous commits and async mode always stores offsets for periodic background commit. Benchmarked at 25x throughput improvement (326 -> 8,200 events/sec on a single pod with a noop action) when async commits work correctly. Made-with: Cursor
949d679 to
847ea3c
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The
async_commit_enabledconfiguration flag inKafkaEventSourcehas been non-functional sincedatahub-actionswas moved into the OSS repo (#13120). When enabled, it was supposed to switch from per-event synchronous Kafka offset commits (~3ms each) to periodic background commits via librdkafka's auto-commit mechanism. Instead, due to a logic error inack(), synchronous commits were always performed for processed events regardless of the flag.The Bug
The
ack()method had this condition:When
processed=True(which is the case for every normally processed event — the pipeline coercesNonetoTrueinpipeline.py:run()), theorshort-circuits and the synchronous commit path is always taken, regardless ofasync_commit_enabled.processedasync_commit_enabledprocessed or not asyncThe async path was only reachable when
processed=False AND async_commit_enabled=True— a combination that essentially never occurs in practice.The Fix
The condition now branches solely on
async_commit_enabled:The constructor already correctly configured librdkafka for the "manual store + auto commit" pattern (
enable.auto.offset.store=False,enable.auto.commit=True). Only theack()method needed fixing.Impact
async_commit_enableddefaults toFalse, so the synchronous commit path remains the default.async_commit_enabled: truewill now actually get asynchronous commits, with offsets stored locally and committed periodically by librdkafka's background thread.async_commit_intervalmilliseconds of events may be redelivered after a consumer crash. For idempotent actions this is a non-issue. The interval is configurable (default 10s).Benchmark Results
This fix has been tested and verified at scale on an EKS cluster with MSK (kafka.t3.small × 2, 5 partitions):
Benchmark used a noop action (counter only — no I/O, no payload parsing) to isolate framework overhead from action logic. This measures the upper-bound framework ceiling: Kafka poll → Avro deserialization → event routing → offset commit.
Test Plan
ack()covering all 4 combinations ofasync_commit_enabled×processedasync_mode + processed=Truecase hits synchronous commit instead of store)Made with Cursor