KAFKA-17509: Introduce a delayed action queue to complete purgatory actions outside purgatory. #17177
I've checked the 3 test failures. They are unrelated to the PR. I ran all of them locally and they all passed.
Thanks for the patch @adixitconfluent!
Here's my understanding of the current share fetch handling
- KafkaApis is calling into SPM to enqueue a share request
- SPM#maybeProcessFetchQueue runs recursively (! 🙀) until the queue is empty
- On each iteration, we get a share fetch request off the queue, do some validation and enqueue a DelayedShareFetch
Since adding the DelayedShareFetch to the purgatory is non-blocking, I'm pretty sure we are essentially not using the fetch queue any more. Or rather, we are now using the DelayedShareFetch purgatory as a fetch queue (which was the goal of the refactoring, after all).
For fetchQueue, I don't see too many remaining usages:
- Adding in SPM#fetchMessages (from KafkaApis)
- completeExceptionally in SPM#close
- Polling in SPM#maybeProcessFetchQueue
Since this closely matches our DelayedShareFetch usage, I'm wondering if we can remove the fetchQueue code in this PR.
WDYT?
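To illustrate the flow described above, here is a minimal sketch. It is not the real SharePartitionManager: `SketchFetchPath` and its string requests are hypothetical stand-ins, and the purgatory is reduced to a plain list. It only shows that when the hand-off to the purgatory does not block, the fetch queue is emptied by the same call that filled it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the share fetch path; not the real SharePartitionManager.
class SketchFetchPath {
    final Queue<String> fetchQueue = new ArrayDeque<>(); // stand-in for fetchQueue
    final List<String> purgatory = new ArrayList<>();    // stand-in for the DelayedShareFetch purgatory

    // Called by the request handler: enqueue the request, then try to drain the queue.
    void fetchMessages(String request) {
        fetchQueue.add(request);
        maybeProcessFetchQueue();
    }

    // Polls one request, hands it to the purgatory, then recurses until the queue is empty.
    void maybeProcessFetchQueue() {
        String request = fetchQueue.poll();
        if (request == null) {
            return;
        }
        // Validation would happen here. The hand-off below does not block,
        // so no request stays in fetchQueue beyond this call.
        purgatory.add(request);
        maybeProcessFetchQueue();
    }
}
```

In this sketch the queue is a pass-through: every request that enters fetchMessages is already in the purgatory stand-in by the time the call returns.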
hi @mumrah,
You're right, we don't need the fetch queue. I have created a JIRA https://issues.apache.org/jira/browse/KAFKA-17545 for it, and will prioritize it in the coming PRs.
@adixitconfluent : Thanks for the PR. Added a few comments.
core/src/main/java/kafka/server/share/SharePartitionManager.java
1 comment for my knowledge.
@adixitconfluent : Thanks for the updated PR. Added one more comment.
@adixitconfluent : Thanks for the updated PR. A couple more comments.
core/src/test/java/kafka/server/share/DelayedShareFetchTest.java
I was still a bit confused as to why we were completing other purgatory items from within the share fetch purgatory, so @adixitconfluent and I sync'd offline. The purpose of this was to allow other delayed share fetches to check for completion when new records are produced. Instead, I suggested that we include a TopicPartition key in addition to the SharePartition key when creating the delayed operation. This would let us tie into the HWM listener so we could directly complete pending share fetches when the HWM increases, and it would let us avoid passing the delayed action queue and purgatory into the delayed share fetch operation. We can keep this PR scoped to adding the delayed action queue and add the produce/HWM callback in a future PR. WDYT @junrao? Does this sound reasonable?
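A minimal sketch of the two-key idea follows. It is not Kafka's purgatory: `SketchPurgatory`, `PendingFetch`, and the string keys are hypothetical stand-ins for DelayedOperationPurgatory, DelayedShareFetch, and the real key classes. It shows one pending operation watched under both a partition key and a group-plus-partition key, so a callback that only knows the partition (such as an HWM listener) can still find it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a delayed share fetch: completes at most once.
class PendingFetch {
    private final AtomicBoolean completed = new AtomicBoolean(false);
    private final BooleanSupplier canComplete;

    PendingFetch(BooleanSupplier canComplete) {
        this.canComplete = canComplete;
    }

    // Returns true only for the call that actually completes the operation.
    boolean tryComplete() {
        return canComplete.getAsBoolean() && completed.compareAndSet(false, true);
    }

    boolean isCompleted() {
        return completed.get();
    }
}

// Hypothetical stand-in for a purgatory that watches one operation under several keys.
class SketchPurgatory {
    private final Map<String, List<PendingFetch>> watchers = new HashMap<>();

    synchronized void watch(PendingFetch op, List<String> keys) {
        for (String key : keys) {
            watchers.computeIfAbsent(key, k -> new ArrayList<>()).add(op);
        }
    }

    // Re-checks every operation watched under the key and
    // returns how many were completed by this call.
    synchronized int checkAndComplete(String key) {
        List<PendingFetch> ops = watchers.get(key);
        if (ops == null) {
            return 0;
        }
        int completedNow = 0;
        Iterator<PendingFetch> it = ops.iterator();
        while (it.hasNext()) {
            PendingFetch op = it.next();
            if (op.isCompleted()) {
                it.remove(); // already completed through another key
            } else if (op.tryComplete()) {
                completedNow++;
                it.remove();
            }
        }
        return completedNow;
    }
}
```

With this shape, an HWM-advance callback would call checkAndComplete with the partition key, while lock-release paths would use the group-plus-partition key; the operation itself needs no reference to the purgatory.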
For share fetch, even when there is no new data in the partition, there are currently situations where we may need to trigger a check on the delayed share fetch.
Currently, we need the action queue in delayed share fetch operation for 1. I am not sure that's truly needed. We could revisit that when we add the minBytes support. When there is new data in the partition, we may also need to trigger a check on the delayed share fetch. This could be done by adding a TopicPartition as the key for the delayed share fetch or by somehow mapping a TopicPartition to an existing SharePartition key.
@adixitconfluent : Thanks for the updated PR. Left one comment.
I'm not sure I understand this comment. AFAIK, we only lock the share partition when acquiring records or otherwise modifying the in-flight records. I think in all these cases (acquiring new records, releasing old records, acquisition timeouts) we have the opportunity to call the purgatory to see if requests are completed. Basically, I'd like delayed share fetch to be modeled similarly to delayed fetch where the operation only has enough context to complete itself (i.e., fetch params, ReplicaManager, and a few other things) and the calls to complete the operation all happen externally. I think we can achieve this since the completion scenarios all happen in SharePartitionManager or SharePartition.
Yea, this is how I was thinking it would work. Each ShareFetchRequest would have keys for each (topic, partition) and (topic, partition, group).
Got it. You are suggesting that instead of adding a delayed action in DelayedShareFetch.onComplete, we can add it in SharePartition.releaseFetchLock. The slight difference is that in the current approach, we can wait for all partitions' locks to be released before adding the delayed action. This potentially allows the woken-up delayedShareFetch to grab the lock on more partitions. If we do it inside SharePartition.releaseFetchLock, we may lose that opportunity. To get rid of delayedAction in DelayedShareFetch, we could potentially add a method in SharePartitionManager that takes a set of partitions and does the following. We can then pass in SharePartitionManager to DelayedShareFetch so that the method can be called there?
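As a rough illustration of such a method, here is a sketch under stated assumptions; it is not the code from this PR. `SketchActionQueue`, `SketchManager`, and `queuePurgatoryChecks` are hypothetical names, and the purgatory check is replaced by recording the key in a list. The point it shows is that one action is queued for the whole set of partitions after their locks are released, and nothing runs until the queue is drained later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for a delayed action queue: actions are recorded now
// and run later, after the caller has released its locks.
class SketchActionQueue {
    private final ConcurrentLinkedQueue<Runnable> actions = new ConcurrentLinkedQueue<>();

    void add(Runnable action) {
        actions.add(action);
    }

    // Runs queued actions until the queue is empty, including any added while draining.
    void tryCompleteActions() {
        Runnable action;
        while ((action = actions.poll()) != null) {
            action.run();
        }
    }
}

// Hypothetical stand-in for the manager-side helper; not the real SharePartitionManager.
class SketchManager {
    private final SketchActionQueue actionQueue;
    // Records which keys were checked; a real implementation would call the purgatory instead.
    final List<String> checkedKeys = new ArrayList<>();

    SketchManager(SketchActionQueue actionQueue) {
        this.actionQueue = actionQueue;
    }

    // Queues a single action that re-checks the purgatory for every given partition.
    // Intended to be called once, after the locks on all of these partitions are released.
    void queuePurgatoryChecks(String groupId, Set<String> partitions) {
        Set<String> snapshot = Set.copyOf(partitions);
        actionQueue.add(() -> {
            for (String partition : snapshot) {
                checkedKeys.add(groupId + "/" + partition);
            }
        });
    }
}
```

Because the checks run only when tryCompleteActions is called, the caller decides when that happens, for example at the end of request handling and outside any partition lock.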
@adixitconfluent : Thanks for the updated PR. One more comment.
delayedShareFetchPurgatory.checkAndComplete(
    new DelayedShareFetchKey(shareFetchData.groupId(), topicIdPartition)));
return BoxedUnit.UNIT;
});
What do you think about moving this logic to SharePartitionManager as discussed in #17177 (comment)?
hi @junrao @mumrah , yes, I think we can move it outside the DelayedShareFetch class to SharePartitionManager. I'll also add TopicPartition as a key for delayed share fetch along with SharePartition (which is already present right now). I have compiled these details in JIRA https://issues.apache.org/jira/browse/KAFKA-17703. Would it be fine to merge this PR and have me take up that JIRA in a future PR?
Thanks. Sounds good to me.
@adixitconfluent : Thanks for following up on the comment. LGTM
…ctions outside purgatory. (apache#17177) Add purgatory actions to DelayedActionQueue when partition locks are released after fetch in forceComplete. Reviewers: David Arthur <[email protected]>, Apoorv Mittal <[email protected]>, Jun Rao <[email protected]>
About
In reference to comment #16969 (comment), I have introduced a DelayedActionQueue to add purgatory actions and try to complete them.
- Purgatory actions are added to the DelayedActionQueue when partition locks are released after fetch in forceComplete. Also, code has been added to onExpiration to check the delayed actions queue and try to complete it. Since onExpiration serves as a callback for forceComplete, it should not lead to an infinite call stack.
- [...] DelayedShareFetchTest which were occurring due to insufficient mocking.
Testing
The code has been tested with the help of unit tests.