Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-17509: Introduce a delayed action queue to complete purgatory actions outside purgatory. #17177

Merged
merged 9 commits into from
Oct 4, 2024

Conversation

adixitconfluent
Copy link
Contributor

@adixitconfluent adixitconfluent commented Sep 12, 2024

About

In reference to comment #16969 (comment) , I have introduced a DelayedActionQueue to add purgatory actions and try to complete them.

  1. I've added code to add purgatory actions to DelayedActionQueue when partition locks are released after fetch in forceComplete. Also, code has been added to onExpiration to check the delayed actions queue and try to complete it. Since onExpiration serves as a callback for forceComplete, it should not lead to infinite call stack.
  2. Also, fixed a few warning in some tests in DelayedShareFetchTestwhich were occurring due to insufficient mocking.

Testing

The code has been tested with the help of unit tests.

@apoorvmittal10 apoorvmittal10 added the KIP-932 Queues for Kafka label Sep 12, 2024
@adixitconfluent
Copy link
Contributor Author

I've checked the 3 test failures. They are unrelated to the PR. I ran all of them locally and they all passed.

Copy link
Member

@mumrah mumrah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch @adixitconfluent!

Here's my understanding of the current share fetch handling

  • KafkaApis is calling into SPM to enqueue a share request
  • SPM#maybeProcessFetchQueue runs recursively (! 🙀) until the queue is empty
  • On each iteration, we get a share fetch request off the queue, do some validation and enqueue a DelayedShareFetch

Since adding the DelayedShareFetch to the purgatory is non-blocking, I'm pretty sure we are essentially not using the fetch queue any more. Or rather, we are now using the DelayedShareFetch purgatory as a fetch queue (which was the goal of the refactoring, after all).

For fetchQueue I don't see too many remaining usages:

  • Adding in SPM#fetchMessages (from KafkaApis)
  • completeExceptionally in SPM#close
  • Polling in SPM#maybeProcessFetchQueue

Since this closely matches our DelayedShareFetch usage, I'm wondering if we can remove the fetchQueue code in this PR.

WDYT?

@adixitconfluent
Copy link
Contributor Author

adixitconfluent commented Sep 13, 2024

hi @mumrah,

I'm wondering if we can remove the fetchQueue code in this PR.

You're right, we don't need the fetch queue. I have created a JIRA https://issues.apache.org/jira/browse/KAFKA-17545 for it, and will prioritize it in the coming PRs.

Copy link
Contributor

@junrao junrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adixitconfluent : Thanks for the PR. Added a few comments.

@github-actions github-actions bot added the core Kafka Broker label Sep 30, 2024
@adixitconfluent
Copy link
Contributor Author

Hi @junrao @mumrah , I have responded/changed my code to address your comments. Please take a look when you can, thanks!

Copy link
Collaborator

@apoorvmittal10 apoorvmittal10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 comment for my knowledge.

core/src/main/scala/kafka/server/ReplicaManager.scala Outdated Show resolved Hide resolved
Copy link
Contributor

@junrao junrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adixitconfluent : Thanks for the updated PR. Added one more comment.

@adixitconfluent adixitconfluent requested a review from junrao October 1, 2024 06:42
Copy link
Contributor

@junrao junrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adixitconfluent : Thanks for the updated PR. A couple of more comments.

@adixitconfluent adixitconfluent requested a review from junrao October 3, 2024 05:44
@mumrah
Copy link
Member

mumrah commented Oct 3, 2024

I was still a bit confused as to why we were completing other purgatory items from within the share fetch purgatory, so @adixitconfluent and I sync'd offline. The purpose of this was to allow other delayed share fetches to check for completion in the case of new records being produced. Instead of this, I suggested that we should include a TopicPartition key in addition to the SharPartition key when creating the delayed operation. This would let us tie into the HWM listener so we could directly complete pending share fetches when the HWM increased.

This would let us avoid passing the delayed action queue and purgatory into the delayed share fetch operation. We can keep this PR scoped to adding the delayed action queue and add the produce/HWM callback in a future PR

WDYT @junrao? Does this sound reasonable?

@junrao
Copy link
Contributor

junrao commented Oct 3, 2024

The purpose of this was to allow other delayed share fetches to check for completion in the case of new records being produced.

For share fetch, even when there is no new data in the partition, currently there are situations that we may need to trigger a check on the delayed share fetch.

  1. The share partition lock is freed.
  2. Some of the previously acquired records become available (because of client acknowledgement, ack timeout, client close, etc).

Currently, we need the action queue in delayed share fetch operation for 1. I am not sure that's truly needed. We could revisit that when we add the minBytes support.

When there is new data in the partition, we may also need to trigger a check on the delayed share fetch. This could be done by adding a TopicPartition as the key for the delayed share fetch or somehow map a TopicPartition to an existing SharePartition key.

Copy link
Contributor

@junrao junrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adixitconfluent : Thanks for the updated PR. Left one comment.

core/src/main/scala/kafka/server/KafkaApis.scala Outdated Show resolved Hide resolved
@mumrah
Copy link
Member

mumrah commented Oct 3, 2024

The share partition lock is freed

I'm not sure I understand this comment. AFAIK, we only lock the share partition when acquiring records or otherwise modifying the in-flight records. I think in all these cases (acquiring new records, releasing old records, acquisition timeouts) we have the opportunity to call the purgatory to see if requests are completed.

Basically, I'd like delayed share fetch to be modeled similarly to delayed fetch where the operation only has enough context to complete itself (i.e., fetch params, ReplicaManager, and a few other things) and the calls to complete the operation all happen externally. I think we can achieve this since the completion scenarios all happen in SharePartitionManager or SharePartition.

This could be done by adding a TopicPartition as the key for the delayed share fetch

Yea, this is how I was thinking it would work. Each ShareFetchRequest would have keys for each (topic, partition) and (topic, partition, group).

@adixitconfluent adixitconfluent requested a review from junrao October 3, 2024 18:50
@junrao
Copy link
Contributor

junrao commented Oct 3, 2024

Basically, I'd like delayed share fetch to be modeled similarly to delayed fetch where the operation only has enough context to complete itself (i.e., fetch params, ReplicaManager, and a few other things) and the calls to complete the operation all happen externally. I think we can achieve this since the completion scenarios all happen in SharePartitionManager or SharePartition.

Got it. You are suggesting that instead of adding a delayed action in DelayedShareFetch.onComplete, we can add it in SharePartition.releaseFetchLock. The slight difference is that in the current approach, we can wait for all partitions' lock to be released before adding the delayed action. This potentially allows the woken up delayedShareFetch to grab the lock on more partitions. If we do it inside SharePartition.releaseFetchLock, we may lose that opportunity.

To get rid of delayedAction in DelayedShareFetch, we could potentially add a method in SharePartitionManager that takes a set of partitions and does the following. We can then pass in SharePartitionManager to DelayedShareFetch so that the method can be called there?

                    delayedActionQueue.add(() -> {
                        result.keySet().forEach(topicIdPartition ->
                            delayedShareFetchPurgatory.checkAndComplete(
                                new DelayedShareFetchKey(shareFetchData.groupId(), topicIdPartition)));
                        return BoxedUnit.UNIT;
                    });

Copy link
Contributor

@junrao junrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adixitconfluent : Thanks for the updated PR. One more comment.

delayedShareFetchPurgatory.checkAndComplete(
new DelayedShareFetchKey(shareFetchData.groupId(), topicIdPartition)));
return BoxedUnit.UNIT;
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about moving this logic to SharePartitionManager as discussed in #17177 (comment)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @junrao @mumrah , yes, I think we can move it outside DelayedShareFetch class to SharePartitionManager. Also, I'll also add TopicPartition as a key for delayed share fetch along with SharePartition (that is already present right now). I have compiled these details in JIRA https://issues.apache.org/jira/browse/KAFKA-17703. Will it be fine if we can merge this PR and I'll take up that JIRA in a future PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Sounds good to me.

@adixitconfluent adixitconfluent requested a review from junrao October 4, 2024 05:57
Copy link
Contributor

@junrao junrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adixitconfluent : Thanks for following up on the comment. LGTM

@junrao junrao merged commit 455c79c into apache:trunk Oct 4, 2024
9 checks passed
tedyu pushed a commit to tedyu/kafka that referenced this pull request Jan 6, 2025
…ctions outside purgatory. (apache#17177)

Add purgatory actions to DelayedActionQueue when partition locks are released after fetch in forceComplete. 

Reviewers: David Arthur <[email protected]>, Apoorv Mittal <[email protected]>, Jun Rao <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants