Conversation

arvi18
@arvi18 arvi18 commented Apr 21, 2025

When deleting pending transactional offsets, we must preserve the list
of groups associated with the producer ID, otherwise we cannot clean up
the list of pending transactions for the group once the transaction is
committed or aborted.
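
The failure mode described above can be sketched in isolation. This is a minimal model, not the coordinator's code: `PendingTransactionsSketch`, its plain `HashMap` fields, the `"topic-partition"` string key, and the `pruneEmptyGroup` switch are all hypothetical names introduced for illustration. It assumes that ending a transaction walks the group ids still recorded for the producer in order to clear that producer from each group's open-transaction set.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of pending transactional offsets; not the real OffsetMetadataManager. */
final class PendingTransactionsSketch {
    // producer id -> group id -> pending offsets keyed by "topic-partition"
    private final Map<Long, Map<String, Map<String, Long>>> pendingByProducer = new HashMap<>();
    // group id -> producer ids with an open transaction touching that group
    private final Map<String, Set<Long>> openTransactionsByGroup = new HashMap<>();

    void commitTransactionalOffset(long producerId, String groupId, String topicPartition, long offset) {
        pendingByProducer
            .computeIfAbsent(producerId, p -> new HashMap<>())
            .computeIfAbsent(groupId, g -> new HashMap<>())
            .put(topicPartition, offset);
        openTransactionsByGroup.computeIfAbsent(groupId, g -> new HashSet<>()).add(producerId);
    }

    /** Deletes one pending offset; pruneEmptyGroup=true mimics the behaviour before the fix. */
    void deletePendingOffset(long producerId, String groupId, String topicPartition, boolean pruneEmptyGroup) {
        Map<String, Map<String, Long>> byGroup = pendingByProducer.get(producerId);
        if (byGroup == null) return;
        Map<String, Long> offsets = byGroup.get(groupId);
        if (offsets == null) return;
        offsets.remove(topicPartition);
        if (pruneEmptyGroup && offsets.isEmpty()) byGroup.remove(groupId);
    }

    /** Ends the transaction (commit or abort) and clears the producer from every recorded group. */
    void endTransaction(long producerId) {
        Map<String, Map<String, Long>> byGroup = pendingByProducer.remove(producerId);
        if (byGroup == null) return;
        // The recorded group ids are the only trace of which groups this producer touched.
        for (String groupId : byGroup.keySet()) {
            Set<Long> producers = openTransactionsByGroup.get(groupId);
            if (producers == null) continue;
            producers.remove(producerId);
            if (producers.isEmpty()) openTransactionsByGroup.remove(groupId);
        }
    }

    boolean hasOpenTransactions(String groupId) {
        return openTransactionsByGroup.containsKey(groupId);
    }
}
```

In this model, deleting the only pending offset with `pruneEmptyGroup=true` and then ending the transaction leaves the group marked as having an open transaction forever, while `pruneEmptyGroup=false` lets `endTransaction` clear it.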

Summary by CodeRabbit

  • Bug Fixes

    • Improved offset cleanup: pending transactional offsets now preserve empty group entries while regular offsets allow group removal, preventing premature group deletion.
  • Tests

    • Added a test covering cleanup when pending transactional offsets are deleted before commit, verifying correct tombstone emission and safe group deletion.

coderabbitai bot commented Apr 21, 2025

Walkthrough

Adds a boolean preserveGroups to the Offsets inner class in OffsetMetadataManager to control whether empty group entries are retained when removing offsets; main offsets use false, pending transactional offsets use true. Also adds a test covering cleanup when pending transactional offsets are deleted before transaction commit.
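
The removal rule in this walkthrough can be illustrated with a simplified stand-in for the `Offsets` class. This is a sketch under assumptions: `OffsetsSketch` is a hypothetical name, plain `HashMap`s replace `TimelineHashMap`, and a `Long` replaces `OffsetAndMetadata` so the example is self-contained. Only the `!preserveGroups && topicOffsets.isEmpty()` condition is taken from the PR.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical stand-in for the Offsets inner class. */
final class OffsetsSketch {
    private final boolean preserveGroups;
    // group id -> topic name -> partition id -> offset
    private final Map<String, Map<String, Map<Integer, Long>>> offsetsByGroup = new HashMap<>();

    OffsetsSketch(boolean preserveGroups) {
        this.preserveGroups = preserveGroups;
    }

    void put(String groupId, String topic, int partition, long offset) {
        offsetsByGroup
            .computeIfAbsent(groupId, g -> new HashMap<>())
            .computeIfAbsent(topic, t -> new HashMap<>())
            .put(partition, offset);
    }

    /** Removes one offset and returns the removed value, or null if it was absent. */
    Long remove(String groupId, String topic, int partition) {
        Map<String, Map<Integer, Long>> topicOffsets = offsetsByGroup.get(groupId);
        if (topicOffsets == null) return null;
        Map<Integer, Long> partitionOffsets = topicOffsets.get(topic);
        if (partitionOffsets == null) return null;

        Long removed = partitionOffsets.remove(partition);
        if (partitionOffsets.isEmpty()) topicOffsets.remove(topic);
        // Prune the now-empty group entry only when this store does not preserve groups.
        if (!preserveGroups && topicOffsets.isEmpty()) offsetsByGroup.remove(groupId);
        return removed;
    }

    boolean containsGroup(String groupId) {
        return offsetsByGroup.containsKey(groupId);
    }
}
```

With `new OffsetsSketch(false)` the group entry disappears once its last offset is removed; with `new OffsetsSketch(true)` the empty entry stays, which is what the pending transactional store relies on.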

Changes

Cohort / File(s) Change Summary
Offsets class change
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/OffsetMetadataManager.java
Offsets constructor changed to Offsets(boolean preserveGroups); stores preserveGroups; removal logic now only deletes a group entry when preserveGroups == false and topicOffsets is empty; main offsets initialized with Offsets(false); pending transactional offsets are created with Offsets(true) during replay.
Test addition
group-coordinator/src/test/java/org/apache/kafka/coordinator/group/OffsetMetadataManagerTest.java
New test testCleanupExpiredOffsetsWithDeletedPendingTransactionalOffsets() added to validate cleanup behavior and tombstone emission when pending transactional offsets are deleted before commit.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant OffsetMetadataManager
    participant OffsetsMain as Offsets(false)
    participant OffsetsPending as Offsets(true)
    participant Group

    Client->>OffsetMetadataManager: Commit offset (no producerId)
    OffsetMetadataManager->>OffsetsMain: add/update offset
    Note right of OffsetsMain#gray: preserveGroups = false

    Client->>OffsetMetadataManager: Commit offset (with producerId)
    OffsetMetadataManager->>OffsetsPending: add/update pending transactional offset
    Note right of OffsetsPending#lightblue: preserveGroups = true

    Client->>OffsetMetadataManager: Delete pending transactional offset (tombstone)
    OffsetMetadataManager->>OffsetsPending: remove offset
    OffsetsPending-->>OffsetMetadataManager: retain empty group (preserveGroups=true)

    OffsetMetadataManager->>OffsetMetadataManager: cleanupExpiredOffsets()
    OffsetsMain-->>OffsetMetadataManager: remove group if empty (preserveGroups=false)
    OffsetsPending-->>OffsetMetadataManager: keep group until transaction marker cleared

    OffsetMetadataManager->>Group: decide group deletion / emit tombstones

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

In burrows of code I quietly tread,
A flag to remember the groups we have led.
Transactions that linger I gently retain,
Till tombstones are signed and all is unchained.
The rabbit hops proud — tests prove it was well-bred. 🐇

@visz11
Collaborator

visz11 commented Apr 21, 2025

@coderabbitai full review

coderabbitai bot commented Apr 21, 2025

✅ Actions performed

Full review triggered.

@visz11
Collaborator

visz11 commented Apr 23, 2025

@coderabbitai full review

coderabbitai bot commented Apr 23, 2025

✅ Actions performed

Full review triggered.

@visz11
Collaborator

visz11 commented Apr 24, 2025

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request introduces a mechanism to preserve group information when deleting transactional offsets, which is crucial for cleaning up pending transactions. The changes seem well-structured and include a new test case to validate the cleanup behavior. Overall, the changes look good, but I have a few suggestions for improvement.

Summary of Findings

  • Clarity of 'preserveGroups' usage: The purpose of the preserveGroups flag could be made more explicit in the Offsets class constructor to improve readability.
  • Test case description: The test case testCleanupExpiredOffsetsWithDeletedPendingTransactionalOffsets could benefit from a more descriptive name or additional comments to clarify its purpose.

Merge Readiness

The changes appear to be well-implemented and include a new test case. However, addressing the comments regarding the clarity of preserveGroups and the test case description would further improve the quality of the code. I am unable to directly approve the pull request, and recommend that others review and approve this code before merging. I recommend that the pull request not be merged until the high severity issues are addressed (at a minimum).

@visz11
Collaborator

visz11 commented Jul 1, 2025

/do-refacto

refacto-test bot commented Jul 1, 2025

Refacto is reviewing this PR. Please wait for the review comments to be posted.

@arvi18
Author

arvi18 commented Sep 2, 2025

/refacto-test

refacto-test bot commented Sep 2, 2025

Refacto is reviewing this PR. Please wait for the review comments to be posted.

refacto-test bot commented Sep 2, 2025

Code Review: Transaction Management in Offset Handling

👍 Well Done
Group Tracking Improved

Preserving groups with pending transactions prevents premature group deletion.

Consistent Error Handling

Test case properly validates error handling for deleted pending transactional offsets.

📌 Files Processed
  • group-coordinator/src/test/java/org/apache/kafka/coordinator/group/OffsetMetadataManagerTest.java
  • group-coordinator/src/main/java/org/apache/kafka/coordinator/group/OffsetMetadataManager.java
📝 Additional Comments
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/OffsetMetadataManager.java (3)
Transaction Tracking Enhancement

Removing empty groups loses tracking of pending transactions. This could lead to inconsistent transaction state and potential data loss when transactions are committed or aborted.

Standards:

  • CWE-667
  • OWASP-A04
Group Preservation Flag

The new preserveGroups flag introduces a conditional check in the remove() method that potentially keeps empty group entries in memory. While this fixes the transaction handling issue, it could lead to increased memory usage over time if many transactional groups are created but never fully cleaned up. Consider implementing a periodic cleanup mechanism for groups that have been preserved but are no longer associated with any active transactions.

Standards:

  • ISO-IEC-25010-Performance-Resource-Utilization
  • Algorithm-Opt-Memory-Management
Boolean Parameter Flag

Boolean parameter flags can reduce code clarity. Consider using descriptive factory methods or builder pattern to make the intention clearer when constructing Offsets objects.

Standards:

  • Clean-Code-Functions
  • Design-Pattern-Builder

 this.config = config;
 this.metrics = metrics;
-this.offsets = new Offsets();
+this.offsets = new Offsets(false);

Missing Constructor Parameter

The Offsets constructor is modified to take a boolean parameter, but the main offsets instance is initialized with the parameterless constructor which no longer exists. This would cause compilation failure since the constructor signature has changed.

Standards
  • Algorithm-Correctness-Parameter-Validation
  • Logic-Verification-Constructor-Consistency

Comment on lines +212 to 214
private Offsets(boolean preserveGroups) {
this.preserveGroups = preserveGroups;
this.offsetsByGroup = new TimelineHashMap<>(snapshotRegistry, 0);

Uninitialized Flag Parameter

The Offsets constructor was modified to accept a preserveGroups parameter but the original constructor call at line 288 passes false without explanation. This could lead to inconsistent group preservation behavior.

Standards
  • ISO-IEC-25010-Functional-Correctness-Appropriateness
  • ISO-IEC-25010-Reliability-Fault-Tolerance

Comment on lines +201 to 214
+/**
+ * Whether to preserve empty entries for groups when removing offsets.
+ * We use this to keep track of the groups associated with pending transactions.
+ */
+private final boolean preserveGroups;
+
 /**
  * The offsets keyed by group id, topic name and partition id.
  */
 private final TimelineHashMap<String, TimelineHashMap<String, TimelineHashMap<Integer, OffsetAndMetadata>>> offsetsByGroup;

-private Offsets() {
+private Offsets(boolean preserveGroups) {
+    this.preserveGroups = preserveGroups;
     this.offsetsByGroup = new TimelineHashMap<>(snapshotRegistry, 0);

Insufficient Constructor Documentation

The constructor parameter is documented but lacks explanation of when to use true/false values and their implications. This makes future modifications risky as developers may not understand the full impact of this parameter.

Standards
  • Clean-Code-Documentation
  • Clean-Code-Naming

@arvi18
Author

arvi18 commented Sep 2, 2025

@coderabbitai full review

coderabbitai bot commented Sep 2, 2025

✅ Actions performed

Full review triggered.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/OffsetMetadataManager.java (1)

859-862: Bug: cleanupExpiredOffsets may delete a group with open transactions when it has no committed offsets

Early return ignores openTransactionsByGroup. This contradicts the later guard and can incorrectly mark the group deletable while a transaction is still open.

Apply:

-        if (offsetsByTopic == null) {
-            return true;
-        }
+        if (offsetsByTopic == null) {
+            // No committed offsets left; only deletable if there are no open transactions.
+            return !openTransactionsByGroup.containsKey(groupId);
+        }

Consider adding a unit test (see suggested test in the test file review).
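
The proposed guard can be modelled on its own to show the intended decision. This is a minimal sketch under assumptions: `CleanupGuardSketch` and its methods are hypothetical, offset expiration is left out entirely, and the non-null branch is reduced to "no committed offsets remain and no transaction is open".

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the group-deletability decision only; expiration is not modelled. */
final class CleanupGuardSketch {
    // group id -> topic name -> committed offset
    private final Map<String, Map<String, Long>> committedOffsetsByGroup = new HashMap<>();
    // group id -> producer ids with an open transaction
    private final Map<String, Set<Long>> openTransactionsByGroup = new HashMap<>();

    void commit(String groupId, String topic, long offset) {
        committedOffsetsByGroup.computeIfAbsent(groupId, g -> new HashMap<>()).put(topic, offset);
    }

    void openTransaction(String groupId, long producerId) {
        openTransactionsByGroup.computeIfAbsent(groupId, g -> new HashSet<>()).add(producerId);
    }

    void endTransaction(String groupId, long producerId) {
        Set<Long> producers = openTransactionsByGroup.get(groupId);
        if (producers == null) return;
        producers.remove(producerId);
        if (producers.isEmpty()) openTransactionsByGroup.remove(groupId);
    }

    boolean isDeletable(String groupId) {
        Map<String, Long> offsetsByTopic = committedOffsetsByGroup.get(groupId);
        if (offsetsByTopic == null) {
            // No committed offsets left; only deletable if there are no open transactions.
            return !openTransactionsByGroup.containsKey(groupId);
        }
        return offsetsByTopic.isEmpty() && !openTransactionsByGroup.containsKey(groupId);
    }
}
```

A group with no committed offsets and one open transaction is reported as not deletable until `endTransaction` is called, which is the behaviour the suggested early-return change aims for.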

🧹 Nitpick comments (2)
group-coordinator/src/test/java/org/apache/kafka/coordinator/group/OffsetMetadataManagerTest.java (1)

2596-2656: Add a test where the group has only pending offsets (no committed ones)

This would catch an edge case where a group with zero committed offsets but open transactions should not be considered deletable by cleanup. Suggest adding a test that: creates only a transactional commit, asserts cleanupExpiredOffsets(...) returns false, then commits/aborts the txn and asserts it returns true.

Apply this test diff near the existing tests:

+    @Test
+    public void testCleanupExpiredOffsetsWithOnlyPendingTransactionalOffsets() {
+        GroupMetadataManager groupMetadataManager = mock(GroupMetadataManager.class);
+        Group group = mock(Group.class);
+        OffsetMetadataManagerTestContext context = new OffsetMetadataManagerTestContext.Builder()
+            .withGroupMetadataManager(groupMetadataManager)
+            .withOffsetsRetentionMinutes(1)
+            .build();
+
+        long t0 = context.time.milliseconds();
+        // Only a pending transactional offset (no committed offsets).
+        context.commitOffset(10L, "group-id", "foo", 0, 101L, 0, t0 + 10);
+
+        when(groupMetadataManager.group("group-id")).thenReturn(group);
+        when(group.offsetExpirationCondition()).thenReturn(Optional.of(
+            new OffsetExpirationConditionImpl(o -> o.commitTimestampMs)));
+        when(group.isSubscribedToTopic("foo")).thenReturn(false);
+
+        context.time.sleep(Duration.ofMinutes(1).toMillis());
+
+        // Should not be deletable while the transaction is open.
+        List<CoordinatorRecord> records = new ArrayList<>();
+        assertFalse(context.cleanupExpiredOffsets("group-id", records));
+        assertEquals(List.of(), records);
+
+        // Once the transaction ends, the group becomes deletable.
+        context.replayEndTransactionMarker(10L, TransactionResult.ABORT);
+        assertTrue(context.cleanupExpiredOffsets("group-id", new ArrayList<>()));
+    }
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/OffsetMetadataManager.java (1)

201-206: Constructor flag is clear; add brief usage doc for true/false

The preserveGroups intent is good. A short note on where each mode is used (main store=false, pending txns=true) will help future maintainers.

     private class Offsets {
-        /**
-         * Whether to preserve empty entries for groups when removing offsets.
-         * We use this to keep track of the groups associated with pending transactions.
-         */
+        /**
+         * Whether to preserve empty entries for groups when removing offsets.
+         * Usage:
+         * - preserveGroups=false for the main committed offsets store (prune empty groups).
+         * - preserveGroups=true for pending transactional offsets (retain empty groups so we
+         *   can still track open transactions per group until the txn end marker arrives).
+         */
         private final boolean preserveGroups;
@@
-        private Offsets(boolean preserveGroups) {
+        private Offsets(boolean preserveGroups) {
             this.preserveGroups = preserveGroups;
             this.offsetsByGroup = new TimelineHashMap<>(snapshotRegistry, 0);
         }

Also applies to: 212-215

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between ed79212 and d670f68.

📒 Files selected for processing (2)
  • group-coordinator/src/main/java/org/apache/kafka/coordinator/group/OffsetMetadataManager.java (4 hunks)
  • group-coordinator/src/test/java/org/apache/kafka/coordinator/group/OffsetMetadataManagerTest.java (1 hunks)
🔇 Additional comments (4)
group-coordinator/src/test/java/org/apache/kafka/coordinator/group/OffsetMetadataManagerTest.java (1)

2596-2656: Good regression coverage for deleted pending transactional offsets

The scenario is well modeled and ensures the group isn't deletable while the transaction is still open, even after the pending offset tombstone is written.

group-coordinator/src/main/java/org/apache/kafka/coordinator/group/OffsetMetadataManager.java (3)

266-267: Correct: avoid removing empty group entries for pending transactional offsets

This is the key to keep group->producer linkage after tombstoning pending offsets.


288-288: Main store initialization with preserveGroups=false is appropriate

Maintains prior pruning behavior for committed offsets.


1005-1006: Pending store initialization with preserveGroups=true is appropriate

Ensures group association survives offset deletions until txn end.

@visz11
Collaborator

visz11 commented Sep 11, 2025

/refacto-test

refacto-test bot commented Sep 11, 2025

Refacto is reviewing this PR. Please wait for the review comments to be posted.

refacto-test bot commented Sep 11, 2025

Code Review: Transactional Offset Management

👍 Well Done
Group Tracking Enhancement

Preserves groups with pending transactions when deleting offsets.

Transaction Tracking

Preventing premature group deletion improves transaction consistency.

Comprehensive Test Coverage

New test case thoroughly validates transactional offset deletion behavior.

📌 Files Processed
  • group-coordinator/src/test/java/org/apache/kafka/coordinator/group/OffsetMetadataManagerTest.java
  • group-coordinator/src/main/java/org/apache/kafka/coordinator/group/OffsetMetadataManager.java
📝 Additional Comments
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/OffsetMetadataManager.java (5)
Conditional Removal Logic

The conditional logic modification adds complexity that could be extracted into a method with intention-revealing name. This would improve readability and make the preservation logic more explicit.

            if (shouldRemoveGroup(preserveGroups, topicOffsets))
                offsetsByGroup.remove(groupId);

Standards:

  • Clean-Code-Functions
  • SOLID-SRP
Transaction Completion Cleanup

When transactions complete, consider adding logic to clean up empty group entries that were preserved during the transaction. This would prevent memory growth over time as transactional offsets are deleted but their group entries remain.

Standards:

  • ISO-IEC-25010-Performance-Resource-Utilization
  • Algorithm-Opt-Memory-Management
Factory Method Opportunity

Boolean literal in constructor call reduces readability. Consider introducing factory methods like createPreservingGroups() and createStandardOffsets() to improve code clarity and intent.

Standards:

  • Design-Pattern-Factory
  • Clean-Code-Meaningful-Names
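
The factory-method idea from the "Factory Method Opportunity" comment above could look roughly like the following. This is a sketch, not part of the PR: `OffsetsFactorySketch` and `preservesGroups()` are hypothetical, and only the flag is modelled. The factory names `createStandardOffsets()` and `createPreservingGroups()` are the ones proposed in the comment.

```java
/** Hypothetical stand-in for the Offsets class; only the preserveGroups flag is modelled. */
final class OffsetsFactorySketch {
    private final boolean preserveGroups;

    private OffsetsFactorySketch(boolean preserveGroups) {
        this.preserveGroups = preserveGroups;
    }

    /** Main committed-offsets store: empty group entries are pruned. */
    static OffsetsFactorySketch createStandardOffsets() {
        return new OffsetsFactorySketch(false);
    }

    /** Pending transactional store: empty group entries are retained. */
    static OffsetsFactorySketch createPreservingGroups() {
        return new OffsetsFactorySketch(true);
    }

    boolean preservesGroups() {
        return preserveGroups;
    }
}
```

In the PR, `Offsets` is a non-static inner class that reads `snapshotRegistry` from the enclosing manager, so equivalent factories there would probably have to be instance methods on the outer class rather than static methods as shown here.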
Transaction State Management

Consider adding timeout mechanism for preserved groups. Long-running or abandoned transactions could leave orphaned group entries consuming memory indefinitely.

Standards:

  • CWE-404
  • OWASP-A04
Document Preservation Parameter

Documentation explains what the parameter does but not why different values are used. Could lead to future reliability issues if modified incorrectly.

Standards:

  • ISO-IEC-25010-Functional-Correctness-Appropriateness
  • ISO-IEC-25010-Reliability-Maturity

 this.config = config;
 this.metrics = metrics;
-this.offsets = new Offsets();
+this.offsets = new Offsets(false);

Missing Group Preservation

Main offsets store initialized with preserveGroups=false while pending transactional offsets use true. Inconsistent behavior could lead to premature group deletion.

Suggested change
-this.offsets = new Offsets(false);
+this.offsets = new Offsets(true);
Standards
  • ISO-IEC-25010-Reliability-Fault-Tolerance
  • ISO-IEC-25010-Functional-Correctness-Appropriateness

Comment on lines +266 to 267
if (!preserveGroups && topicOffsets.isEmpty())
offsetsByGroup.remove(groupId);

Group Preservation Overhead

The preserveGroups flag controls whether empty group entries are retained in memory. For transactional offsets, this causes unnecessary memory retention as groups with deleted offsets remain in the map until transaction completion.

Suggested change
-if (!preserveGroups && topicOffsets.isEmpty())
-    offsetsByGroup.remove(groupId);
+if (!preserveGroups && topicOffsets.isEmpty())
+    offsetsByGroup.remove(groupId);
+else if (preserveGroups && topicOffsets.isEmpty() && !openTransactionsByGroup.containsKey(groupId))
+    offsetsByGroup.remove(groupId);
Standards
  • ISO-IEC-25010-Performance-Resource-Utilization
  • Algorithm-Opt-Memory-Management

Comment on lines +266 to 267
if (!preserveGroups && topicOffsets.isEmpty())
offsetsByGroup.remove(groupId);

Group Preservation Logic

The original code incorrectly removed groups with empty topic offsets even when they had pending transactions. This could cause premature group deletion and data inconsistency when transactions complete later.

Suggested change
-if (!preserveGroups && topicOffsets.isEmpty())
-    offsetsByGroup.remove(groupId);
+if (!preserveGroups && topicOffsets.isEmpty())
+    offsetsByGroup.remove(groupId);
Standards
  • Business-Rule-Validation
  • Logic-Verification-Completeness
  • Algorithm-Correctness-State-Preservation

Comment on lines +212 to 214
private Offsets(boolean preserveGroups) {
this.preserveGroups = preserveGroups;
this.offsetsByGroup = new TimelineHashMap<>(snapshotRegistry, 0);

Boolean Parameter Clarity

Boolean parameter lacks semantic clarity, making intention hard to understand from call sites. Consider using a more descriptive enum or builder pattern to make the parameter's purpose self-documenting.

Standards
  • Clean-Code-Meaningful-Names
  • Clean-Code-Functions

 private final TimelineHashMap<String, TimelineHashMap<String, TimelineHashMap<Integer, OffsetAndMetadata>>> offsetsByGroup;

-private Offsets() {
+private Offsets(boolean preserveGroups) {

Potential Resource Exhaustion

Preserving empty group entries indefinitely could lead to memory exhaustion. Attackers could create numerous groups with pending transactions, depleting coordinator resources.

Suggested change
private Offsets(boolean preserveGroups) {
this.preserveGroups = preserveGroups;
// TODO: Add monitoring and limits for preserved empty groups to prevent resource exhaustion
Standards
  • CWE-400
  • OWASP-A04


3 participants