
Kafka node offline causes consumer group queries to fail entirely #2327

@cobolbaby

Description

When a single Kafka broker/node goes offline, querying consumer group information results in complete failure instead of partial degradation or graceful handling.

Steps to Reproduce

  1. Set up a Kafka cluster with multiple brokers.
  2. Create a topic and a consumer group consuming from it.
  3. Shut down one Kafka broker (simulate node failure).
  4. Query consumer group information (for example, consumer group lag) while the broker is still offline.

Expected Behavior

  • Consumer group queries should still return available information from remaining brokers.
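
To make the expected partial degradation concrete, here is a minimal sketch of how lag could be computed per partition while tolerating failed end-offset lookups. It is not the project's actual implementation: `OffsetResult`, `LagReport`, and `compute_lags` are hypothetical names, and the input shapes (committed offsets and end-offset results keyed by topic and partition) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# (topic, partition) key; an assumed shape, not taken from the project.
Partition = Tuple[str, int]


@dataclass
class OffsetResult:
    """Outcome of one end-offset lookup (hypothetical type)."""
    offset: Optional[int] = None
    error: Optional[str] = None


@dataclass
class LagReport:
    """Lags for reachable partitions plus errors for the rest (hypothetical type)."""
    lags: Dict[Partition, int] = field(default_factory=dict)
    errors: Dict[Partition, str] = field(default_factory=dict)

    @property
    def partial(self) -> bool:
        # True when at least one partition could not be resolved.
        return bool(self.errors)


def compute_lags(
    committed: Dict[Partition, int],
    end_offsets: Dict[Partition, OffsetResult],
) -> LagReport:
    """Compute lag where the end offset is known; record an error otherwise."""
    report = LagReport()
    for tp, commit in committed.items():
        result = end_offsets.get(tp)
        if result is None or result.offset is None:
            # Keep going instead of failing the whole query.
            reason = result.error if result and result.error else "end offset unavailable"
            report.errors[tp] = reason
        else:
            # Clamp at zero in case the commit is ahead of a stale end offset.
            report.lags[tp] = max(result.offset - commit, 0)
    return report


# Example: partition 1 is led by the offline broker.
committed = {("orders", 0): 90, ("orders", 1): 40}
end_offsets = {
    ("orders", 0): OffsetResult(offset=100),
    ("orders", 1): OffsetResult(error="LISTENER_NOT_FOUND"),
}
report = compute_lags(committed, end_offsets)
```

With this shape, a response could still carry the lag for `("orders", 0)` and flag `("orders", 1)` as unavailable, rather than returning a single error for the whole consumer group.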

Actual Behavior

  • All consumer group queries fail when one broker is offline.
{
    "statusCode": 404,
    "message": "Failed to get consumer group lags: failed to list end offsets for topics: request ListOffsets has 1 separate shard errors, first: LISTENER_NOT_FOUND: There is no listener on the leader broker that matches the listener on which metadata request was processed."
}

Impact

  • Affects observability and troubleshooting during partial cluster failures.
  • Makes it difficult to monitor lag and consumer status in degraded scenarios.

Environment

  • Kafka version: 2.x / 3.x
  • Deployment mode: Kubernetes
