Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

6744 - Fix Kafka TLS configuration with plaintext authentication #6764

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

amilbcahat
Copy link

@amilbcahat amilbcahat commented Feb 21, 2025

Which problem is this PR solving?

Description of the changes

  • Fixed TLS configuration to work with plaintext authentication when TLS is enabled
  • Modified TLS configuration logic in Kafka authentication to support SASL-SSL with PLAIN
  • Restored behavior to enable TLS regardless of authentication method when TLS is configured
  • Fixed regression introduced in PR [kafka] OTEL helper instead of tlscfg package #6270

How was this change tested?

  • Verified connectivity with TLS-enabled Kafka cluster using plaintext authentication
  • Tested both collector and ingester components
  • Validated SASL-SSL with PLAIN authentication works as expected

Checklist

Amol Verma added 2 commits February 21, 2025 05:30
This change fixes the Kafka TLS configuration to work correctly when tls.enabled
flag is not provided but authentication=tls is set. Previously, TLS would not
be enabled in this case.

Changes:
- TLS is now properly configured when authentication=tls, regardless of tls.enabled
- Maintains backward compatibility with existing tls.enabled flag
- Sets explicit insecure mode only when TLS is intentionally disabled

Testing:
- Added unit tests for TLS configuration scenarios
- Verified with local Kafka cluster using TLS authentication
- Tested with HotROD example application

Resolves jaegertracing#6744

Signed-off-by: Amol Verma <[email protected]>
This change fixes the Kafka TLS configuration to work correctly when using
plaintext authentication with TLS enabled. Previously, TLS would only be
configured when authentication=tls, breaking SASL-SSL with PLAIN authentication.

Changes:
- Modified TLS configuration logic to support TLS with other authentication methods
- Fixed SASL-SSL with PLAIN authentication scenario
- Maintained backward compatibility with existing authentication methods
- Restored pre-PR-6270 behavior for TLS configuration

Resolves jaegertracing#6744

Signed-off-by: Amol Verma <[email protected]>
@amilbcahat amilbcahat requested a review from a team as a code owner February 21, 2025 00:15
Copy link

codecov bot commented Feb 21, 2025

Codecov Report

Attention: Patch coverage is 41.66667% with 7 lines in your changes missing coverage. Please review.

Project coverage is 49.05%. Comparing base (84212d2) to head (0bdd1a8).

Files with missing lines Patch % Lines
pkg/kafka/auth/tls.go 0.00% 5 Missing ⚠️
pkg/kafka/auth/config.go 71.42% 0 Missing and 2 partials ⚠️

❗ There is a different number of reports uploaded between BASE (84212d2) and HEAD (0bdd1a8). Click for more details.

HEAD has 23 uploads less than BASE
Flag BASE (84212d2) HEAD (0bdd1a8)
badger_v1 2 1
kafka-3.x-v1 2 1
memory_v2 2 1
badger_v2 2 1
tailsampling-processor 2 1
grpc_v2 2 1
grpc_v1 2 1
kafka-3.x-v2 2 1
elasticsearch-6.x-v1 2 1
elasticsearch-7.x-v1 2 1
cassandra-4.x-v1-manual 2 1
opensearch-2.x-v1 2 1
cassandra-5.x-v1-manual 2 1
opensearch-1.x-v1 2 1
elasticsearch-8.x-v1 2 1
opensearch-2.x-v2 2 1
elasticsearch-8.x-v2 2 1
unittests 2 0
cassandra-4.x-v2-auto 2 1
cassandra-5.x-v2-auto 2 1
cassandra-4.x-v2-manual 2 1
cassandra-5.x-v2-manual 2 1
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #6764       +/-   ##
===========================================
- Coverage   96.04%   49.05%   -47.00%     
===========================================
  Files         364      177      -187     
  Lines       20692    10697     -9995     
===========================================
- Hits        19874     5247    -14627     
- Misses        624     5056     +4432     
- Partials      194      394      +200     
Flag Coverage Δ
badger_v1 9.76% <0.00%> (-0.01%) ⬇️
badger_v2 1.82% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v1-manual 14.81% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v2-auto 1.81% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v2-manual 1.81% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v1-manual 14.81% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v2-auto 1.81% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v2-manual 1.81% <0.00%> (-0.01%) ⬇️
elasticsearch-6.x-v1 19.15% <0.00%> (-0.01%) ⬇️
elasticsearch-7.x-v1 19.23% <0.00%> (-0.01%) ⬇️
elasticsearch-8.x-v1 19.40% <0.00%> (-0.01%) ⬇️
elasticsearch-8.x-v2 1.82% <0.00%> (-0.01%) ⬇️
grpc_v1 10.81% <0.00%> (-0.01%) ⬇️
grpc_v2 7.80% <0.00%> (-0.01%) ⬇️
kafka-3.x-v1 10.19% <41.66%> (+0.05%) ⬆️
kafka-3.x-v2 1.82% <0.00%> (-0.01%) ⬇️
memory_v2 1.82% <0.00%> (-0.01%) ⬇️
opensearch-1.x-v1 19.28% <0.00%> (-0.01%) ⬇️
opensearch-2.x-v1 19.28% <0.00%> (-0.01%) ⬇️
opensearch-2.x-v2 1.82% <0.00%> (-0.01%) ⬇️
tailsampling-processor 0.48% <0.00%> (-0.01%) ⬇️
unittests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@yurishkuro
Copy link
Member

what are you trying to achieve by merging main? It erases the CI checks which clearly show that your PR does not pass the linter.

Signed-off-by: Amol Verma <[email protected]>
@amilbcahat amilbcahat force-pushed the fix-kafka-tls-plaintext-6744 branch from c2267d2 to 2b355d0 Compare February 21, 2025 23:42
@amilbcahat
Copy link
Author

what are you trying to achieve by merging main? It erases the CI checks which clearly show that your PR does not pass the linter.

I was updating the branch to latest, just that. I have committed for Lint checks , now. Can you check again ?

Amol Verma added 3 commits February 22, 2025 05:18
This change fixes the Kafka TLS configuration to work correctly when using
plaintext authentication with TLS enabled. Previously, TLS would only be
configured when authentication=tls, breaking SASL-SSL with PLAIN authentication.

Resolves jaegertracing#6744

Signed-off-by: Amol Verma <[email protected]>
- Fix TLS configuration initialization for Kafka auth
- Add proper handling of system CA certs pool
- Set secure defaults for TLS configuration
- Remove redundant code comments

Signed-off-by: Amol Verma <[email protected]>
Signed-off-by: Amol Verma <[email protected]>
- Fix TLS configuration initialization for Kafka auth
- Add proper handling of system CA certs pool
- Set secure defaults for TLS configuration
- Remove redundant code comments

Signed-off-by: Amol Verma <[email protected]>
Signed-off-by: Amol Verma <[email protected]>
@amilbcahat
Copy link
Author

amilbcahat commented Feb 23, 2025

I have made the corrections for Unit Tests, can you update the PR label please ? @yurishkuro and run it again, I dont have necessary permissions to add the label I think.

tlsClientConfig := tlscfg.ClientFlagsConfig{
Prefix: configPrefix,
}
tlsCfg, err := tlsClientConfig.InitFromViper(v)
if err != nil {
return fmt.Errorf("failed to process Kafka TLS options: %w", err)
}
tlsCfg.IncludeSystemCACertsPool = (config.Authentication == tls)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed to maintain the security model difference between TLS authentication and TLS encryption:

  • When using TLS authentication (auth="tls"), we need system CA certs to validate client certificates
  • When using TLS encryption with SASL PLAIN auth, we don't need system CA certs
    The unit tests specifically verify this distinction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using TLS encryption with SASL PLAIN auth, we don't need system CA certs

why? The only time you don't need system certs is if you are providing your own. Am I wrong about that?

Copy link
Author

@amilbcahat amilbcahat Feb 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. Both modes need CA certs for TLS validation, but they use different sources:

  • TLS auth (authentication="tls"):

    • IncludeSystemCACertsPool=true to validate client/server certificates using system CA pool
    • Uses mutual TLS where both authenticate each other
  • SASL PLAIN with TLS:

    • IncludeSystemCACertsPool=false because it validates the server certificate using explicitly provided CA cert
    • Client authentication happens via username/password

The tests expect this specific behavior to enforce these different trust models, without this distinction tests fails. Both approaches provide certificate validation, just from different trust sources.

tlsClientConfig := tlscfg.ClientFlagsConfig{
Prefix: configPrefix,
}
tlsCfg, err := tlsClientConfig.InitFromViper(v)
if err != nil {
return fmt.Errorf("failed to process Kafka TLS options: %w", err)
}
tlsCfg.IncludeSystemCACertsPool = (config.Authentication == tls)
tlsCfg.Insecure = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this already being set via .tls.enabled?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While .tls.enabled sets up TLS configuration, we still need to explicitly ensure Insecure=false for all TLS usage. Without this line, the tests fail because they expect Insecure to be false when TLS auth is used.

@yurishkuro
Copy link
Member

what is the testing procedure for this change? How do we know it does what's needed?

@amilbcahat
Copy link
Author

what is the testing procedure for this change? How do we know it does what's needed?

To verify this fix works, I've set up a test environment with:

  1. Kafka container configured for SASL_SSL:
    • Uses TLS encryption with SASL PLAIN authentication

Docker Image configuration used -

version: "3"
services:
  zookeeper:
    image: bitnami/zookeeper:latest
    ports:
      - "2181:2181"
    environment:
      - ALLOW_ANONYMOUS_LOGIN=yes
  kafka:
    image: bitnami/kafka:3.7.0
    ports:
      - "9092:9092"
      - "29093:29093"
    environment:
      - KAFKA_BROKER_ID=1
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1
      - ALLOW_PLAINTEXT_LISTENER=yes
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=INTERNAL:PLAINTEXT,EXTERNAL:SASL_SSL
      - KAFKA_CFG_LISTENERS=internal://0.0.0.0:9092,external://0.0.0.0:29093
      - KAFKA_CFG_ADVERTISED_LISTENERS=internal://kafka:9092,external://localhost:29093
      - KAFKA_INTER_BROKER_LISTENER_NAME=INTERNAL
      - KAFKA_TLS_TYPE=JKS
      - KAFKA_CFG_SSL_KEYSTORE_LOCATION=/opt/bitnami/kafka/config/certs/kafka.keystore.jks
      - KAFKA_CFG_SSL_KEYSTORE_PASSWORD=kafkapass123
      - KAFKA_CFG_SSL_KEY_PASSWORD=kafkapass123
      - KAFKA_CFG_SSL_TRUSTSTORE_LOCATION=/opt/bitnami/kafka/config/certs/kafka.truststore.jks
      - KAFKA_CFG_SSL_TRUSTSTORE_PASSWORD=kafkapass123
      - KAFKA_CFG_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM=
      - KAFKA_CFG_SASL_ENABLED_MECHANISMS=PLAIN
      - KAFKA_CFG_SASL_MECHANISM_INTER_BROKER_PROTOCOL=PLAIN
      - KAFKA_CLIENT_USERS=admin
      - KAFKA_CLIENT_PASSWORDS=admin-secret
    volumes:
      - ./certs:/opt/bitnami/kafka/config/certs
    depends_on:
      - zookeeper
  • Exposes port 29093 for SASL_SSL connections
  • Configured with admin/admin-secret credentials locally
  1. Test with ingester accessing Kafka:
   ./ingester \
     --kafka.consumer.brokers=localhost:29093 \
     --kafka.consumer.topic=jaeger-spans \
     --kafka.consumer.authentication=plaintext \
     --kafka.consumer.plaintext.username=admin \
     --kafka.consumer.plaintext.password=admin-secret \
     --kafka.consumer.plaintext.mechanism=PLAIN \
     --kafka.consumer.group-id=jaeger-ingester \
     --kafka.consumer.tls.enabled=true \
     --kafka.consumer.tls.ca=./certs/ca.crt \
     --kafka.consumer.tls.skip-host-verify=true

Verified with standard Kafka clients (producer/consumer) that the connection works with the same settings

Without the fix in PR #6764, the ingester command would fail because TLS settings weren't properly applied when using SASL PLAIN authentication. With the fix, the connection succeeds. Do you need any more configuration related information used for testing here ?

@yurishkuro
Copy link
Member

To verify this fix works, I've set up a test environment with:

Is this something we can add to internal/storage/integration/kafka_test.go?

@amilbcahat
Copy link
Author

To verify this fix works, I've set up a test environment with:

Is this something we can add to internal/storage/integration/kafka_test.go?

Yes, we can add an integration test in internal/storage/integration/kafka_test.go to verify this scenario works properly. The test would:

  1. Set up a Kafka client with:

    • SASL PLAIN authentication
    • TLS enabled for encryption
    • Connected to a Kafka broker with SASL_SSL listener
  2. Verify the client can successfully connect and produce/consume messages

This would formalize the manual test case I've been using to verify the fix. Would you like me to implement this as part of the PR?

@yurishkuro
Copy link
Member

Yes, I prefer the tests to be part of the PR. However, is it possible to configure a single instance of Kafka to work with different auth-n methods, or do we need to spin different Kafka container for each auth flavor? The latter is much more expensive to run in the CI.

@amilbcahat
Copy link
Author

Yes, I prefer the tests to be part of the PR. However, is it possible to configure a single instance of Kafka to work with different auth-n methods, or do we need to spin different Kafka container for each auth flavor? The latter is much more expensive to run in the CI.

Yes, this is possible. We can open different listeners in the same Kafka instance to work with different types of Authentication/security protocols. Should I go ahead with implementation ?

ref: https://developer.confluent.io/courses/security/authentication-basics/#:~:text=Configuring%20Authentication%3A%20Listeners%20and%20Security%20Protocols&text=Essentially%2C%20when%20configuring%20the%20broker,authenticate%2C%20whether%20SSL%20or%20SASL_SSL.

@yurishkuro
Copy link
Member

yes, sounds good. One broker/container, multiple listeners, multiple tests using different ports.

I would recommend not running a full test suite against each listener, only some basic write/read tests.

@amilbcahat
Copy link
Author

yes, sounds good. One broker/container, multiple listeners, multiple tests using different ports.

I would recommend not running a full test suite against each listener, only some basic write/read tests.

Sure, I will keep in mind

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: kafka: cannot connect to TLS kafka with TLS + plaintext
2 participants