Authentication Concept
This document describes a planned authentication model for the Kafka message bus. Implementation has not yet started, and the model is subject to review by the Architecture Board.
Objective
The concept defines a Kafka authentication (AuthN) model for the CIVITAS/CORE V2 data platform. It ensures that every client connecting to the message bus proves its identity before any authorization decision is made.
The model must support two deployment scenarios:
- With service mesh -- Kubernetes environments with existing mTLS infrastructure (e.g., Istio, Linkerd) where transport-level encryption and mutual authentication are already provided by the mesh
- Without service mesh -- Environments where Kafka must handle authentication independently, without relying on infrastructure-level mTLS
In both scenarios, the authenticated identity (principal) is passed to the Authorization Concept for topic-level access control via OPA.
Scope and Boundaries
| Aspect | Scope |
|---|---|
| Authentication (AuthN) | Identity verification of Kafka clients -- in scope |
| Authorization (AuthZ) | What an authenticated identity may do on which topic -- out of scope, see Authorization Concept |
| Transport encryption | TLS for data in transit -- in scope as part of AuthN configuration |
| Service mesh configuration | Mesh-level mTLS setup (Istio/Linkerd) -- out of scope, assumed as given |
Client Categories
Kafka clients fall into two categories with different authentication and credential management requirements:
| Category | Description | Examples | Credential Lifecycle |
|---|---|---|---|
| Technical users | Service accounts for platform components and dataset pipelines. Represent the vast majority of clients. Operate unattended. | Config adapters, outbox relay, saga orchestrator, dataset producers/consumers | Automated provisioning and rotation |
| Human users | Administrators and developers accessing Kafka for operations and troubleshooting. Rare, short-lived sessions. | Platform operators, on-call engineers | Personal credentials, interactive authentication |
Authentication Mechanism
Primary: SASL/SCRAM-SHA-512
SASL/SCRAM-SHA-512 is the primary authentication mechanism for all deployment scenarios. It works consistently with and without service mesh and does not depend on PKI infrastructure.
SASL (Simple Authentication and Security Layer, RFC 4422) is a framework that decouples authentication from application protocols — comparable to how OAuth decouples authorization from HTTP: the application protocol (here: Kafka) delegates credential verification to a pluggable SASL mechanism, without needing to implement authentication logic itself.
SCRAM (Salted Challenge Response Authentication Mechanism, RFC 5802) is a challenge-response protocol — comparable to HTTP Digest Authentication vs. Basic Auth: instead of transmitting credentials in cleartext, client and server negotiate a challenge that the client answers using a derived hash of the password. SHA-512 denotes the hash function used for the salted credential derivation.
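To make the challenge-response idea concrete, the following sketch derives the values a SCRAM-SHA-512 server actually stores, as defined in RFC 5802 (Section 3). This is illustrative only: Kafka's implementation lives in the broker, and the iteration count used here is an assumption, not the broker default.

```python
import hashlib
import hmac


def scram_sha512_stored_keys(password: str, salt: bytes, iterations: int):
    """Derive what a SCRAM-SHA-512 server stores (RFC 5802, Section 3).

    The server keeps only the salt, iteration count, StoredKey and
    ServerKey -- the cleartext password is never persisted or sent.
    """
    # SaltedPassword := Hi(password, salt, i)  -- PBKDF2 with HMAC-SHA-512
    salted = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, iterations)
    # ClientKey := HMAC(SaltedPassword, "Client Key")
    client_key = hmac.new(salted, b"Client Key", hashlib.sha512).digest()
    # StoredKey := H(ClientKey)  -- what the server compares proofs against
    stored_key = hashlib.sha512(client_key).digest()
    # ServerKey := HMAC(SaltedPassword, "Server Key")  -- proves server identity
    server_key = hmac.new(salted, b"Server Key", hashlib.sha512).digest()
    return stored_key, server_key


stored, server = scram_sha512_stored_keys("s3cret", b"example-salt", 4096)
```

During authentication the client proves knowledge of ClientKey without transmitting the password; the server verifies the proof against StoredKey, which is exactly why a leaked credential store does not yield cleartext passwords.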
| Property | Value |
|---|---|
| Kafka listener config | SASL_SSL (without mesh) / SASL_PLAINTEXT (with mesh, TLS terminated by sidecar) |
| SASL mechanism | SCRAM-SHA-512 |
| Credential storage | Kafka's built-in KRaft credential store |
| Principal extraction | SASL username → Kafka principal |
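A client-side counterpart to the broker settings could look as follows. This is a sketch: the principal name, password variable, and truststore path are placeholders, while the property keys themselves (`security.protocol`, `sasl.mechanism`, `sasl.jaas.config`) are standard Kafka client configuration.

```properties
# Client configuration without service mesh (broker terminates TLS)
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="config-outbox-relay-producer" \
  password="${KAFKA_SASL_PASSWORD}";
ssl.truststore.location=/etc/kafka/certs/client.truststore.jks

# With service mesh: identical SASL settings, but
# security.protocol=SASL_PLAINTEXT and no ssl.* properties
```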
Why SCRAM over PLAIN:
- Credentials are never sent in cleartext (challenge-response protocol)
- Server stores only salted hashes, not passwords
- Supported natively by Kafka without external dependencies
Why SASL/SCRAM as primary over mTLS:
- Works identically in both deployment scenarios (with/without mesh)
- No dependency on certificate infrastructure or PKI
- Simpler credential rotation (username/password change vs. certificate reissue)
- Avoids conflicts with service mesh mTLS (double TLS termination)
Transport Encryption
| Scenario | Transport | Configuration |
|---|---|---|
| Without service mesh | Kafka-native TLS (SASL_SSL) | Kafka brokers terminate TLS directly. Broker certificates required. |
| With service mesh | Mesh-provided mTLS (SASL_PLAINTEXT) | Sidecar proxies handle TLS. Kafka listeners use plaintext; encryption is transparent at the mesh layer. |
In both cases, data in transit is encrypted. The difference is only where TLS termination happens.
Listener Configuration
Kafka brokers expose separate listeners per scenario:
```properties
# Broker listener configuration
listeners=SASL_SSL://0.0.0.0:9093,SASL_PLAINTEXT://0.0.0.0:9092
advertised.listeners=SASL_SSL://kafka.example.com:9093,SASL_PLAINTEXT://kafka.internal:9092

# SASL configuration (applies to both listeners)
sasl.enabled.mechanisms=SCRAM-SHA-512

# SSL configuration (for SASL_SSL listener only)
ssl.keystore.location=/etc/kafka/certs/broker.keystore.jks
ssl.keystore.password=${BROKER_KEYSTORE_PASSWORD}
ssl.truststore.location=/etc/kafka/certs/broker.truststore.jks
ssl.truststore.password=${BROKER_TRUSTSTORE_PASSWORD}
```
Without service mesh: Clients connect to the SASL_SSL listener (port 9093). The Kafka broker handles both authentication (SCRAM) and encryption (TLS).
With service mesh: Clients connect to the SASL_PLAINTEXT listener (port 9092). The mesh sidecar handles encryption (mTLS); Kafka handles only authentication (SCRAM). The SASL_SSL listener can be disabled or restricted to external access.
Credential Management
Technical Users
Technical users fall into two subcategories with distinct provisioning and lifecycle characteristics:
| Subcategory | Description | Provisioned by | Credential Distribution | Lifecycle |
|---|---|---|---|---|
| Component principals | Platform infrastructure components (outbox relay, saga orchestrator, config adapters) | Deployment pipeline | Kubernetes Secrets | Created at deployment, exists as long as the component is deployed |
| Pipeline principals | Dataset-specific producers and consumers for payload data pipelines | Config Adapter | Platform Secrets Management | Created/deleted dynamically when datasets are published or removed |
Component Principals
Component principals represent long-lived platform infrastructure. They are created during deployment and distributed via Kubernetes Secrets.
Lifecycle:
The deployment pipeline creates the SCRAM credential in Kafka and stores the generated password as a Kubernetes Secret in the component's namespace:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: kafka-credentials
  namespace: civitas-platform
type: Opaque
data:
  sasl.username: <base64>
  sasl.password: <base64>
```
Components mount this secret and read credentials at startup. Credentials are never hardcoded in application configuration or container images.
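Reading the mounted secret at startup can be sketched as follows. Kubernetes projects each key of the Secret (`sasl.username`, `sasl.password`) as a file in the mount directory, with values already base64-decoded; the mount path below is an example.

```python
from pathlib import Path


def load_scram_credentials(mount_dir: str) -> tuple[str, str]:
    """Read SASL credentials from a mounted Kubernetes Secret.

    Each Secret key appears as a file in the mount directory; the
    trailing newline some tooling adds is stripped defensively.
    """
    base = Path(mount_dir)
    username = (base / "sasl.username").read_text().strip()
    password = (base / "sasl.password").read_text().strip()
    return username, password


# Example: credentials mounted at /etc/kafka/credentials
# username, password = load_scram_credentials("/etc/kafka/credentials")
```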
Naming convention:
| Principal Type | Naming Pattern | Example |
|---|---|---|
| Config producer | config-<context>-producer | config-outbox-relay-producer, config-saga-orchestrator-producer |
| Config consumer | config-<context>-consumer | config-pipeline-umwelt-consumer |
| Consumer group | cg-<principal-name> | cg-config-pipeline-umwelt-consumer |
Consumer group names are derived from the principal name. A principal may only use consumer groups prefixed with its own name, enforced by the OPA policy.
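The prefix rule can be expressed as a small predicate. This mirrors the convention described above; the authoritative enforcement lives in the OPA policy (see Authorization Concept), and whether suffixed group names are permitted is an assumption here.

```python
def allowed_consumer_group(principal: str, group: str) -> bool:
    """Naming rule: a principal may only use consumer groups prefixed
    with its own name, i.e. 'cg-<principal-name>' (exact match) or
    'cg-<principal-name>-<suffix>' (assumed to be allowed).
    """
    prefix = f"cg-{principal}"
    return group == prefix or group.startswith(prefix + "-")
```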
Pipeline Principals
Pipeline principals are managed dynamically by the responsible Config Adapter. When a dataset is published, the Config Adapter creates the necessary Kafka credentials and reports the result back to the SAGA Orchestrator — analogous to how FROST-Adapters report project creation. The SAGA process then decides how to proceed (e.g., instructing a dedicated adapter to store the credentials in Secrets Management). When a dataset is removed, the Config Adapter deletes the corresponding Kafka credentials.
This separation ensures that the Config Adapter is not directly coupled to the platform-wide Secrets Management. The SAGA Orchestrator coordinates the handover but carries only an opaque reference to the created credentials — never the credentials themselves.
Secrets Handover Invariants:
These invariants apply regardless of the handover mechanism chosen:
- Credentials must not transit the Kafka bus in cleartext
- Credentials must not be persisted in SAGA state — the SAGA carries only an opaque reference (e.g. a Secrets Management path)
- Credentials must be redacted in logs and traces at every component boundary
- The Config Adapter's Secrets Management write token must be scoped to its own path prefix (e.g. /kafka/dataset/<dataset>/) — no broader write access
Target architecture: The Config Adapter writes credentials directly to Secrets Management and reports only the reference back to the SAGA Orchestrator. The SAGA carries the reference forward to pipeline components.
Interim path (until a platform Secrets Management component is available): Mirror the pattern already in production with the Redpanda Config Adapter — the backend encrypts, the adapter decrypts using a shared key, and only the reference travels via the bus. This keeps the interim consistent with existing platform practice.
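The "opaque reference" invariant can be illustrated with the event the Config Adapter reports back. The field names and event type below are illustrative; the real schema is owned by the SAGA Orchestrator. What matters is that the payload carries only the Secrets Management path, never credential material.

```python
import json


def build_credentials_created_event(dataset: str, secret_path: str) -> str:
    """Build the SAGA event reporting that Kafka credentials were created.

    The event carries only an opaque Secrets Management reference --
    the password itself must never transit the bus or enter SAGA state.
    """
    event = {
        "type": "kafka.credentials.created",  # illustrative event type
        "dataset": dataset,
        "secret_ref": secret_path,  # e.g. the adapter's own path prefix
    }
    return json.dumps(event)
```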
Lifecycle:
Credentials are distributed via the platform Secrets Management (not directly via Kubernetes Secrets), which makes them available to pipeline components regardless of their deployment model.
Naming convention:
Each dataset is assigned two separate principals to enforce least-privilege access: the producer holds write access to the dataset's Kafka topic, the consumer holds read access. Separating them ensures that a pipeline component which only reads data cannot accidentally (or maliciously) produce messages.
| Principal Type | Naming Pattern | Example | Access |
|---|---|---|---|
| Dataset producer | dataset-<dataset>-producer | dataset-luftqualitaet-producer | WRITE |
| Dataset consumer | dataset-<dataset>-consumer | dataset-luftqualitaet-consumer | READ |
| Consumer group | cg-<principal-name> | cg-dataset-luftqualitaet-consumer | GROUP |
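The naming convention above is mechanical enough to derive in code. A sketch of the derivation the Config Adapter might perform (function name is illustrative):

```python
def pipeline_principals(dataset: str) -> dict[str, str]:
    """Derive per-dataset principal and consumer-group names from the
    naming convention. Two principals per dataset enforce least
    privilege: the producer may only WRITE, the consumer only READ.
    """
    consumer = f"dataset-{dataset}-consumer"
    return {
        "producer": f"dataset-{dataset}-producer",  # WRITE on the topic
        "consumer": consumer,                       # READ on the topic
        "consumer_group": f"cg-{consumer}",         # GROUP access
    }
```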
Credential Rotation
Rotation follows a two-phase approach (introduce the new credential, then retire the old one) to avoid downtime:
1. Create new credential: A second SCRAM entry is created for the same principal with a new password
2. Update credential store: The Kubernetes Secret (component) or Secrets Management entry (pipeline) is updated with the new password
3. Rolling restart: The component/pipeline is restarted to pick up the new credential
4. Remove old credential: After all instances are running with the new credential, the old SCRAM entry is removed
For component principals, rotation is triggered by the deployment pipeline. For pipeline principals, rotation is triggered by the Config Adapter (e.g., on dataset re-publish or scheduled rotation).
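The broker-side credential operations can be sketched with Kafka's `kafka-configs.sh` tool. Hostname, principal name, and iteration count are examples; note also that on a SASL-secured listener the tool additionally needs admin credentials (e.g. via `--command-config`). To the best of our knowledge, `--alter --add-config` replaces the stored credential for a given user and mechanism rather than adding a second concurrent password, which should be verified alongside the client-side question below.

```
# Create or update the SCRAM credential for a principal
kafka-configs.sh --bootstrap-server kafka.internal:9092 \
  --alter --add-config 'SCRAM-SHA-512=[iterations=4096,password=NEW_PASSWORD]' \
  --entity-type users --entity-name config-outbox-relay-producer

# Remove the credential entirely (final rotation step, or decommissioning)
kafka-configs.sh --bootstrap-server kafka.internal:9092 \
  --alter --delete-config 'SCRAM-SHA-512' \
  --entity-type users --entity-name config-outbox-relay-producer
```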
The two-phase rotation approach assumes that a Kafka client can operate with updated credentials after a rolling restart without requiring simultaneous dual-principal support. It must be verified during implementation whether the Java Kafka client supports holding two SCRAM credentials for the same principal identifier concurrently within a javax.security.auth.Subject.
An alternative is a retry-based rotation: update the credential in Kafka, distribute it via Secrets Management, and rely on the client's built-in authentication retry mechanism to recover from transient auth failures during the distribution window. This avoids any dual-principal concern entirely at the cost of a brief, self-healing interruption. This approach is simpler but requires that the client implementation treats authentication failures as transient and retries automatically.
The implementation team should evaluate which approach is more appropriate for the specific client implementation.
Human Users
Human users authenticate with personal SCRAM credentials for administrative and troubleshooting access.
| Property | Approach |
|---|---|
| Credential creation | On request by platform operations, tied to a named person |
| Naming convention | admin-<name> (e.g., admin-jschmidt) |
| Principal type | platform-admin (see Authorization Concept) |
| Credential storage | Password manager or short-lived credential — not stored as a Kubernetes Secret (unlike component principals, human credentials have no automated lifecycle) |
| Session expectation | Short-lived, interactive, for debugging or operational tasks |
| Audit | All actions under human user principals are logged (see Authorization Concept, Audit Logging) |
Human user credentials must not be used for automated processes. Any component that runs unattended must use a dedicated technical user.
Relation to Authorization
Authentication provides the principal that the authorization layer operates on:
```
Client → SASL/SCRAM → Kafka Broker (authenticated principal)
                            ↓
             Custom Authorizer → OPA (principal + topic + operation)
                            ↓
                       allow / deny
```
The principal name used in SCRAM authentication must match the principal name registered in the authorization database (kafka_principals.principal_name). This mapping is established at credential creation time and enforced by convention (see Naming Convention).
Implementation Steps Overview
| # | Task |
|---|---|
| 1 | Define Kafka listener configuration for both deployment scenarios |
| 2 | Implement credential creation automation in deployment pipeline |
| 3 | Create Kubernetes Secret templates for technical users |
| 4 | Provision initial SCRAM credentials for existing components |
| 5 | Define credential rotation procedure and automation |
| 6 | Document human user credential request process |
| 7 | Validate SASL_PLAINTEXT listener with service mesh mTLS |
Glossary
| Term | Definition |
|---|---|
| AuthN | Authentication -- identity verification |
| mTLS | Mutual TLS -- both client and server present certificates; in this context provided by the service mesh, not by Kafka |
| Principal | Kafka's native term for the authenticated identity of a client, whether a technical or a human user. See Authorization Concept for details |
| SASL | Simple Authentication and Security Layer -- framework for authentication in network protocols |
| SCRAM-SHA-512 | Salted Challenge Response Authentication Mechanism -- challenge-response protocol that avoids transmitting passwords in cleartext |
| Service Mesh | Infrastructure layer (e.g., Istio, Linkerd) providing transparent mTLS, traffic management, and observability for Kubernetes workloads |