ADR 027: Credential Management for External Datasource Credentials

Date: 2026-02-05

Status: Proposed

Decision Makers: Architecture Board

Context

CIVITAS/CORE v2 data pipelines need to access external databases and services that require authentication credentials (usernames, passwords, API keys, client certificates, etc.). These credentials must be stored securely and made available to pipeline components at runtime.

Key requirements:

  • Credentials must never be stored in plaintext in the CIVITAS/CORE database, configuration files, or the message bus.
  • Credentials must be retrievable at runtime by pipeline components without exposing them to unnecessary intermediaries.
  • Credential rotation must be supported without requiring platform-wide restarts.
  • The solution must be agnostic to the specific pipeline technology (currently Redpanda Connect, but subject to change per modularity principles).
  • Multi-tenancy must be respected: pipeline components must only access credentials they are authorized for (principle of least privilege).

Currently, no standardized approach exists for managing these secrets within the platform. This ADR establishes the architectural pattern for the portal/backend side of credential management — i.e., how credentials are stored, referenced, and resolved.

Checked Architecture Principles

  • [full] Model-centric data flow — Datasource metadata (including Vault references) is managed as part of the platform's data model; credentials are treated as a linked concern.
  • [full] Distributed architecture with unified user experience — Credential management is transparent to the end user via the Data Management Portal; complexity is hidden behind the ConfigAdapter.
  • [full] Modular design — Secrets backend (Vault) is decoupled from pipeline technology; the ConfigAdapter serves as the sole integration point.
  • [full] Integration capability through defined interfaces — Vault access follows a defined path convention; pipeline components receive credentials through standardized mechanisms (environment variables, files, or API injection).
  • [full] Open source as the default — HashiCorp Vault (BSL) or OpenBao (OSS fork, API-compatible) are used as the secrets backend.
  • [full] Cloud-native architecture — Vault integrates natively with Kubernetes via Service Account authentication and the Vault Agent Sidecar Injector.
  • [full] Prefer standard solutions over custom development — Vault is the industry-standard secrets management solution; no custom encryption or secret storage is implemented.
  • [full] Self-contained deployment — Vault is deployed as part of the platform stack; no external SaaS dependency is introduced.
  • [full] Technological consistency to ensure maintainability — Vault integration in the ConfigAdapter uses Spring Cloud Vault, consistent with the existing Spring Boot stack.
  • [full] Multi-tenancy — Vault policies enforce per-pipeline access control; each pipeline's Service Account can only read the datasource credentials it is configured to use.
  • [full] Security by design — Credentials are encrypted at rest in Vault, transmitted only over mTLS within the cluster, and never persisted in the CIVITAS database or on the message bus.

Decision

Secrets Backend: HashiCorp Vault (or OpenBao)

HashiCorp Vault is adopted as the central secrets backend for all external datasource credentials. OpenBao may be used as a drop-in replacement if Vault's Business Source License (BSL) is not acceptable for a given deployment.

Separation of Metadata and Secrets

The CIVITAS/CORE database stores datasource configuration metadata (endpoint, type, schema mapping, connection parameters, etc.) together with a reference to the Vault path — but never the secret itself. The actual credentials are stored exclusively in Vault.

Vault Path Convention

Credentials are stored under a deterministic path derived from the datasource's UUID:

civitas/datasources/{datasource-uuid}

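Each secret contains structured fields (e.g., host, port, username, password, tls_cert) as key-value pairs within the Vault KV v2 secrets engine. Note that KV v2 inserts a data/ segment after the mount point in HTTP API and policy paths; for example, if the engine is mounted at civitas, the secret is read at civitas/data/datasources/{datasource-uuid}.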
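
As a minimal sketch of this convention, the helper below derives the logical secret path and the corresponding KV v2 HTTP API path from a datasource UUID. It assumes that the KV v2 engine is mounted at civitas, which the ADR does not state explicitly; the class name VaultPaths and its methods are hypothetical and only serve as illustration.

```java
import java.util.UUID;

/** Hypothetical helper, not part of the ADR. Assumes the KV v2 engine is mounted at "civitas". */
final class VaultPaths {
    static final String MOUNT = "civitas";      // assumption: KV v2 mount point
    static final String PREFIX = "datasources"; // path prefix from the convention

    private VaultPaths() {}

    /** Logical path as written in the convention, e.g. for `vault kv get`. */
    static String logicalPath(UUID datasourceId) {
        return MOUNT + "/" + PREFIX + "/" + datasourceId;
    }

    /** KV v2 HTTP API path: a "data/" segment follows the mount point. */
    static String apiDataPath(UUID datasourceId) {
        return "/v1/" + MOUNT + "/data/" + PREFIX + "/" + datasourceId;
    }
}
```

Keeping the path derivation in one place means the portal backend (writer) and the ConfigAdapter (reader) cannot drift apart on the convention.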

ConfigAdapter as Secret-Aware Orchestrator

The ConfigAdapter (Java/Spring Boot component, see ADR 013 and ADR 015) is the sole component that resolves Vault references into concrete credentials. The flow is:

  1. Datasource creation: An administrator creates a datasource in the Data Management Portal. The portal backend stores the metadata in the CIVITAS database and writes the credentials to Vault under the conventional path. A reference (the datasource UUID) links the two.

  2. Pipeline creation: When a pipeline is created that references one or more datasources, a CloudEvent is published to the message bus (see ADR 013).

  3. Credential resolution: The ConfigAdapter receives the CloudEvent, reads the pipeline configuration (including datasource UUIDs) from the CIVITAS database, resolves the corresponding secrets from Vault, and passes the fully assembled connection configuration to the pipeline runtime component.

  4. Credential rotation: When a credential is rotated in Vault (either manually or via Vault's dynamic secrets engine), a CloudEvent is published. The ConfigAdapter resolves the new credential and updates the affected pipeline's runtime configuration without restarting unrelated pipelines.
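
The resolution step (3) can be sketched as follows. This is an illustrative outline under stated assumptions, not the ConfigAdapter's actual design: the interfaces PipelineRepository, SecretStore, and PipelineRuntime are hypothetical stand-ins for the CIVITAS database, Vault, and the pipeline runtime, and the event carries only the pipeline UUID. Records require Java 16 or later.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Event payload as it crosses the message bus: identifiers only, no secret material. */
record PipelineEvent(UUID pipelineId) {}

/** Hypothetical view of the CIVITAS database. */
interface PipelineRepository {
    List<UUID> datasourceIdsFor(UUID pipelineId);
}

/** Hypothetical view of the secrets backend; a real implementation would call Vault. */
interface SecretStore {
    Map<String, String> read(String path);
}

/** Hypothetical view of the pipeline runtime's management API. */
interface PipelineRuntime {
    void apply(UUID pipelineId, Map<UUID, Map<String, String>> connections);
}

final class CredentialResolver {
    private final PipelineRepository repository;
    private final SecretStore secrets;
    private final PipelineRuntime runtime;

    CredentialResolver(PipelineRepository repository, SecretStore secrets, PipelineRuntime runtime) {
        this.repository = repository;
        this.secrets = secrets;
        this.runtime = runtime;
    }

    /** Resolves every datasource of the pipeline and hands the result to the runtime. */
    void onPipelineEvent(PipelineEvent event) {
        Map<UUID, Map<String, String>> connections = new LinkedHashMap<>();
        for (UUID datasourceId : repository.datasourceIdsFor(event.pipelineId())) {
            // Path follows the convention civitas/datasources/{datasource-uuid}.
            connections.put(datasourceId, secrets.read("civitas/datasources/" + datasourceId));
        }
        runtime.apply(event.pipelineId(), connections);
    }
}
```

Because all three collaborators are interfaces, the same resolver could serve both pipeline creation and credential rotation events, and the pipeline runtime can be swapped without touching the resolution logic.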

Vault Authentication

The ConfigAdapter authenticates to Vault using the Kubernetes Auth Method. The ConfigAdapter's Kubernetes Service Account is bound to a Vault role whose policy grants read access to the civitas/datasources/* path (with KV v2, the policy path includes the data/ segment after the mount point). No long-lived Vault tokens are stored; authentication is based on short-lived Kubernetes Service Account tokens.
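
For illustration, the login exchange behind the Kubernetes Auth Method can be sketched with the JDK HTTP client types. The sketch assumes the auth method is mounted at its default path (auth/kubernetes) and uses a hypothetical role name; in practice Spring Cloud Vault would normally perform this exchange on the application's behalf. Vault answers a successful login with a short-lived client token in the auth.client_token field of the response.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a Kubernetes Auth Method login request; not the ConfigAdapter's code. */
final class KubernetesLogin {
    /** Default location of the Service Account token inside a pod. */
    static final String TOKEN_FILE = "/var/run/secrets/kubernetes.io/serviceaccount/token";

    private KubernetesLogin() {}

    /** Builds the JSON login body. Assumes role and JWT need no JSON escaping. */
    static String loginBody(String role, String jwt) {
        return "{\"role\":\"" + role + "\",\"jwt\":\"" + jwt + "\"}";
    }

    /** Builds the login request; assumes the auth method is mounted at "kubernetes". */
    static HttpRequest loginRequest(URI vaultAddr, String role, String jwt) {
        return HttpRequest.newBuilder()
                .uri(vaultAddr.resolve("/v1/auth/kubernetes/login"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(loginBody(role, jwt), StandardCharsets.UTF_8))
                .build();
    }
}
```

A caller would read the JWT from TOKEN_FILE, send the request over the cluster's TLS-protected channel, and use the returned token only for the lifetime Vault grants it.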

Security Constraints

  • Credentials never transit the message bus. CloudEvents contain only datasource/pipeline UUIDs, never secret material.
  • The ConfigAdapter holds credentials only transiently in memory during the resolution and injection process. Credentials are not written to disk or logged.
  • The Vault API and the pipeline runtime API (used for credential injection) must be accessed over mTLS within the cluster. Network policies must restrict access to the pipeline runtime API to the ConfigAdapter pod exclusively.
  • Debug endpoints on the pipeline runtime must be disabled in production to prevent credential exfiltration via configuration introspection.
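
One way to support the "transient in memory, never logged" constraint in Java is a small wrapper that redacts itself when printed and overwrites its buffer when closed. The class TransientSecret below is a hypothetical, best-effort sketch: the JVM gives no hard guarantee that no other copy of the value exists in memory, so it reduces exposure rather than eliminating it.

```java
import java.util.Arrays;
import java.util.function.Function;

/** Hypothetical holder for a resolved credential; wipes its buffer on close. */
final class TransientSecret implements AutoCloseable {
    private final char[] value;
    private boolean closed;

    TransientSecret(char[] value) {
        this.value = value.clone(); // caller remains responsible for wiping its own array
    }

    /** Gives temporary access to the secret without handing out a long-lived copy. */
    <T> T use(Function<char[], T> action) {
        if (closed) {
            throw new IllegalStateException("secret already wiped");
        }
        return action.apply(value);
    }

    /** Overwrites the buffer so the value does not linger until garbage collection. */
    @Override
    public void close() {
        Arrays.fill(value, '\0');
        closed = true;
    }

    /** Never prints the secret, so accidental logging stays harmless. */
    @Override
    public String toString() {
        return "TransientSecret[REDACTED]";
    }
}
```

Used in a try-with-resources block around the injection call, the wrapper bounds the secret's lifetime to the resolution and injection step.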

Informational: Pipeline Runtime Integration (Redpanda Connect — Tentative)

Note: The following describes a tentative approach for the current pipeline runtime (Redpanda Connect). This is subject to further evaluation and is not part of the core architectural decision. The backend-side patterns described above are designed to be agnostic to the pipeline runtime technology.

Assumption: Redpanda Connect is operated in Streams Mode, which allows multiple pipelines to run within a single process, each managed independently.

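  • Pipeline creation: The ConfigAdapter generates the Redpanda Connect pipeline YAML and injects credentials at startup using the --set flag with Vault CLI command substitution:

    rpk connect run \
    --set "input.sql_select.dsn=$(vault kv get -field=dsn civitas/datasources/{uuid})" \
    ./pipeline.yaml

    This resolves the secret at process start time via shell expansion, so it is never written to the pipeline YAML or exported as an environment variable. The expanded value does become part of the process's argument list, however, which other processes on the same host can typically read (e.g., via ps), so this approach needs the same scrutiny as environment variables.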

  • Runtime credential rotation: The ConfigAdapter uses the Redpanda Connect Streams REST API (PUT /streams/{pipeline-id}) to hot-swap individual pipeline configurations with updated credentials. Only the affected stream restarts; other pipelines are unaffected.

  • Persistence: Redpanda Connect is stateless by design and does not persist pipeline configurations to disk or database. Pipeline definitions are managed externally — either as files on a shared volume (watched via -w flag) or reconstructed by the ConfigAdapter on restart. The ConfigAdapter's init routine reads active pipelines from the CIVITAS database and re-establishes them via the Streams API. This aspect requires further design work.

  • Secret storage in Redpanda Connect: Resolved credentials reside only in process memory. Redpanda Connect scrubs known secret fields from API responses, but DSN strings and embedded credentials may not be fully scrubbed. Access to the Streams REST API (port 4195) must be restricted to the ConfigAdapter via Kubernetes Network Policies.
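
The hot-swap call described under runtime credential rotation could look roughly like this. It is a sketch under assumptions: the class StreamsApi is hypothetical, the pipeline ID is assumed to be URL-safe (for example a UUID), and the body is the stream configuration in YAML.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical client-side helper for the Streams REST API; not the ConfigAdapter's code. */
final class StreamsApi {
    private StreamsApi() {}

    /**
     * Builds a PUT /streams/{pipeline-id} request that replaces one stream's configuration.
     * Assumes pipelineId is URL-safe, e.g. a UUID.
     */
    static HttpRequest updateStream(URI baseUri, String pipelineId, String streamConfigYaml) {
        return HttpRequest.newBuilder()
                .uri(baseUri.resolve("/streams/" + pipelineId))
                .PUT(HttpRequest.BodyPublishers.ofString(streamConfigYaml, StandardCharsets.UTF_8))
                .build();
    }
}
```

The request would be sent with java.net.http.HttpClient from the ConfigAdapter pod only, in line with the network policy restriction on port 4195. Since the body contains resolved credentials, it should not be logged on either side.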

Consequences

  • The Data Management Portal backend must implement Vault write operations when creating or updating datasource credentials.
  • The ConfigAdapter must be extended with Vault read capabilities (via Spring Cloud Vault or the Vault Java SDK) and must implement the credential resolution logic described above.
  • A Vault instance (or OpenBao) must be added to the platform deployment (Ansible/Helm). Vault operational concerns (unsealing, backup, HA) must be addressed in the deployment documentation.
  • Kubernetes Network Policies must be defined to restrict access to the pipeline runtime's management API.
  • The Redpanda Connect integration (Streams Mode, REST API, file-based persistence) requires a separate, dedicated design spike before implementation.

Alternatives

  • Sealed Secrets (Bitnami) / SOPS (originally from Mozilla, now a CNCF project): Static, GitOps-oriented secrets management. Discarded because they do not support dynamic credential rotation at runtime, which is required for long-running data pipelines with potentially hundreds of datasources.
  • Kubernetes Secrets (native) with External Secrets Operator: Provides a Kubernetes-native abstraction over external secret stores. Discarded as a primary approach because Kubernetes Secrets are namespace-scoped objects consumed at pod level (as mounted files or environment variables) and are not easily injectable into individual pipeline streams within a shared Redpanda Connect process. However, the External Secrets Operator (ESO) remains a viable complementary tool for other platform components.
  • Storing encrypted credentials in the CIVITAS database: Discarded because it would require implementing custom encryption key management, rotation logic, and access control — effectively re-inventing a secrets manager. This contradicts the principle of preferring standard solutions over custom development.
