ADR 038: Use Apache NiFi as the Pipeline Engine for Data Integration and Transformation

Date: 2026-03-25

Status: Proposed

Supersedes: ADR 037 (Redpanda Connect as default ETL pipeline runtime)

Decision Makers: @DerLinne, Mario, @luckey

Context

CIVITAS/CORE v2 requires a pipeline engine that ingests data from heterogeneous sources, transforms payloads, and delivers results to downstream targets. The engine must support scheduled, event-driven, and request/response execution patterns, operate within a Kubernetes namespace with limited privileges, and provide isolation between pipelines belonging to different DataSets and users.

A systematic evaluation of open-source candidates was conducted (see Lösungsraum). After applying the requirements from requirements.md, all but two candidates were eliminated (see ruled_out.md). The remaining finalists -- Apache NiFi and Redpanda Connect -- were compared in detail (see comparison.md) and architectural sketches were developed for both (see sketch_nifi.md and sketch_redpanda_connect.md).

Why the decision is being revisited

ADR 037 proposed Redpanda Connect as the default pipeline runtime. During the subsequent detailed architecture work, several structural challenges with Redpanda Connect emerged that significantly increase the custom development effort:

  1. Multi-process orchestration required: With hundreds to thousands of pipelines expected, a 1-pod-per-pipeline model is not viable. Multiple redpanda-connect processes must run within shared pods, requiring a custom Provisioner and Supervisor to manage process lifecycle, assignment, and health monitoring.

  2. No built-in RBAC or multi-tenancy: Redpanda Connect has no concept of users, roles, or permissions. All access control must be built from scratch in the Provisioner.

  3. HTTP endpoint routing complexity: Pipelines that serve HTTP endpoints require dedicated ports per process within a shared pod, dynamic Ingress routing with port mapping, and re-routing on pipeline migration between pods.

  4. Non-trivial scale-down: Deterministic pod scale-down requires Kubernetes API access (Pod Deletion Cost annotations) from the Provisioner, creating additional infrastructure coupling.

  5. No UI: An administrative and monitoring UI must be built entirely from scratch.

In contrast, Apache NiFi provides these capabilities out of the box -- RBAC, a web UI, flow management, backpressure, data provenance -- at the cost of a higher resource baseline and a UI-first paradigm.

Checked Architecture Principles

  • [partial] Model-driven data flow
  • [full] Distributed architecture with unified user experience
  • [full] Modular design
  • [full] Integrability via well-defined interfaces
  • [full] Open source by default
  • [full] Cloud-native architecture
  • [full] Standard solutions before custom development
  • [full] Self-contained deployment
  • [partial] Technological consistency to ensure maintainability
  • [full] Multi-tenancy
  • [partial] Security by design

Comments on partial ratings:

  • Model-driven data flow: NiFi uses its own flow-based paradigm (Process Groups, FlowFiles, Connections). Pipeline definitions in CIVITAS/CORE must remain platform-owned artifacts; the Configuration-Adapter translates platform models into NiFi Process Groups via the NiFi REST API. NiFi is the execution substrate, not the source of truth.
  • Technological consistency: NiFi is Java-based, which is consistent with the existing Spring Boot stack. However, it introduces its own runtime concepts (FlowFiles, Provenance, Controller Services, Bulletin Board) that the team must learn. The steep learning curve is a known trade-off.
  • Security by design: NiFi provides built-in RBAC with fine-grained policies at the Process Group, Processor, and Connection level, and supports LDAP/OIDC for user management. However, all Process Groups share a JVM -- there is no container-level isolation between pipelines. Per the threat model assessment (APT-level protection not required), this is acceptable. Network-level isolation is enforced via Kubernetes NetworkPolicies on the NiFi pods.

Decision

CIVITAS/CORE v2 adopts Apache NiFi as the pipeline engine for data integration and transformation.

This means:

  • Apache NiFi is the execution engine for ingestion, routing, transformation, enrichment, and delivery of data within CIVITAS/CORE pipelines.
  • NiFi is deployed as a StatefulSet cluster (2+ nodes) within the platform's Kubernetes namespace. No CRDs or ClusterRoles are required.
  • Each DataSet pipeline is modeled as a NiFi Process Group with dedicated policies controlling which users/groups can view, modify, and operate the pipeline.
  • Pipeline definitions remain platform-owned artifacts. The CIVITAS/CORE domain model and registry are the source of truth; NiFi Process Groups are derived runtime artifacts, managed via the Configuration-Adapter through the NiFi REST API.
  • Pipeline definitions are versioned in an external registry. The Configuration-Adapter imports versioned flow snapshots into NiFi.
  • NiFi's built-in RBAC is used for pipeline-level access control, with Civitas roles and DataSet permissions mapped to NiFi policies and user groups.
  • NiFi's Data Provenance provides auditability and replay capabilities for data flowing through pipelines.
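The Configuration-Adapter's translation step can be sketched as follows. This is a minimal illustration assuming the NiFi REST API's process-group creation endpoint (`POST /nifi-api/process-groups/{parentId}/process-groups`); the function name and the example pipeline name are hypothetical, not part of the platform model.

```python
import json

def to_nifi_process_group(nifi_url: str, parent_group_id: str,
                          pipeline_name: str) -> tuple[str, dict]:
    """Build the URL and request body for creating a NiFi Process Group.

    NiFi's REST API expects a 'revision' (version 0 for new components)
    and a 'component' describing the group to create. The parent group
    id appears in the URL path, not in the body.
    """
    url = f"{nifi_url}/nifi-api/process-groups/{parent_group_id}/process-groups"
    body = {
        "revision": {"version": 0},
        "component": {"name": pipeline_name},
    }
    return url, body

# The adapter would POST the JSON body to the returned URL.
url, body = to_nifi_process_group("https://nifi.example", "root",
                                  "dataset-42-ingest")
print(url)
print(json.dumps(body))
```

In a real adapter the derived Process Group id would then be recorded against the platform-owned pipeline definition, keeping the registry as the source of truth.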

Consequences

  • The Configuration-Adapter must be developed to translate Civitas pipeline models into NiFi Process Groups via the NiFi REST API, and to map Civitas RBAC to NiFi policies.
  • Resource requirements are higher than with Redpanda Connect: a 2-node NiFi cluster requires 4-8 GB RAM as baseline. This is a known trade-off for the built-in capabilities.
  • Pipeline isolation is logical (Process Group policies), not physical (container/process). This is acceptable per the threat model but must be documented as a security constraint. Processors with code execution capabilities (ExecuteScript, ExecuteGroovyScript, ExecuteStreamCommand) must be restricted to admin accounts via NiFi root-level policies. This mitigates the "authenticated user (managing)" threat class without requiring process-level isolation.

Note: Redpanda Connect provides stronger runtime isolation (OS-process-level separation with independent heaps), whereas NiFi's shared JVM means that unrestricted processors (e.g. ExecuteScript) could access memory of co-located pipelines. This trade-off is consciously accepted in favour of NiFi's built-in RBAC and reduced custom development effort. Mitigation: restrict ExecuteScript and similar processors via NiFi policies to admin-only access.
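The mitigation above can be expressed as a NiFi access policy that grants the restricted-components permission only to an admin group. A minimal sketch of the payload the adapter might send to NiFi's policies endpoint (`POST /nifi-api/policies`); the admin group id is hypothetical, and the exact resource path should be verified against the deployed NiFi version:

```python
def restricted_components_policy(admin_group_id: str) -> dict:
    """Build a NiFi access policy limiting restricted components
    (ExecuteScript, ExecuteGroovyScript, ExecuteStreamCommand, ...)
    to a single user group."""
    return {
        "revision": {"version": 0},
        "component": {
            # Root-level resource governing restricted components.
            "resource": "/restricted-components",
            "action": "write",
            "userGroups": [{"id": admin_group_id}],
            "users": [],
        },
    }

policy = restricted_components_policy("hypothetical-admin-group-id")
```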

  • The NiFi web UI can be used for operational monitoring and debugging. The decision on whether to expose the NiFi UI to platform users or to build a custom UI on top of the NiFi REST API is deferred to a subsequent design decision.
  • Monitoring integration via JMX/Prometheus Reporter into the central observability stack is required.
  • The platform should keep its pipeline model sufficiently engine-agnostic so that a future runtime change remains possible.
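The RBAC mapping the Configuration-Adapter must implement can be sketched as a lookup from platform permissions to NiFi policy resource/action pairs. The permission names below are hypothetical placeholders; the resource paths follow NiFi's policy model (read/write on a component resource, with a separate `/operation/...` resource for operate rights) and should be confirmed against the target NiFi version:

```python
# Hypothetical Civitas permission names mapped to NiFi policy
# (action, resource-template) pairs for a given Process Group.
PERMISSION_MAP = {
    "dataset.view":    ("read",  "/process-groups/{pg_id}"),
    "dataset.modify":  ("write", "/process-groups/{pg_id}"),
    "dataset.operate": ("write", "/operation/process-groups/{pg_id}"),
}

def to_nifi_policy(permission: str, pg_id: str) -> dict:
    """Translate one platform permission into a NiFi policy descriptor."""
    action, resource_tpl = PERMISSION_MAP[permission]
    return {"action": action, "resource": resource_tpl.format(pg_id=pg_id)}

print(to_nifi_policy("dataset.view", "abc-123"))
# {'action': 'read', 'resource': '/process-groups/abc-123'}
```

Keeping this mapping in one table-driven place supports the engine-agnosticism goal: a future runtime change replaces the table and the adapter, not the platform's permission model.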

Why NiFi over Redpanda Connect

| Dimension | Apache NiFi | Redpanda Connect |
| --- | --- | --- |
| RBAC | Built-in, fine-grained | Must be built from scratch |
| Pipeline isolation | Logical (RBAC policies on Process Groups), but shared JVM -- no protection against malicious code within the JVM | OS-process isolation (separate heap, PID), but shared pod network |
| UI | Built-in web UI | Must be built from scratch |
| Multi-tenancy | Process Groups with policies | Custom Provisioner + Supervisor |
| HTTP endpoints | Native HandleHttpRequest/Response | Complex port mapping in shared pods |
| Backpressure | Built-in at Connection level | max_in_flight, but no flow-level backpressure |
| Data Provenance | Built-in audit trail and replay | Not available |
| Custom development | Moderate (Configuration-Adapter, RBAC mapping) | High (Provisioner, Supervisor, RBAC, UI, Ingress routing) |
| Maturity | >10 years, Apache Foundation | Younger, Redpanda Inc. |
| Resource overhead | High (4-8 GB baseline) | Low (10-30 MB per process) |
| Dev experience | Steep learning curve, UI-first | Excellent, YAML/code-first |

The deciding factor is the significantly lower custom development effort. NiFi provides RBAC, UI, multi-tenancy, backpressure, provenance, and flow management out of the box. With Redpanda Connect, all of these would need to be built, tested, and maintained as custom code -- a Provisioner, Supervisor, RBAC layer, dynamic Ingress routing, and an administrative UI. The resource overhead of NiFi is a conscious trade-off for reducing development and operational complexity.

Alternatives

  • Redpanda Connect: Rejected as the default pipeline engine due to the high custom development effort required for multi-process orchestration, RBAC, UI, and HTTP endpoint routing in a multi-tenant platform scenario. Redpanda Connect remains a viable option for lightweight, standalone integration tasks outside the core pipeline engine. See sketch_redpanda_connect.md for the detailed architectural analysis.
  • Kestra: Rejected because core security features (RBAC, SSO/OIDC) are only available in the commercial Enterprise Edition. See ruled_out.md.
  • Argo Workflows: Rejected because it requires CRDs and ClusterRoles. See ruled_out.md.
  • Apache Flink, Kafka Streams, Apache Airflow, and others: Rejected for various reasons documented in ruled_out.md.

See also