ADR 045: Use JSON, JSON Schema, and JSONata for the Model-Centric Data Flow
Status
Accepted
Date: 2026-06-19
Decision Makers: Department Assembly (Abteilungsversammlung)
Context
For CIVITAS/CORE v2, the model-centric data flow needs a technology stack that allows business intent to be authored directly, validated consistently, and transformed into technical projections without coupling the solution to a specific runtime or a custom code-generation pipeline.
The model-centric approach is not meant as model-driven software generation at runtime. It means that the model itself remains the source of truth and that technology-specific realizations are derived from that model in a controlled and transparent way.
In practical terms, the solution must support:
- plain model artifacts that can be inspected and exchanged without special tooling
- structural validation that is explicit and repeatable
- derivation of downstream representations for events, target configurations, and technical integration points
- a stack that can grow incrementally without forcing the team into a heavyweight modeling ecosystem too early
The selected stack was evaluated against two competing approaches:
- Fennec EMF + ModelAtlas
- JSON Schema + LinkML
The decision needed to answer three questions:
- Which primary representation do we use for model artifacts?
- How do we validate structure and completeness?
- How do we derive downstream views, envelopes, and target-specific configuration from the canonical model?
The evaluation in the attached decision paper compared the options on model interoperability, standard compliance, maintainability, performance, integration capability, and lifecycle cost. All three candidate stacks can cover the core requirements, but with different trade-offs in ownership, ecosystem weight, and implementation complexity.
Checked Architecture Principles
| Rating | Principle |
|---|---|
| full | Model-centric data flow |
| full | Distributed architecture with unified user experience |
| full | Modular design |
| full | Integration capability through defined interfaces |
| full | Open source as the default |
| full | Cloud-native architecture |
| full | Prefer standard solutions over custom development |
| full | Self-contained deployment |
| full | Technological consistency to ensure maintainability |
| none | Multi-tenancy |
| partial | Security by design |
Decision
We adopt the combination JSON + JSON Schema + JSONata as the technology basis for implementing the model-centric data flow in CIVITAS/CORE v2.
- JSON is the canonical representation for model artifacts.
- JSON Schema defines the structural contract, validation rules, and evolution boundaries for those artifacts.
- JSONata is used to derive downstream views and target-specific projections from the canonical model.
This decision was approved by the Department Assembly. It is intentionally model-centric: the source model is expressed directly as data, not as generated runtime code.
This choice keeps the implementation close to the data format that the rest of the platform already understands. JSON is widely supported across tooling and services, JSON Schema provides a mature structural contract, and JSONata offers a declarative transformation layer that stays readable for the team.
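To make the first two roles concrete, the following is a minimal sketch of a canonical model artifact and a structural check against a schema. The artifact, its field names, and the `validate` helper are hypothetical and only illustrate the idea. The helper covers a small subset of JSON Schema keywords (`type`, `required`, `properties`, `items`) using the Python standard library; a real setup would presumably rely on a conformant JSON Schema validator instead.

```python
import json

# Hypothetical canonical model artifact (all field names are illustrative).
MODEL_JSON = """
{
  "id": "air-quality-sensor",
  "version": "1.0.0",
  "attributes": [
    {"name": "pm10", "type": "number", "unit": "ug/m3"},
    {"name": "stationId", "type": "string"}
  ]
}
"""

# Hypothetical schema for that artifact, restricted to a few keywords.
SCHEMA = {
    "type": "object",
    "required": ["id", "version", "attributes"],
    "properties": {
        "id": {"type": "string"},
        "version": {"type": "string"},
        "attributes": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["name", "type"],
                "properties": {
                    "name": {"type": "string"},
                    "type": {"type": "string"},
                    "unit": {"type": "string"},
                },
            },
        },
    },
}

# Mapping from JSON Schema type names to Python types (subset only).
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "boolean": bool,
}


def validate(instance, schema, path="$"):
    """Return a list of error strings (subset: type, required, properties, items)."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(instance, _TYPES[expected])
        # bool is a subclass of int in Python, but a JSON boolean is not a number.
        if expected == "number" and isinstance(instance, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required property '{key}'")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(validate(instance[key], sub_schema, f"{path}.{key}"))
    if isinstance(instance, list) and "items" in schema:
        for index, item in enumerate(instance):
            errors.extend(validate(item, schema["items"], f"{path}[{index}]"))
    return errors


model = json.loads(MODEL_JSON)
errors_ok = validate(model, SCHEMA)

# A deliberately incomplete artifact to show what a violation report looks like.
broken = {"id": "x", "attributes": [{"name": 1}]}
errors_broken = validate(broken, SCHEMA)
for message in errors_broken:
    print(message)
```

The point of the sketch is that the artifact stays plain data and the contract is itself a JSON document, so both can be reviewed and versioned without special tooling.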
Rationale
- Readable and portable. JSON is easy to inspect, diff, exchange, and persist.
- Strong structural validation. JSON Schema gives explicit contracts for shape, required fields, types, and compatibility rules.
- Declarative transformation layer. JSONata enables controlled derivation of technical views without writing ad hoc mapping code for each target.
- Stable source of truth. The model remains the authoritative input; technical representations are derived from it.
- Low implementation overhead. The stack uses established standards and avoids a custom DSL or runtime code generation framework.
- Incremental adoption. The stack can start small and be extended as use cases grow, without a platform-wide tooling migration.
- Good fit for the platform direction. The chosen technology aligns with the model-centric data flow and the desire to keep the authoring language understandable for domain users and architects alike.
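As an illustration of the declarative transformation layer, the sketch below shows what a JSONata expression for an event envelope might look like, stored as data next to the model. The expression, the envelope shape, and all field names are assumptions made for this example. The Python function is a hand-written stand-in intended to mirror the expression; it does not evaluate JSONata, which in practice would be done by a JSONata engine.

```python
import copy
import json

# Illustrative JSONata expression, kept as data. Not evaluated in this sketch.
EVENT_ENVELOPE_JSONATA = """
{
  "eventType": id & ".updated",
  "schemaVersion": version,
  "fields": [attributes.name]
}
"""

# Hypothetical canonical model (same illustrative shape as a plain JSON document).
model = {
    "id": "air-quality-sensor",
    "version": "1.0.0",
    "attributes": [
        {"name": "pm10", "type": "number", "unit": "ug/m3"},
        {"name": "stationId", "type": "string"},
    ],
}


def project_event_envelope(source: dict) -> dict:
    """Hand-written mirror of the expression above; builds a new dict from the model."""
    return {
        "eventType": source["id"] + ".updated",
        "schemaVersion": source["version"],
        "fields": [attribute["name"] for attribute in source["attributes"]],
    }


snapshot = copy.deepcopy(model)  # used to confirm the source model is left untouched
envelope = project_event_envelope(model)
print(json.dumps(envelope, indent=2))
```

Because the projection only reads the model and returns a new document, each target can have its own expression while the canonical artifact stays unchanged.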
Consequences
- Model artifacts can be authored and reviewed as plain JSON documents.
- Validation becomes explicit and repeatable across tools and processes.
- Transformation logic is declarative and can be tested independently from the consuming runtime.
- Downstream components may consume derived JSON representations without changing the canonical model.
- The team must define conventions for schema versioning, transformation ownership, and naming.
- This decision does not introduce runtime code generation; the model is transformed as data, not compiled into executable business code at runtime.
- The implementation needs a clear split between canonical model, derived views, and target-specific adapters so that responsibilities remain understandable.
- Governance must cover schema evolution, backwards compatibility, and the ownership of transformation expressions.
- The selected stack favors clarity and portability over maximal modeling expressiveness.
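The governance points on schema evolution and backwards compatibility could be supported by an automated check between schema versions. The following is a minimal sketch under an assumed convention that this ADR does not itself define: a new version is treated as breaking if it removes a property, changes a property's `type`, or makes a previously optional property required. The schemas and the `breaking_changes` helper are hypothetical and only look at top-level properties.

```python
def breaking_changes(old: dict, new: dict) -> list[str]:
    """List changes that would break consumers of the old schema (assumed rules)."""
    problems = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    for name, spec in old_props.items():
        if name not in new_props:
            problems.append(f"property '{name}' was removed")
        elif spec.get("type") != new_props[name].get("type"):
            problems.append(f"property '{name}' changed type")
    old_required = old.get("required", [])
    for name in new.get("required", []):
        if name not in old_required:
            problems.append(f"property '{name}' became required")
    return problems


# Hypothetical schema versions for illustration.
SCHEMA_V1 = {
    "type": "object",
    "required": ["id", "version"],
    "properties": {
        "id": {"type": "string"},
        "version": {"type": "string"},
        "unit": {"type": "string"},
    },
}

# Adds an optional property only: compatible under the assumed rules.
SCHEMA_V2_COMPATIBLE = {
    "type": "object",
    "required": ["id", "version"],
    "properties": {
        "id": {"type": "string"},
        "version": {"type": "string"},
        "unit": {"type": "string"},
        "description": {"type": "string"},
    },
}

# Changes a type and tightens a requirement: breaking under the assumed rules.
SCHEMA_V2_BREAKING = {
    "type": "object",
    "required": ["id", "version", "unit"],
    "properties": {
        "id": {"type": "string"},
        "version": {"type": "number"},
        "unit": {"type": "string"},
    },
}

compatible_report = breaking_changes(SCHEMA_V1, SCHEMA_V2_COMPATIBLE)
breaking_report = breaking_changes(SCHEMA_V1, SCHEMA_V2_BREAKING)
for problem in breaking_report:
    print(problem)
```

Which rules actually count as breaking is exactly the kind of convention the team still has to agree on; the sketch only shows that such rules can be checked mechanically once they exist.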
Alternatives considered
Fennec EMF + ModelAtlas
This option was attractive because it offers a rich modeling ecosystem with a strong metamodeling story. It can support model linking, transformation, and persistence in one integrated environment. The drawback is that it brings a comparatively heavy stack into the core of the platform and therefore creates a stronger ownership obligation.
The attached decision paper highlights exactly this tension: the approach is powerful, but the community would need to commit to long-term maintenance, integration, and lifecycle ownership of a modeling environment that is not lightweight. For the expected v2 scope, that is a significant operational and organizational commitment.
JSON Schema + LinkML
This option was the strongest competitor to the selected stack. It combines a dedicated modeling language with JSON Schema as the transport and validation layer. The result is expressive and standards-friendly, and it offers a good balance between structure and interoperability.
It was nevertheless not selected as the primary path because it introduces an additional ecosystem and an extra conceptual layer on top of the JSON-based model artifacts. For the immediate implementation path, the team preferred a solution that stays closer to the native JSON representation, keeps the learning curve lower, and allows the model-centric data flow to grow step by step.
Why the selected stack won
The chosen JSON + JSON Schema + JSONata combination is less ambitious than the most feature-rich modeling options, but it is easier to explain, easier to operate, and easier to evolve. That matters because the model-centric data flow is not a side feature; it is a central platform principle. A solution that can be understood by the wider team and supported over the long term has higher practical value than a more powerful stack that would require substantial additional governance from day one.