Data Modelling
Model Management manages domain data structures, their relationships, versions, dependencies and transformations. It uses a PostgreSQL-backed artifact registry (owned by Model Management) as the persistent artifact store and builds a semantic model layer on top of it: the registry's primary objects are not files but domain model artifacts represented as JSON Schema.
JSON Schema as the canonical model
Every DataStructure is modelled as a standalone JSON Schema 2020-12 document:
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "urn:core:platform:civitas:datastructure:common:WeatherObservation:1.0.0",
  "title": "WeatherObservation",
  "type": "object",
  "properties": {
    "sensorId": { "type": "string" },
    "location": { "$ref": "urn:core:platform:civitas:datastructure:common:GeoPoint:1.0.0" }
  }
}
Model Management does not introduce a separate attribute, class or slot metamodel. The domain truth lives in the JSON Schema itself.
Persistence model
Every DataStructure is stored as its own artifact row (with one
artifact_version per version), keyed by the logical URN (without version).
Versions are backend-owned SemVer; the version is simultaneously the last
segment of the versioned URN:
| Field | Value |
|---|---|
| artifact type | datastructure (the authored format jsonschema/xsd is a per-version representation, not part of the type) |
| logical URN | urn:core:platform:civitas:datastructure:common:WeatherObservation |
| versions | 1.0.0, 1.1.0, 2.0.0, … |
Updates request a SemVer bump (?versionBump=patch|minor|major, default
patch); the backend assigns the next version and never overwrites an older one.
Model Management does not perform schema-compatibility checks.
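The relationship between the versioned URN, the logical URN and a requested bump can be sketched as follows. This is a minimal illustration, not the backend's implementation: the function names are hypothetical, and it assumes the backend applies the standard SemVer increment rules (lower segments reset to zero) to the latest stored version.

```python
def split_versioned_urn(urn: str) -> tuple[str, str]:
    """Split a versioned URN into (logical URN, version).

    The version is the last colon-separated segment of the URN.
    """
    logical, _, version = urn.rpartition(":")
    return logical, version


def next_version(latest: str, bump: str = "patch") -> str:
    """Return the version following `latest` for the requested bump.

    Assumption: standard SemVer increments; the default mirrors the
    documented default of versionBump=patch.
    """
    major, minor, patch = (int(part) for part in latest.split("."))
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    if bump == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown versionBump: {bump!r}")


urn = "urn:core:platform:civitas:datastructure:common:WeatherObservation:1.0.0"
logical, version = split_versioned_urn(urn)
# A minor bump produces a new versioned URN; the 1.0.0 artifact version stays untouched.
bumped = f"{logical}:{next_version(version, 'minor')}"
```

Because the registry never overwrites an older version, the result is always an additional artifact_version under the same logical URN.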
References between schemas
Structural composition, that is, embedding one DataStructure's shape into another, is
modelled exclusively through the standard JSON Schema mechanisms $ref, allOf, oneOf,
anyOf and $defs; no CORE-specific extensions are required. A by-reference link
(a foreign key by URN) instead uses the x-core-ref annotation, described at the end of
this section.
Inheritance and extension: allOf
{
  "allOf": [
    { "$ref": "urn:core:platform:civitas:datastructure:common:ObservationBase:1.0.0" },
    {
      "type": "object",
      "properties": {
        "temperature": { "type": "number" }
      }
    }
  ]
}
Polymorphism: oneOf
{
  "oneOf": [
    { "$ref": "urn:core:platform:civitas:datastructure:common:MqttSource:1.0.0" },
    { "$ref": "urn:core:platform:civitas:datastructure:common:HttpSource:1.0.0" }
  ]
}
Combinable variants: anyOf
{
  "anyOf": [
    { "$ref": "urn:core:platform:civitas:datastructure:common:TemperatureSensor:1.0.0" },
    { "$ref": "urn:core:platform:civitas:datastructure:common:HumiditySensor:1.0.0" }
  ]
}
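The three keywords above differ only in how many of their subschemas must accept an instance. The sketch below shows that rule in isolation; `combine` is a hypothetical helper, and the per-branch results are assumed to come from a real JSON Schema validator.

```python
def combine(keyword: str, branch_results: list[bool]) -> bool:
    """Apply the JSON Schema applicator rule to per-branch validation results."""
    matches = sum(branch_results)
    if keyword == "allOf":
        # Inheritance/extension: every subschema must accept the instance.
        return matches == len(branch_results)
    if keyword == "anyOf":
        # Combinable variants: at least one subschema must accept it.
        return matches >= 1
    if keyword == "oneOf":
        # Polymorphism: exactly one subschema must accept it.
        return matches == 1
    raise ValueError(f"unsupported keyword: {keyword!r}")


# An instance that satisfies both the temperature and the humidity branch:
both_match = [True, True]
as_any_of = combine("anyOf", both_match)   # accepted
as_one_of = combine("oneOf", both_match)   # rejected: more than one branch matched
```

This is why overlapping variants are modelled with anyOf, while mutually exclusive variants such as source types fit oneOf.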
Each external $ref becomes an edge in the dependency graph and a stored
artifact reference (a row in artifact_reference); the full graph, including
cycles, is stored.
This enables reference validation, dependency analysis, impact analysis and the
generated views.
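How external references turn into graph edges can be sketched with a plain recursive walk over the schema document. The helper names are hypothetical, and the walk is deliberately naive: it treats every string-valued "$ref" key as a reference and assumes that any reference not starting with "#" is external.

```python
from typing import Any, Iterator


def iter_refs(node: Any) -> Iterator[str]:
    """Yield every string-valued "$ref" found anywhere in a schema document."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def external_refs(schema: dict) -> set[str]:
    """Return the references that point outside the document.

    Assumption: local references start with "#" (for example "#/$defs/unit").
    """
    return {ref for ref in iter_refs(schema) if not ref.startswith("#")}


schema = {
    "$id": "urn:core:platform:civitas:datastructure:common:WeatherObservation:1.0.0",
    "type": "object",
    "properties": {
        "sensorId": {"type": "string"},
        "unit": {"$ref": "#/$defs/unit"},
        "location": {"$ref": "urn:core:platform:civitas:datastructure:common:GeoPoint:1.0.0"},
    },
    "$defs": {"unit": {"type": "string"}},
}

# One (source, target) edge per external reference.
edges = sorted((schema["$id"], target) for target in external_refs(schema))
```

A production implementation would also need to skip keywords whose values are data rather than schemas (for example const, enum or examples), which this sketch does not do.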
Reference by URN (foreign key): x-core-ref
The mechanisms above ($ref, allOf, oneOf, anyOf) express composition: they embed the target,
so the value is an instance shaped like the referenced schema. The referenced schema is inlined
only when a dereferenced or bundled view is requested; the default read keeps the raw $ref URN.
To link a DataStructure to another without embedding it, a plain string field carries the
target's CORE URN and is marked with x-core-ref. This is a foreign key (a UML association):
nothing is embedded, the target stays an independent artifact, and its existence
is checked against the registry on import. x-core-ref is the by-reference counterpart to $ref's
by-value embedding; see the
$ref vs. x-core-ref mapping.
{
  "stehtAn": {
    "type": "string",
    "pattern": "^urn:",
    "x-core-ref": { "type": "urn:core:platform:civitas:datastructure:common:Strasse:1.0.0" }
  }
}
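The existence check on import could look roughly like the sketch below. It assumes that the URN under the annotation's "type" key is what gets checked, and that the registry can be represented as a set of known versioned URNs; both the helper names and that representation are illustrative only.

```python
from typing import Any, Iterator


def iter_core_ref_targets(node: Any) -> Iterator[str]:
    """Yield the "type" URN of every x-core-ref annotation in a schema fragment."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "x-core-ref" and isinstance(value, dict):
                target = value.get("type")
                if isinstance(target, str):
                    yield target
            else:
                yield from iter_core_ref_targets(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_core_ref_targets(item)


def missing_core_ref_targets(schema: dict, registry_urns: set[str]) -> list[str]:
    """Return the x-core-ref targets that are not present in the registry."""
    return sorted(set(iter_core_ref_targets(schema)) - registry_urns)


STRASSE = "urn:core:platform:civitas:datastructure:common:Strasse:1.0.0"
properties = {
    "stehtAn": {
        "type": "string",
        "pattern": "^urn:",
        "x-core-ref": {"type": STRASSE},
    }
}

ok = missing_core_ref_targets(properties, {STRASSE})   # nothing missing
broken = missing_core_ref_targets(properties, set())   # target unknown to the registry
```

Unlike a $ref, the annotated field contributes nothing to the shape of the instance beyond a string: the link is resolved by lookup, not by embedding.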
Registry semantics
Model Management interprets all JSON Schema references as a directed model graph. Per DataStructure the registry knows:
- References: outgoing $refs in the content
- Dependencies / Dependents: forward and reverse edges in the graph
- Versions: the stored artifact_version history
- Metadata: title, authored format (jsonschema|xsd), availableFormats (stored + derivable), content type
Domain relationships beyond data structure (ownership, governance, responsibilities) are deliberately not part of the core model. The core is based exclusively on JSON Schema, JSON Schema references, the stored artifact references and the model graph that emerges from them.
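The forward and reverse edges listed above lend themselves to a small graph sketch. The names below are shortened stand-ins for full URNs, the functions are hypothetical, and the traversal keeps a visited set so that stored cycles do not cause endless loops.

```python
from collections import defaultdict


def reverse_edges(dependencies: dict[str, set[str]]) -> dict[str, set[str]]:
    """Derive the dependents map (reverse edges) from the dependencies map."""
    dependents: dict[str, set[str]] = defaultdict(set)
    for source, targets in dependencies.items():
        for target in targets:
            dependents[target].add(source)
    return dict(dependents)


def transitive(start: str, edges: dict[str, set[str]]) -> set[str]:
    """Return every node reachable from `start`, tolerating cycles.

    If `start` is part of a cycle, it appears in its own result.
    """
    seen: set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in edges.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


# Forward edges: which artifacts each DataStructure references.
dependencies = {
    "WeatherObservation": {"ObservationBase", "GeoPoint"},
    "ObservationBase": {"GeoPoint"},
}
dependents = reverse_edges(dependencies)

# Impact analysis: everything affected by a change to GeoPoint.
impacted = transitive("GeoPoint", dependents)
```

Running the same traversal over `dependencies` instead of `dependents` answers the opposite question: what a given DataStructure needs in order to be resolved.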
Views over the same artifacts
Different consumers need different projections of the same model inventory. All views are generated on demand and never stored:
| View | Purpose |
|---|---|
| Persistence view | The individual stored artifact version (raw stored schema) |
| Registry view | Semantic view: references, dependencies, dependents, versions |
| Dereferenced view | Fully resolved single document — runtime, code generation, export |
| Bundled view | All transitive dependencies embedded under $defs — portable, self-contained |
| TypeScript / Zod | Generated client types (published as @civitasconnect/core-model-forge-types) |
The dereferenced and bundled views are described with examples in Model-Centric Data Flow.
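As a rough illustration of the bundled view, the sketch below embeds all transitive dependencies under $defs while leaving the $ref URNs unchanged; each embedded schema keeps its own $id, which is how JSON Schema 2020-12 compound documents resolve such references. The function names, the in-memory registry dictionary and the choice of the URN as the $defs key are assumptions, not the actual output layout of Model Management.

```python
import copy
from typing import Any, Iterator


def iter_refs(node: Any) -> Iterator[str]:
    """Yield every string-valued "$ref" in a schema document."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def bundle(root_urn: str, registry: dict[str, dict]) -> dict:
    """Return a copy of the root schema with transitive dependencies under $defs.

    Raises KeyError if a referenced URN is not in the registry.
    """
    root = copy.deepcopy(registry[root_urn])
    defs = root.setdefault("$defs", {})
    embedded: set[str] = set()
    pending = [root]
    while pending:
        schema = pending.pop()
        # Materialise the refs first, because $defs is mutated below.
        for ref in sorted(set(iter_refs(schema))):
            if ref.startswith("#") or ref == root_urn or ref in embedded:
                continue  # local ref, cycle back to the root, or already embedded
            embedded.add(ref)
            target = copy.deepcopy(registry[ref])
            defs[ref] = target  # assumption: the URN doubles as the $defs key
            pending.append(target)
    return root


WEATHER = "urn:core:platform:civitas:datastructure:common:WeatherObservation:1.0.0"
GEO = "urn:core:platform:civitas:datastructure:common:GeoPoint:1.0.0"
registry = {
    WEATHER: {"$id": WEATHER, "type": "object", "properties": {"location": {"$ref": GEO}}},
    GEO: {"$id": GEO, "type": "object", "properties": {"lat": {"type": "number"}}},
}

bundled = bundle(WEATHER, registry)
```

The stored schemas are only copied, never modified, which matches the statement that views are generated on demand rather than persisted.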
DataSets as composition objects
DataSets contain no embedded data structures — they reference registry objects by URN:
{
  "$schema": "https://civitasconnect.digital/core-dataset/v1",
  "id": "urn:core:platform:civitas:dataset:common:weather-import:1.0.0",
  "dataStructureRefs": [
    "urn:core:platform:civitas:datastructure:common:WeatherObservation:1.0.0",
    "urn:core:platform:civitas:datastructure:common:NgsiObservation:1.0.0"
  ],
  "mappingRefs": [
    "urn:core:platform:civitas:mapping:common:weather-to-ngsi:1.0.0"
  ],
  "pipelineRefs": [
    "urn:core:platform:civitas:pipeline:common:weather-import:1.0.0"
  ]
}
DataSets are domain composition objects; the actual models remain reusable registry artifacts. The full manifest format is specified in the CORE-IR Reference.
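Because a DataSet only holds URNs, resolving it is a matter of looking up each reference in the registry. The sketch below collects the reference lists of the manifest shown above and reports URNs that are unknown. It assumes that every key ending in "Refs" holds a list of URNs and models the registry as a plain set; whether and when the platform performs such a check is not stated here.

```python
def manifest_refs(manifest: dict) -> dict[str, list[str]]:
    """Collect all reference lists of a DataSet manifest, keyed by field name."""
    return {
        key: list(value)
        for key, value in manifest.items()
        if key.endswith("Refs") and isinstance(value, list)
    }


def unresolved(manifest: dict, registry_urns: set[str]) -> list[str]:
    """Return the referenced URNs that the registry does not know."""
    return sorted(
        urn
        for refs in manifest_refs(manifest).values()
        for urn in refs
        if urn not in registry_urns
    )


manifest = {
    "$schema": "https://civitasconnect.digital/core-dataset/v1",
    "id": "urn:core:platform:civitas:dataset:common:weather-import:1.0.0",
    "dataStructureRefs": [
        "urn:core:platform:civitas:datastructure:common:WeatherObservation:1.0.0",
        "urn:core:platform:civitas:datastructure:common:NgsiObservation:1.0.0",
    ],
    "mappingRefs": ["urn:core:platform:civitas:mapping:common:weather-to-ngsi:1.0.0"],
    "pipelineRefs": ["urn:core:platform:civitas:pipeline:common:weather-import:1.0.0"],
}

# A registry that knows everything except the pipeline.
known = {
    "urn:core:platform:civitas:datastructure:common:WeatherObservation:1.0.0",
    "urn:core:platform:civitas:datastructure:common:NgsiObservation:1.0.0",
    "urn:core:platform:civitas:mapping:common:weather-to-ngsi:1.0.0",
}
missing = unresolved(manifest, known)
```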
Long-term vision
Model Management forms the central semantic knowledge base of the platform: all modelled objects (DataStructures, Mappings, Pipelines, DataSources, DataSinks, DataSets) have global identities and are managed as one coherent model graph. JSON Schema describes the structure, and the PostgreSQL registry stores artifacts, versions and references. Model Management interprets the semantic model graph and provides different views over the same body of knowledge, staying compatible with JSON Schema, OpenAPI, TypeScript and Zod without maintaining a second semantic model.