Version: V2-Next

Data Modelling

Model Management manages domain data structures together with their relationships, versions, dependencies and transformations. It uses a PostgreSQL-backed artifact registry (owned by Model Management) as the persistent artifact store and builds a semantic model layer on top of it: the registry's primary objects are not files but domain model artifacts represented as JSON Schema.

JSON Schema as the canonical model

Every DataStructure is modelled as a standalone JSON Schema 2020-12 document:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "urn:core:platform:civitas:datastructure:common:WeatherObservation:1.0.0",
  "title": "WeatherObservation",
  "type": "object",
  "properties": {
    "sensorId": { "type": "string" },
    "location": { "$ref": "urn:core:platform:civitas:datastructure:common:GeoPoint:1.0.0" }
  }
}

Model Management does not introduce a separate attribute, class or slot metamodel. The domain truth lives in the JSON Schema itself.

Persistence model

Every DataStructure is stored as its own artifact row (with one artifact_version per version), keyed by the logical URN (without version). Versions follow SemVer and are assigned by the backend; the version is also the last segment of the versioned URN:

| Field | Value |
| --- | --- |
| artifact type | datastructure (the authored format jsonschema/xsd is a per-version representation, not part of the type) |
| logical URN | urn:core:platform:civitas:datastructure:common:WeatherObservation |
| versions | 1.0.0, 1.1.0, 2.0.0, … |

Updates request a SemVer bump (?versionBump=patch|minor|major, default patch); the backend assigns the next version and never overwrites an older one. Model Management does not perform schema-compatibility checks.
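
The relationship between logical URN, versioned URN and version bump can be sketched in a few lines of Python. This is an illustration only, not the backend's code: split_versioned_urn and next_version are hypothetical helper names, and applying the bump to the highest stored version is an assumption.

```python
from typing import NamedTuple


class VersionedUrn(NamedTuple):
    logical_urn: str  # registry key, without version
    version: str      # SemVer string, last URN segment


def split_versioned_urn(urn: str) -> VersionedUrn:
    """Split a versioned URN into its logical URN and its version segment."""
    logical_urn, _, version = urn.rpartition(":")
    parts = version.split(".")
    if not logical_urn or len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a versioned URN: {urn!r}")
    return VersionedUrn(logical_urn, version)


def next_version(existing: list[str], bump: str = "patch") -> str:
    """Return the version a bump would produce (assumption: bump the highest one)."""
    # Compare numerically so that 1.10.0 sorts above 1.9.0.
    major, minor, patch = max(tuple(int(p) for p in v.split(".")) for v in existing)
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    if bump == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown versionBump: {bump!r}")


urn = "urn:core:platform:civitas:datastructure:common:WeatherObservation:1.1.0"
logical_urn, version = split_versioned_urn(urn)
print(logical_urn, next_version(["1.0.0", version], "minor"))
```

Because a bump always produces a new version string, an existing artifact_version is never a write target, which matches the rule that older versions are not overwritten.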

References between schemas

Structural composition, that is, embedding one DataStructure's shape into another, is modelled exclusively through the standard JSON Schema mechanisms $ref, allOf, oneOf, anyOf and $defs; no CORE-specific extensions are required. A by-reference link (a foreign key by URN) uses the x-core-ref annotation instead, described at the end of this section.

Inheritance and extension: allOf

{
  "allOf": [
    { "$ref": "urn:core:platform:civitas:datastructure:common:ObservationBase:1.0.0" },
    {
      "type": "object",
      "properties": {
        "temperature": { "type": "number" }
      }
    }
  ]
}

Polymorphism: oneOf

{
  "oneOf": [
    { "$ref": "urn:core:platform:civitas:datastructure:common:MqttSource:1.0.0" },
    { "$ref": "urn:core:platform:civitas:datastructure:common:HttpSource:1.0.0" }
  ]
}

Combinable variants: anyOf

{
  "anyOf": [
    { "$ref": "urn:core:platform:civitas:datastructure:common:TemperatureSensor:1.0.0" },
    { "$ref": "urn:core:platform:civitas:datastructure:common:HumiditySensor:1.0.0" }
  ]
}
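
The three keywords differ only in how many subschemas an instance must satisfy: allOf requires every one, oneOf exactly one, anyOf at least one. The Python sketch below illustrates that counting rule. It is not a JSON Schema validator: the two check functions are hypothetical stand-ins for validating against the referenced TemperatureSensor and HumiditySensor schemas.

```python
from typing import Callable, Sequence

Check = Callable[[dict], bool]


def all_of(instance: dict, checks: Sequence[Check]) -> bool:
    """allOf: the instance must satisfy every subschema."""
    return all(check(instance) for check in checks)


def one_of(instance: dict, checks: Sequence[Check]) -> bool:
    """oneOf: the instance must satisfy exactly one subschema."""
    return sum(1 for check in checks if check(instance)) == 1


def any_of(instance: dict, checks: Sequence[Check]) -> bool:
    """anyOf: the instance must satisfy at least one subschema."""
    return any(check(instance) for check in checks)


# Hypothetical stand-ins for validating against the referenced schemas.
def is_temperature_sensor(value: dict) -> bool:
    return isinstance(value.get("temperature"), (int, float))


def is_humidity_sensor(value: dict) -> bool:
    return isinstance(value.get("humidity"), (int, float))


checks = [is_temperature_sensor, is_humidity_sensor]
combined = {"temperature": 21.5, "humidity": 0.4}

print(any_of(combined, checks))  # True: variants may be combined
print(one_of(combined, checks))  # False: two variants match, not exactly one
```

This is why anyOf suits combinable variants such as a sensor that measures both temperature and humidity, while oneOf suits mutually exclusive alternatives such as MqttSource versus HttpSource.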

Each external $ref becomes an edge in the dependency graph and a stored artifact reference (a row in artifact_reference); the full graph, including cycles, is stored. This enables reference validation, dependency analysis, impact analysis and the generated views.
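
How outgoing edges could be derived from a schema is sketched below in Python. This is a simplified illustration under stated assumptions, not the registry's implementation: collect_external_refs is a hypothetical name, "external" is taken to mean any $ref that does not start with "#", and the walk treats every "$ref" key as a reference, so a property that happens to be named "$ref" would be misread.

```python
def collect_external_refs(schema) -> set[str]:
    """Return every $ref target in the schema that is not a local '#...' pointer."""
    refs: set[str] = set()

    def walk(node) -> None:
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and not ref.startswith("#"):
                refs.add(ref)
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(schema)
    return refs


source = "urn:core:platform:civitas:datastructure:common:WeatherObservation:1.0.0"
schema = {
    "allOf": [
        {"$ref": "urn:core:platform:civitas:datastructure:common:ObservationBase:1.0.0"},
        {"type": "object", "properties": {"point": {"$ref": "#/$defs/Point"}}},
    ],
    "$defs": {"Point": {"type": "object"}},
}

# One (source, target) pair per external reference; local pointers add no edge.
edges = {(source, target) for target in collect_external_refs(schema)}
print(edges)
```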

Reference by URN (foreign key): x-core-ref

The mechanisms above ($ref, allOf, oneOf, anyOf) embed the target by value (composition): the value is an instance shaped like the referenced schema. The target is inlined only when a dereferenced or bundled view is requested; the default read keeps the raw $ref URN. To link one DataStructure to another without embedding it, a plain string field carries the target's CORE URN and is marked with x-core-ref. This acts as a foreign key (a UML association): nothing is embedded, the target stays an independent artifact, and its existence is checked against the registry on import. x-core-ref is the by-reference counterpart to $ref's by-value embedding; see the $ref vs. x-core-ref mapping.

{
  "stehtAn": {
    "type": "string",
    "pattern": "^urn:",
    "x-core-ref": { "type": "urn:core:platform:civitas:datastructure:common:Strasse:1.0.0" }
  }
}
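
The import-time existence check for x-core-ref targets could look roughly like the following Python sketch. The function names are hypothetical, and looking the target up by exact versioned URN in a set of known URNs is an assumption; the source only states that existence is checked against the registry.

```python
def find_core_refs(node, path: str = "") -> list[tuple[str, str]]:
    """Return (location, target URN) for every x-core-ref annotation in a schema."""
    found: list[tuple[str, str]] = []
    if isinstance(node, dict):
        annotation = node.get("x-core-ref")
        if isinstance(annotation, dict) and isinstance(annotation.get("type"), str):
            found.append((path or "/", annotation["type"]))
        for key, value in node.items():
            if key != "x-core-ref":  # the annotation itself is not a subschema
                found.extend(find_core_refs(value, f"{path}/{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_core_refs(item, f"{path}/{index}"))
    return found


def missing_targets(schema, known_urns: set[str]) -> list[tuple[str, str]]:
    """Return the x-core-ref links whose target is not a known artifact."""
    return [(path, urn) for path, urn in find_core_refs(schema) if urn not in known_urns]


strasse = "urn:core:platform:civitas:datastructure:common:Strasse:1.0.0"
schema = {
    "type": "object",
    "properties": {
        "stehtAn": {"type": "string", "pattern": "^urn:", "x-core-ref": {"type": strasse}},
    },
}

print(missing_targets(schema, known_urns={strasse}))  # [] when the target exists
print(missing_targets(schema, known_urns=set()))
```

Unlike a $ref, the annotated field stays a plain string in every view, so the check concerns only whether the target exists, not the shape of the value.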

Registry semantics

Model Management interprets all JSON Schema references as a directed model graph. Per DataStructure the registry knows:

  • References — outgoing $refs in the content
  • Dependencies / Dependents — forward and reverse edges in the graph
  • Versions — the stored artifact_version history
  • Metadata — title, authored format (jsonschema | xsd), availableFormats (stored + derivable), content type
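
The dependencies and dependents entries are two readings of the same edge set. As a rough Python sketch (hypothetical function names, with an in-memory dict standing in for the artifact_reference rows), reverse edges and a cycle-safe impact analysis could be computed like this:

```python
from collections import defaultdict, deque


def reverse_edges(dependencies: dict[str, set[str]]) -> dict[str, set[str]]:
    """Turn 'X depends on Y' edges into 'Y is depended on by X' edges."""
    dependents: dict[str, set[str]] = defaultdict(set)
    for source, targets in dependencies.items():
        for target in targets:
            dependents[target].add(source)
    return dict(dependents)


def impacted_by(urn: str, dependencies: dict[str, set[str]]) -> set[str]:
    """Return every artifact that depends on urn, directly or transitively."""
    dependents = reverse_edges(dependencies)
    seen: set[str] = set()
    queue = deque([urn])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:  # the visited set makes cycles terminate
                seen.add(dependent)
                queue.append(dependent)
    seen.discard(urn)  # in a cycle the start node reaches itself; leave it out
    return seen


# Shortened, hypothetical names instead of full URNs; note the cycle A -> B -> C -> A.
dependencies = {"A": {"B"}, "B": {"C"}, "C": {"A"}, "D": {"C"}}
print(sorted(impacted_by("C", dependencies)))  # ['A', 'B', 'D']
```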

Domain relationships beyond data structure (ownership, governance, responsibilities) are deliberately not part of the core model. The core is based exclusively on JSON Schema, JSON Schema references, the stored artifact references and the model graph that emerges from them.

Views over the same artifacts

Different consumers need different projections of the same model inventory. All views are generated on demand and never stored:

| View | Purpose |
| --- | --- |
| Persistence view | The individual stored artifact version (raw stored schema) |
| Registry view | Semantic view: references, dependencies, dependents, versions |
| Dereferenced view | Fully resolved single document for runtime, code generation and export |
| Bundled view | All transitive dependencies embedded under $defs; portable and self-contained |
| TypeScript / Zod | Generated client types (published as @civitasconnect/core-model-forge-types) |

The dereferenced and bundled views are described with examples in Model-Centric Data Flow.
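
As a rough illustration of how a bundled view can be generated on demand, the Python sketch below collects all transitive dependencies of a root schema and embeds them under $defs. It is an assumption-laden sketch, not the actual generator: bundle and external_refs are hypothetical names, the registry is an in-memory dict keyed by versioned URN, the $defs entries are keyed by URN, and each stored schema is assumed to carry its versioned URN as $id (as in the WeatherObservation example), so the original $ref values stay resolvable inside the bundle.

```python
import copy


def external_refs(node) -> set[str]:
    """Collect every $ref target that is not a local '#...' pointer."""
    refs: set[str] = set()
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and not ref.startswith("#"):
            refs.add(ref)
        for value in node.values():
            refs |= external_refs(value)
    elif isinstance(node, list):
        for item in node:
            refs |= external_refs(item)
    return refs


def bundle(root_urn: str, registry: dict[str, dict]) -> dict:
    """Return a copy of the root schema with all transitive dependencies in $defs."""
    root = copy.deepcopy(registry[root_urn])  # stored artifacts are never mutated
    defs = dict(root.get("$defs", {}))
    seen = {root_urn}
    pending = list(external_refs(root))
    while pending:
        urn = pending.pop()
        if urn in seen:  # already embedded, or a cycle back to the root
            continue
        seen.add(urn)
        dependency = copy.deepcopy(registry[urn])  # KeyError signals a dangling $ref
        defs[urn] = dependency
        pending.extend(external_refs(dependency))
    if defs:
        root["$defs"] = defs
    return root


weather = "urn:core:platform:civitas:datastructure:common:WeatherObservation:1.0.0"
geo = "urn:core:platform:civitas:datastructure:common:GeoPoint:1.0.0"
registry = {
    weather: {"$id": weather, "type": "object", "properties": {"location": {"$ref": geo}}},
    geo: {"$id": geo, "type": "object", "properties": {"lat": {"type": "number"}}},
}
print(list(bundle(weather, registry)["$defs"]))
```

Because the result is built from copies, it can be discarded after the response is sent, consistent with views never being stored.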

DataSets as composition objects

DataSets contain no embedded data structures — they reference registry objects by URN:

{
  "$schema": "https://civitasconnect.digital/core-dataset/v1",
  "id": "urn:core:platform:civitas:dataset:common:weather-import:1.0.0",
  "dataStructureRefs": [
    "urn:core:platform:civitas:datastructure:common:WeatherObservation:1.0.0",
    "urn:core:platform:civitas:datastructure:common:NgsiObservation:1.0.0"
  ],
  "mappingRefs": [
    "urn:core:platform:civitas:mapping:common:weather-to-ngsi:1.0.0"
  ],
  "pipelineRefs": [
    "urn:core:platform:civitas:pipeline:common:weather-import:1.0.0"
  ]
}

DataSets are domain composition objects; the actual models remain reusable registry artifacts. The full manifest format is specified in the CORE-IR Reference.
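
Since a DataSet holds only URNs, a consumer can check its references before use. The Python sketch below is one possible check, not part of the specified manifest handling: check_dataset_refs is a hypothetical name, and the position of the artifact type (the fifth ":"-separated segment) is inferred from the example URNs above rather than taken from the CORE-IR Reference.

```python
# Expected artifact type per reference field (inferred from the example manifest).
EXPECTED_TYPE = {
    "dataStructureRefs": "datastructure",
    "mappingRefs": "mapping",
    "pipelineRefs": "pipeline",
}


def check_dataset_refs(dataset: dict, known_urns: set[str]) -> list[str]:
    """Return a human-readable problem for every reference that cannot be used."""
    problems: list[str] = []
    for field, expected in EXPECTED_TYPE.items():
        for urn in dataset.get(field, []):
            segments = urn.split(":")
            # Assumed layout: urn:core:platform:civitas:<type>:<scope>:<name>:<version>
            if len(segments) < 5 or segments[4] != expected:
                problems.append(f"{field}: {urn} is not a {expected} URN")
            elif urn not in known_urns:
                problems.append(f"{field}: {urn} not found in registry")
    return problems


dataset = {
    "id": "urn:core:platform:civitas:dataset:common:weather-import:1.0.0",
    "dataStructureRefs": ["urn:core:platform:civitas:datastructure:common:WeatherObservation:1.0.0"],
    "mappingRefs": ["urn:core:platform:civitas:mapping:common:weather-to-ngsi:1.0.0"],
    "pipelineRefs": ["urn:core:platform:civitas:pipeline:common:weather-import:1.0.0"],
}
known = {
    "urn:core:platform:civitas:datastructure:common:WeatherObservation:1.0.0",
    "urn:core:platform:civitas:mapping:common:weather-to-ngsi:1.0.0",
}
for problem in check_dataset_refs(dataset, known):
    print(problem)
```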

Long-term vision

Model Management forms the central semantic knowledge base of the platform: all modelled objects (DataStructures, Mappings, Pipelines, DataSources, DataSinks, DataSets) have global identities and are managed as one coherent model graph. JSON Schema describes the structure, and the PostgreSQL registry stores artifacts, versions and references. Model Management interprets the semantic model graph and provides different views over the same body of knowledge, staying compatible with JSON Schema, OpenAPI, TypeScript and Zod without maintaining a second semantic model.