Data Architect
Role overview
Your mission: You define how data is structured, processed, and managed across the platform. Your goal is to establish standards, integrate data, and ensure that data can be consistently used and reused.
Why your role is vital: You are the "Architect". You create the foundation that enables all data-related work. Without your structures, standards, and integrations, Datasets cannot be created, processed, or reused.
Defining your scope: You work across the platform on Data structures, Data sources, Pipelines and APIs. You define standards and enable others to create, manage, and use data-related elements.
Your core responsibilities
- Define standards and guidelines: You define Data structures, metadata standards, and reusable patterns across the tenant.
- Manage data architecture: You create and maintain Data structures, Data sources, and pipelines across the platform.
- Enable data processing: You ensure data can be ingested, transformed, stored, and provided consistently.
- Support the organization: You work across domains and support other Roles in building and using data-related elements.
Outside your scope
It is not your responsibility to release Datasets or govern access policies.
If you are responsible for reviewing and releasing Datasets, the Data Owner or Data Gatekeeper role is likely a better fit for you.
Access and permissions
To work effectively with data-related elements, you need the correct permissions.
The access logic: Access is always the result of a User being assigned to a Group, and that Group being assigned a Role at a specific Scope.
Scopes: Roles apply either at Platform level or on a specific data-related element such as a Dataset, Data source, or Data structure.
To create and manage data-related elements, your Group typically needs a Data Role with create Permissions at Platform scope.
→ Deep Dive Authorization Model
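The access logic described above (User → Group → Role → Scope) can be illustrated with a small in-memory model. This is a sketch only, not the CIVITAS/CORE API: `RoleAssignment`, `has_permission`, the `PLATFORM` marker, and the example users, groups, and element identifiers are all hypothetical. It also assumes that a Role assigned at Platform scope applies to every data-related element, which the platform documentation should confirm.

```python
from dataclasses import dataclass

# Hypothetical model of the access logic; these names are illustrative
# and do not come from the CIVITAS/CORE API.
PLATFORM = "platform"  # assumed marker for Platform scope


@dataclass(frozen=True)
class RoleAssignment:
    group: str              # Group that receives the Role
    role: str               # for example, a Data Role
    scope: str              # PLATFORM or an element identifier
    permissions: frozenset  # Permissions the Role grants


def has_permission(user, permission, element, memberships, assignments):
    """Return True if any Group of the user holds a Role that grants
    `permission` at Platform scope or directly on `element`.

    Assumption: a Platform-scope Role covers all elements.
    """
    groups = memberships.get(user, set())
    return any(
        a.group in groups
        and permission in a.permissions
        and a.scope in (PLATFORM, element)
        for a in assignments
    )


# Example data (invented for illustration).
memberships = {
    "alice": {"data-architects"},
    "bob": {"mobility-stewards"},
}
assignments = [
    RoleAssignment("data-architects", "Data Role", PLATFORM,
                   frozenset({"create", "read", "update"})),
    RoleAssignment("mobility-stewards", "Data Role", "dataset:traffic",
                   frozenset({"read", "update"})),
]

alice_can_create = has_permission(
    "alice", "create", PLATFORM, memberships, assignments)
bob_can_create = has_permission(
    "bob", "create", PLATFORM, memberships, assignments)
```

In this sketch, a User never receives a Permission directly: removing "alice" from the "data-architects" Group is enough to revoke her create Permissions.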
Typical tasks
Your work in CIVITAS/CORE focuses on designing and managing data-related elements across the platform:
- Define and maintain Data structures: Create and version Data structures based on defined standards
- Register and configure Data sources: Connect systems and ensure data is ingested correctly
- Design pipelines: Model how data is ingested, transformed, and prepared for use
- Prepare Datasets for further use: Ensure Datasets are complete and ready for review by the Data Owner or Data Gatekeeper
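To make the relationship between these tasks concrete, the following sketch models a versioned Data structure and a minimal pipeline that ingests raw records, transforms them, and keeps only records that conform to the structure. It is an illustration under assumptions, not how CIVITAS/CORE implements pipelines: `DataStructure`, `transform`, `conforms`, `run_pipeline`, and the traffic-count example data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStructure:
    """Hypothetical stand-in for a versioned Data structure."""
    name: str
    version: int
    fields: dict  # field name -> expected Python type


def transform(record):
    """Example transformation: trim the station id and cast the count."""
    return {
        "station": record["station"].strip(),
        "count": int(record["count"]),
    }


def conforms(record, structure):
    """Check that a record has exactly the fields and types
    that the structure defines."""
    return record.keys() == structure.fields.keys() and all(
        isinstance(record[name], expected)
        for name, expected in structure.fields.items()
    )


def run_pipeline(raw_records, structure):
    """Ingest raw records, transform them, and keep conforming ones."""
    prepared = [transform(r) for r in raw_records]
    return [r for r in prepared if conforms(r, structure)]


# Invented example: raw values as they might arrive from a Data source.
traffic_v1 = DataStructure("traffic-count", 1, {"station": str, "count": int})
raw = [
    {"station": " A12 ", "count": "42"},
    {"station": "B07", "count": "17"},
]
dataset_rows = run_pipeline(raw, traffic_v1)
```

Because the structure is a separate, versioned object, the same `traffic_v1` definition could be passed to several pipelines, which mirrors the idea of reusing one Data structure across multiple Datasets.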
Your first steps
To start working as a Data Architect in CIVITAS/CORE:
- Check that your Group has a Data Role with create Permissions at Platform scope
- Explore existing Data structures and Data sources in the platform
- Create your first Data structure
- Create your first Data source and link it to the Data structure
- Create your first Dataset and start modeling your first pipeline
- Manage access permissions on a Dataset and collaborate with Data Stewards
Give Data Stewards access only to the specific Datasets, Data structures, and Data sources they work on.
This ensures:
- clear ownership
- controlled access
- better data quality
Best practices & avoiding mistakes
- Work with reusable Data structures: Design Data structures so they can be reused across multiple Datasets
- Keep pipelines clear and maintainable: Avoid overly complex pipeline logic. Keep transformations understandable
- Separate responsibilities: Focus on building and preparing data. Data Stewards handle the detailed configuration of Datasets, while Data Owners and Data Gatekeepers are responsible for release decisions
- Ensure proper access setup: Make sure your Group has the required platform-wide Data Role to create data-related elements
- Enable Data Stewards through access: Assign Data Roles to Groups on the relevant Datasets, Data sources, and Data structures so Data Stewards can work on them.
When working with pipelines, make sure access is assigned across all involved data-related elements.
If a Dataset uses a Data source or Data structure, the required Group + Role assignment must exist on each of them.
Otherwise, your collaborators may be unable to select Data sources or Data structures, or to configure or run pipelines.
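A quick way to reason about this rule is to list every element a Dataset depends on and check that the Group has an assignment on each one. The sketch below does this with plain Python data; `missing_assignments`, the `dependencies` mapping, and the element identifiers are hypothetical and only illustrate the check, not an actual CIVITAS/CORE feature.

```python
def missing_assignments(group, dataset, dependencies, assignments):
    """Return the elements on which `group` has no Role assignment.

    dependencies: maps a Dataset to the Data sources and Data
                  structures it uses (hypothetical representation).
    assignments:  set of (group, element) pairs that already exist.
    """
    required = [dataset, *dependencies.get(dataset, [])]
    return [e for e in required if (group, e) not in assignments]


# Invented example: the Dataset uses one Data source and one Data structure.
dependencies = {
    "dataset:traffic": [
        "source:traffic-sensors",
        "structure:traffic-count-v1",
    ],
}
assignments = {
    ("mobility-stewards", "dataset:traffic"),
    ("mobility-stewards", "source:traffic-sensors"),
}

missing = missing_assignments(
    "mobility-stewards", "dataset:traffic", dependencies, assignments)
# `missing` now lists the Data structure, which still needs a
# Group + Role assignment before the Group can work with the pipeline.
```

An empty result means the Group is assigned on the Dataset and on everything it uses.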
Key terms to know
To work effectively as a Data Architect, review these terms in our Glossary:
- Data structure: Defines the schema and structure of data
- Data source: Represents the origin of data, such as a connected system from which data is ingested
- Dataset: A structured collection of data prepared for use
- Pipeline: Defines how data is ingested, transformed, and provided
- Data Role: A Role that grants access to data-related elements
- Scope: Defines where a Role applies, either at Platform level or on a specific data-related element