# Backup & Restore
This document identifies what must be backed up in a Civitas Core V2 deployment and describes procedures for backup and restore.
## What Must Be Backed Up
| Data | Component | Storage Type | Criticality |
|---|---|---|---|
| PostgreSQL databases | CloudNativePG cluster | PersistentVolumeClaim | Critical — contains Keycloak realms/users, Portal data, Authz policies, FROST data |
| Kafka data | Strimzi Kafka brokers | PersistentVolumeClaim | High — contains event streams, Apicurio schema registry (KafkaSQL) |
| Etcd data | Etcd StatefulSet | PersistentVolumeClaim | High — contains APISIX route configurations |
| Kubernetes Secrets | secrets component | Kubernetes API (etcd) | Critical — contains generated passwords, TLS certs, database credentials |
| Deployment configuration | deployment/ directory | Local filesystem / Git | Critical — environment-specific Helmfile configuration |
Kubernetes Secrets are not stored in PersistentVolumes but in the cluster's etcd. They must be backed up separately, either via Velero or by exporting them manually.
## Recommended Tool: Velero
Velero is the recommended tool for backing up and restoring Kubernetes resources and persistent volumes. It supports scheduled backups, volume snapshots, and disaster recovery.
### Prerequisites
- Velero installed in the cluster (installation guide)
- A storage backend configured (S3-compatible object storage, Azure Blob, GCS, etc.)
- Volume snapshot support (CSI snapshotter or Velero's Restic/Kopia integration)
### Example Velero Configuration
Install Velero with an S3-compatible backend:
```shell
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.11.0 \
  --bucket civitas-backups \
  --secret-file ./credentials-velero \
  --backup-location-config region=eu-central-1,s3ForcePathStyle=true,s3Url=https://s3.example.com \
  --snapshot-location-config region=eu-central-1 \
  --use-node-agent \
  --default-volumes-to-fs-backup
```
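The `--secret-file` flag points at a credentials file in the AWS shared-credentials format, which the AWS plugin expects even for non-AWS S3-compatible backends. A minimal sketch with placeholder values:

```ini
[default]
aws_access_key_id=<your-access-key>
aws_secret_access_key=<your-secret-key>
```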
### Scheduled Backups
Create a daily backup schedule for the Civitas namespace:
```shell
velero schedule create civitas-daily \
  --schedule="0 2 * * *" \
  --include-namespaces <instanceSlug> \
  --ttl 720h \
  --default-volumes-to-fs-backup
```
This creates a backup every day at 02:00 UTC, retaining backups for 30 days.
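If you manage cluster resources declaratively (e.g. via GitOps), the same schedule can be expressed as a Velero `Schedule` custom resource instead of the CLI; a sketch, assuming Velero runs in the `velero` namespace:

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: civitas-daily
  namespace: velero
spec:
  schedule: "0 2 * * *"        # daily at 02:00 UTC
  template:
    includedNamespaces:
      - <instanceSlug>
    ttl: 720h0m0s              # retain for 30 days
    defaultVolumesToFsBackup: true
```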
### Manual Backup
```shell
velero backup create civitas-manual-$(date +%Y%m%d) \
  --include-namespaces <instanceSlug> \
  --default-volumes-to-fs-backup \
  --wait
```
Verify the backup:
```shell
velero backup describe civitas-manual-<date>
velero backup logs civitas-manual-<date>
```
## Component-Specific Backup Procedures
### PostgreSQL (CloudNativePG)
CloudNativePG supports continuous backup to object storage natively. This is the preferred method for PostgreSQL backups as it provides point-in-time recovery (PITR).
#### Enabling CNPG Backup
Configure backup in your environment's `postgres.yaml.gotmpl`:
```yaml
postgres:
  cluster:
    rawValues:
      cluster:
        backup:
          barmanObjectStore:
            destinationPath: "s3://civitas-pg-backups/"
            endpointURL: "https://s3.example.com"
            s3Credentials:
              accessKeyId:
                name: pg-backup-s3
                key: ACCESS_KEY_ID
              secretAccessKey:
                name: pg-backup-s3
                key: SECRET_ACCESS_KEY
          retentionPolicy: "30d"
```
Create the S3 credentials secret:
```shell
kubectl create secret generic pg-backup-s3 \
  --from-literal=ACCESS_KEY_ID='your-access-key' \
  --from-literal=SECRET_ACCESS_KEY='your-secret-key' \
  -n <instanceSlug>
```
#### Manual CNPG Backup
Trigger an on-demand backup:
```shell
kubectl apply -f - <<EOF
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: pg-backup-$(date +%Y%m%d)
  namespace: <instanceSlug>
spec:
  cluster:
    name: postgres-cluster
  method: barmanObjectStore
EOF
```
Check backup status:
```shell
kubectl get backup -n <instanceSlug>
```
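On-demand `Backup` resources can also be automated in-cluster: CNPG provides a `ScheduledBackup` resource that takes a six-field cron expression (seconds first). A sketch mirroring the daily 02:00 UTC Velero schedule:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: pg-backup-daily
  namespace: <instanceSlug>
spec:
  schedule: "0 0 2 * * *"    # six fields: at 02:00:00 every day
  backupOwnerReference: self
  cluster:
    name: postgres-cluster
```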
### Kafka (Strimzi)
Kafka data is stored in PersistentVolumeClaims. Backup options:
- **Velero volume snapshots (recommended):** back up the Kafka PVCs as part of the namespace-level Velero backup.
- **Topic-level backup:** use tools like MirrorMaker 2 for cross-cluster replication.
Kafka's log retention policy (`log.retention.ms: 2592000000`, i.e. 30 days by default) means older messages are automatically deleted. For long-term retention, configure topic-level retention or use an external sink.
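Topic-level retention can be set declaratively through Strimzi's `KafkaTopic` resource, overriding the broker default for individual topics. A sketch, where the topic name is illustrative and the `strimzi.io/cluster` label is assumed to match your Kafka cluster's name:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: audit-events            # hypothetical topic
  namespace: <instanceSlug>
  labels:
    strimzi.io/cluster: kafka   # must match the Kafka cluster resource name
spec:
  partitions: 3
  replicas: 3
  config:
    retention.ms: -1            # retain indefinitely; use with care
```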
### Kubernetes Secrets
Secrets are auto-generated by the secrets component on first deployment. If lost, they cannot be regenerated with the same values. Back them up explicitly:
```shell
# Export all secrets from the instance namespace
kubectl get secrets -n <instanceSlug> -o yaml > secrets-backup-$(date +%Y%m%d).yaml
```
Store secret backups encrypted and in a secure location. They contain passwords for all platform components.
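One way to keep the export encrypted at rest is symmetric encryption with OpenSSL before storing it anywhere off-cluster. A sketch using a stand-in file; substitute the real export from the previous step:

```shell
# Stand-in for the real export from the previous step
echo 'apiVersion: v1' > secrets-backup-demo.yaml

# Passphrase file; store it separately from the backups themselves
echo 'example-passphrase' > backup-passphrase.txt

# Encrypt with AES-256-CBC and a PBKDF2-derived key
openssl enc -aes-256-cbc -pbkdf2 -salt \
  -in secrets-backup-demo.yaml \
  -out secrets-backup-demo.yaml.enc \
  -pass file:./backup-passphrase.txt

# Decrypt when needed during a restore
openssl enc -d -aes-256-cbc -pbkdf2 \
  -in secrets-backup-demo.yaml.enc \
  -out secrets-backup-restored.yaml \
  -pass file:./backup-passphrase.txt
```

Any symmetric encryption tool works here; the point is that the plaintext export never leaves the operator's machine.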
Velero includes Kubernetes Secrets by default when backing up a namespace.
## Restore Procedures
### Full Restore with Velero
Order of operations:

1. Ensure the target cluster meets all prerequisites.
2. Install Velero with the same backend configuration as the source cluster.
3. Restore the namespace:

   ```shell
   velero restore create civitas-restore \
     --from-backup civitas-daily-<timestamp> \
     --include-namespaces <instanceSlug> \
     --wait
   ```

4. Verify the restore:

   ```shell
   # Check all pods are running
   kubectl get pods -n <instanceSlug>

   # Check Helm releases
   helm list -n <instanceSlug> -a

   # Check CNPG cluster status
   kubectl get cluster -n <instanceSlug>

   # Check Kafka cluster status
   kubectl get kafka -n <instanceSlug>
   ```

5. Run `helmfile apply` to reconcile any drift:

   ```shell
   helmfile -f deployment/helmfile.yaml apply -e <environment>
   ```
### PostgreSQL Restore (CNPG)
To restore PostgreSQL from a CNPG Barman backup, configure recovery in your environment:
```yaml
postgres:
  cluster:
    recovery:
      s3:
        accessKey: "your-access-key"
        secretKey: "your-secret-key"
    rawValues:
      cluster:
        bootstrap:
          recovery:
            source: postgres-cluster
            recoveryTarget:
              targetTime: "2025-01-15T10:00:00Z" # Point-in-time recovery
        externalClusters:
          - name: postgres-cluster
            barmanObjectStore:
              destinationPath: "s3://civitas-pg-backups/"
              endpointURL: "https://s3.example.com"
              s3Credentials:
                accessKeyId:
                  name: pg-backup-s3
                  key: ACCESS_KEY_ID
                secretAccessKey:
                  name: pg-backup-s3
                  key: SECRET_ACCESS_KEY
```
Then redeploy the PostgreSQL component:
```shell
helmfile -f deployment/helmfile.yaml apply -e <environment> --selector component=postgres
```
### Secrets-Only Restore
If you only need to restore secrets (e.g., after accidental deletion), apply the backup file created during the Kubernetes Secrets backup step:
```shell
kubectl apply -f secrets-backup-<date>.yaml
```
Then restart affected pods to pick up the restored secrets:
```shell
kubectl rollout restart deployment -n <instanceSlug>
```
## Success Criteria
A restore is considered successful when:
- All pods are in `Running`/`Completed` state
- `kubectl get cluster -n <instanceSlug>` shows the PostgreSQL cluster as `Cluster in healthy state`
- `kubectl get kafka -n <instanceSlug>` shows the Kafka cluster as `Ready`
- The Portal is accessible at `https://portal.<domain>` and login works
- The Keycloak admin console is accessible at `https://idm.<domain>/admin`
- Previously created data (users, entities, configurations) is present