Backup & Restore

This document identifies what must be backed up in a Civitas Core V2 deployment and describes procedures for backup and restore.

What Must Be Backed Up

| Data | Component | Storage Type | Criticality |
| --- | --- | --- | --- |
| PostgreSQL databases | CloudNativePG cluster | PersistentVolumeClaim | Critical — contains Keycloak realms/users, Portal data, Authz policies, FROST data |
| Kafka data | Strimzi Kafka brokers | PersistentVolumeClaim | High — contains event streams, Apicurio schema registry (KafkaSQL) |
| Etcd data | Etcd StatefulSet | PersistentVolumeClaim | High — contains APISIX route configurations |
| Kubernetes Secrets | secrets component | Kubernetes API (etcd) | Critical — contains generated passwords, TLS certs, database credentials |
| Deployment configuration | deployment/ directory | Local filesystem / Git | Critical — environment-specific Helmfile configuration |
Warning: Kubernetes Secrets are not stored in PersistentVolumes but in the cluster's etcd. They must be backed up separately, either via Velero or by exporting them manually.

Backup with Velero

Velero is the recommended tool for backing up and restoring Kubernetes resources and persistent volumes. It supports scheduled backups, volume snapshots, and disaster recovery.

Prerequisites

  • Velero installed in the cluster (installation guide)
  • A storage backend configured (S3-compatible object storage, Azure Blob, GCS, etc.)
  • Volume snapshot support (CSI snapshotter or Velero's Restic/Kopia integration)

Example Velero Configuration

Install Velero with an S3-compatible backend:

velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.11.0 \
--bucket civitas-backups \
--secret-file ./credentials-velero \
--backup-location-config region=eu-central-1,s3ForcePathStyle=true,s3Url=https://s3.example.com \
--snapshot-location-config region=eu-central-1 \
--use-node-agent \
--default-volumes-to-fs-backup

Scheduled Backups

Create a daily backup schedule for the Civitas namespace:

velero schedule create civitas-daily \
--schedule="0 2 * * *" \
--include-namespaces <instanceSlug> \
--ttl 720h \
--default-volumes-to-fs-backup

This creates a backup every day at 02:00 UTC, retaining backups for 30 days.
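If you manage Velero resources declaratively (e.g., via GitOps), the same schedule can be expressed as a Schedule custom resource instead of the CLI call. A sketch, assuming Velero is installed in the velero namespace:

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: civitas-daily
  namespace: velero
spec:
  schedule: "0 2 * * *"            # daily at 02:00 UTC
  template:
    includedNamespaces:
      - <instanceSlug>
    ttl: 720h0m0s                  # retain backups for 30 days
    defaultVolumesToFsBackup: true
```

Applying this manifest produces the same backups (named civitas-daily-<timestamp>) as the velero schedule create command above.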

Manual Backup

velero backup create civitas-manual-$(date +%Y%m%d) \
--include-namespaces <instanceSlug> \
--default-volumes-to-fs-backup \
--wait

Verify the backup:

velero backup describe civitas-manual-<date>
velero backup logs civitas-manual-<date>

Component-Specific Backup Procedures

PostgreSQL (CloudNativePG)

CloudNativePG supports continuous backup to object storage natively. This is the preferred method for PostgreSQL backups as it provides point-in-time recovery (PITR).

Enabling CNPG Backup

Configure backup in your environment's postgres.yaml.gotmpl:

postgres:
  cluster:
    rawValues:
      cluster:
        backup:
          barmanObjectStore:
            destinationPath: "s3://civitas-pg-backups/"
            endpointURL: "https://s3.example.com"
            s3Credentials:
              accessKeyId:
                name: pg-backup-s3
                key: ACCESS_KEY_ID
              secretAccessKey:
                name: pg-backup-s3
                key: SECRET_ACCESS_KEY
            retentionPolicy: "30d"

Create the S3 credentials secret:

kubectl create secret generic pg-backup-s3 \
--from-literal=ACCESS_KEY_ID='your-access-key' \
--from-literal=SECRET_ACCESS_KEY='your-secret-key' \
-n <instanceSlug>
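Continuous WAL archiving provides point-in-time recovery, but CNPG can additionally take recurring base backups via a ScheduledBackup resource. A sketch, assuming the cluster is named postgres-cluster:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: pg-nightly
  namespace: <instanceSlug>
spec:
  schedule: "0 0 2 * * *"      # six-field cron (seconds first): daily at 02:00
  backupOwnerReference: self
  cluster:
    name: postgres-cluster
```

Note that CNPG's schedule field uses a six-field cron expression with a leading seconds field, unlike Velero's standard five-field cron.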

Manual CNPG Backup

Trigger an on-demand backup:

kubectl apply -f - <<EOF
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: pg-backup-$(date +%Y%m%d)
  namespace: <instanceSlug>
spec:
  cluster:
    name: postgres-cluster
  method: barmanObjectStore
EOF

Check backup status:

kubectl get backup -n <instanceSlug>

Kafka (Strimzi)

Kafka data is stored in PersistentVolumeClaims. Backup options:

  1. Velero volume snapshots (recommended): Back up the Kafka PVCs as part of the namespace-level Velero backup.
  2. Topic-level backup: Use tools like MirrorMaker 2 for cross-cluster replication.
Info: Kafka's log retention policy (log.retention.ms: 2592000000 = 30 days by default) means older messages are automatically deleted. For long-term retention, configure topic-level retention or use an external sink.
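For topics that must outlive the broker-level retention window, retention can be overridden per topic through a Strimzi KafkaTopic resource. A sketch in which the topic name and the strimzi.io/cluster label value are placeholders to adapt to your deployment:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: important-events              # placeholder topic name
  namespace: <instanceSlug>
  labels:
    strimzi.io/cluster: kafka-cluster # must match your Kafka CR name
spec:
  partitions: 3
  replicas: 3
  config:
    retention.ms: -1                  # -1 = retain messages indefinitely
```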

Kubernetes Secrets

Secrets are auto-generated by the secrets component on first deployment. If lost, they cannot be regenerated with the same values. Back them up explicitly:

# Export all secrets from the instance namespace
kubectl get secrets -n <instanceSlug> -o yaml > secrets-backup-$(date +%Y%m%d).yaml
Warning: Store secret backups encrypted and in a secure location. They contain passwords for all platform components.
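One way to keep the export encrypted at rest is symmetric encryption with OpenSSL. A self-contained sketch (the filenames and passphrase file are illustrative; in practice prefer a dedicated tool such as SOPS or age, and never commit the passphrase alongside the backup):

```shell
# Create an example export and a passphrase file (illustrative only)
echo "example: manifest" > secrets-backup-example.yaml
echo "change-me" > backup-passphrase.txt

# Encrypt with AES-256-CBC and PBKDF2 key derivation
openssl enc -aes-256-cbc -pbkdf2 -salt \
  -in secrets-backup-example.yaml \
  -out secrets-backup-example.yaml.enc \
  -pass file:./backup-passphrase.txt

# Decrypt when the backup is needed (flags must match the encrypt step)
openssl enc -d -aes-256-cbc -pbkdf2 \
  -in secrets-backup-example.yaml.enc \
  -out secrets-backup-decrypted.yaml \
  -pass file:./backup-passphrase.txt
```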

Velero includes Kubernetes Secrets by default when backing up a namespace.

Restore Procedures

Full Restore with Velero

Order of operations:

  1. Ensure the target cluster meets all prerequisites
  2. Install Velero with the same backend configuration as the source cluster
  3. Restore the namespace:

     velero restore create civitas-restore \
     --from-backup civitas-daily-<timestamp> \
     --include-namespaces <instanceSlug> \
     --wait

  4. Verify the restore:

     # Check all pods are running
     kubectl get pods -n <instanceSlug>

     # Check Helm releases
     helm list -n <instanceSlug> -a

     # Check CNPG cluster status
     kubectl get cluster -n <instanceSlug>

     # Check Kafka cluster status
     kubectl get kafka -n <instanceSlug>

  5. Run helmfile apply to reconcile any drift:

     helmfile -f deployment/helmfile.yaml apply -e <environment>
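The restore in step 3 can also be expressed declaratively as a Restore custom resource, assuming Velero runs in the velero namespace:

```yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: civitas-restore
  namespace: velero
spec:
  backupName: civitas-daily-<timestamp>
  includedNamespaces:
    - <instanceSlug>
```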

PostgreSQL Restore (CNPG)

To restore PostgreSQL from a CNPG Barman backup, configure recovery in your environment:

postgres:
  cluster:
    recovery:
      s3:
        accessKey: "your-access-key"
        secretKey: "your-secret-key"
    rawValues:
      cluster:
        bootstrap:
          recovery:
            source: postgres-cluster
            recoveryTarget:
              targetTime: "2025-01-15T10:00:00Z" # Point-in-time recovery
        externalClusters:
          - name: postgres-cluster
            barmanObjectStore:
              destinationPath: "s3://civitas-pg-backups/"
              endpointURL: "https://s3.example.com"
              s3Credentials:
                accessKeyId:
                  name: pg-backup-s3
                  key: ACCESS_KEY_ID
                secretAccessKey:
                  name: pg-backup-s3
                  key: SECRET_ACCESS_KEY

Then redeploy the PostgreSQL component:

helmfile -f deployment/helmfile.yaml apply -e <environment> --selector component=postgres

Secrets-Only Restore

If you only need to restore secrets (e.g., after accidental deletion), apply the backup file created during the Kubernetes Secrets backup step:

kubectl apply -f secrets-backup-<date>.yaml

Then restart affected workloads so they pick up the restored secrets (operator-managed StatefulSets such as PostgreSQL and Kafka may need to be restarted through their respective operators):

kubectl rollout restart deployment -n <instanceSlug>

Success Criteria

A restore is considered successful when:

  • All pods are in Running / Completed state
  • kubectl get cluster -n <instanceSlug> reports the PostgreSQL cluster status as Cluster in healthy state
  • kubectl get kafka -n <instanceSlug> shows Kafka cluster as Ready
  • Portal is accessible at https://portal.<domain> and login works
  • Keycloak admin console is accessible at https://idm.<domain>/admin
  • Previously created data (users, entities, configurations) is present