Transactional Outbox Pattern
Problem
The Portal Backend manages entities (users, groups, roles) and must synchronize changes to external systems (e.g., Keycloak) via Configuration Adapters. The original synchronous approach caused several issues:
- The Portal Backend published a ConfigEvent to Kafka and waited for a ConfigResultEvent
- The database transaction remained open during the entire round-trip
- If the external system was slow or unavailable, the API request failed
This led to long-running transactions, tight runtime coupling to Kafka and the Config Adapter, and user-facing errors when downstream systems were temporarily unavailable.
Solution
The Transactional Outbox pattern (ADR 030) replaces the synchronous request-reply flow with an asynchronous, eventually consistent model.
Core Principle
Entity changes and their corresponding outbox events are persisted in a single database transaction. A separate publisher process reads the outbox table and publishes events to Kafka after the transaction has committed.
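The principle can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module purely for demonstration. The table layout, the `user.created` event type, and the `send` callback (a stand-in for a Kafka producer) are assumptions, not the Portal Backend's actual schema or API.

```python
import json
import sqlite3
import uuid


def create_schema(conn):
    # Hypothetical minimal schema, for illustration only.
    conn.executescript("""
        CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE outbox (
            id TEXT PRIMARY KEY,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            published_at TEXT  -- NULL means "not yet published"
        );
    """)


def create_user(conn, name):
    """Persist the entity and its outbox event in one transaction."""
    user_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back both inserts on error
        conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name))
        conn.execute(
            "INSERT INTO outbox (id, event_type, payload) VALUES (?, ?, ?)",
            (str(uuid.uuid4()), "user.created",
             json.dumps({"id": user_id, "name": name})),
        )
    return user_id


def publish_pending(conn, send):
    """Publisher step: runs separately, after the writing transaction committed."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox "
        "WHERE published_at IS NULL ORDER BY rowid"
    ).fetchall()
    for event_id, event_type, payload in rows:
        send(event_type, payload)  # stand-in for a Kafka producer call
        with conn:
            conn.execute(
                "UPDATE outbox SET published_at = CURRENT_TIMESTAMP WHERE id = ?",
                (event_id,),
            )
    return len(rows)
```

In this sketch an event is marked as published only after `send` returns. A crash between the two steps would cause the event to be sent again on the next run, so consumers should tolerate duplicates.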
Flow Comparison
Before: Synchronous (replaced)
After: Asynchronous Outbox
Synchronization State
Each synchronized entity carries an explicit state that shows how far its synchronization with the external system has progressed:
| State | Description | Action |
|---|---|---|
| NOT_SYNCED | Change persisted locally, synchronization pending | Automatic: outbox publisher sends event |
| SYNCED | Change successfully applied to external system | None |
| FAILED_RETRIABLE | Synchronization failed, will be retried | Automatic: retry with backoff |
| FAILED_PERMANENT | Synchronization failed, requires manual intervention | Operational: investigate and resolve |
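The states in the table could be modeled as an enumeration with guarded transitions. The state names come from the table. The `ALLOWED` transition map and the `transition` helper are hypothetical, because the set of permitted transitions is not specified here.

```python
from enum import Enum


class SyncState(Enum):
    NOT_SYNCED = "NOT_SYNCED"
    SYNCED = "SYNCED"
    FAILED_RETRIABLE = "FAILED_RETRIABLE"
    FAILED_PERMANENT = "FAILED_PERMANENT"


# Assumed transitions; the real rules may differ.
_OUTCOMES = {
    SyncState.SYNCED,
    SyncState.FAILED_RETRIABLE,
    SyncState.FAILED_PERMANENT,
}
ALLOWED = {
    SyncState.NOT_SYNCED: _OUTCOMES,
    SyncState.FAILED_RETRIABLE: _OUTCOMES,
    # Assumption: a new local change restarts the cycle.
    SyncState.SYNCED: {SyncState.NOT_SYNCED},
    # Assumption: manual resolution resets the entity for another attempt.
    SyncState.FAILED_PERMANENT: {SyncState.NOT_SYNCED},
}


def transition(current, target):
    """Return the target state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Rejecting unexpected transitions in one place makes it easier to notice a publisher or consumer that reports results out of order.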
Key Characteristics
What changes compared to the synchronous model:
- API requests complete immediately after local database commit
- Database transactions are short-lived and decoupled from external systems
- The Portal Backend is resilient to temporary Kafka or Config Adapter outages
- Synchronization with external systems is eventually consistent
What stays the same:
- Existing Kafka topics and event formats are preserved
- CloudEvents envelope structure is unchanged
- Config Adapter interface remains the same
Failure Handling
- Temporary failures (network timeouts, service unavailability) are retried automatically until synchronization succeeds or a retry limit is reached
- Permanent failures (invalid data, authorization errors) are marked explicitly and require operational attention
- The original entity change is never rolled back -- the local state is authoritative
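The failure rules above can be sketched as a small retry loop. The exception classes, the default limit of five attempts, and the exponential backoff parameters are assumptions chosen for illustration. The sketch also assumes that exhausting the retry limit results in `FAILED_PERMANENT`, which this section does not state.

```python
import time


class TemporaryFailure(Exception):
    """Hypothetical: network timeout, service unavailable."""


class PermanentFailure(Exception):
    """Hypothetical: invalid data, authorization error."""


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff in seconds: base, 2*base, 4*base, ... up to cap."""
    return min(cap, base * 2 ** attempt)


def synchronize(send, max_attempts=5, sleep=time.sleep):
    """Try to deliver one change and return the resulting sync state name."""
    for attempt in range(max_attempts):
        try:
            send()
            return "SYNCED"
        except PermanentFailure:
            # Retrying cannot help; flag for operational attention.
            return "FAILED_PERMANENT"
        except TemporaryFailure:
            # Wait before the next attempt, but not after the last one.
            if attempt + 1 < max_attempts:
                sleep(backoff_delay(attempt))
    # Assumption: an exhausted retry budget escalates to manual handling.
    return "FAILED_PERMANENT"
```

The function never touches the entity itself, which matches the rule that the local change is not rolled back: only the sync state reflects the outcome.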
Database Changes
The pattern requires two additions to the Portal Backend database:
- Sync state column on synchronized entities -- tracks NOT_SYNCED, SYNCED, FAILED_RETRIABLE, FAILED_PERMANENT
- Outbox table -- stores pending events for reliable publication to Kafka
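As a rough sketch, the two additions might look like the following migration, run here against SQLite through Python's `sqlite3` module. The `users` table, all column names, and the choice of `NOT_SYNCED` as the default for existing rows are assumptions. The actual Portal Backend schema and database engine are not described in this section.

```python
import sqlite3

# Hypothetical migration; table and column names are illustrative.
MIGRATION = """
ALTER TABLE users ADD COLUMN sync_state TEXT NOT NULL DEFAULT 'NOT_SYNCED'
    CHECK (sync_state IN
        ('NOT_SYNCED', 'SYNCED', 'FAILED_RETRIABLE', 'FAILED_PERMANENT'));

CREATE TABLE outbox (
    id TEXT PRIMARY KEY,
    aggregate_id TEXT NOT NULL,   -- id of the changed entity
    event_type TEXT NOT NULL,
    payload TEXT NOT NULL,        -- serialized event body
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    published_at TEXT             -- NULL while the event is pending
);
"""


def apply_migration(conn):
    """Add the sync state column and create the outbox table."""
    conn.executescript(MIGRATION)
```

The `CHECK` constraint keeps the column limited to the four documented states, so an invalid value is rejected by the database rather than by application code alone.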