Version: V2-Next

Transactional Outbox Pattern

Problem

The Portal Backend manages entities (users, groups, roles) and must synchronize changes to external systems (e.g., Keycloak) via Configuration Adapters. The original approach performed this synchronization inside the API request:

  1. The Portal Backend published a ConfigEvent to Kafka and waited for a ConfigResultEvent
  2. The database transaction remained open during the entire round-trip
  3. If the external system was slow or unavailable, the API request failed

This led to long-running transactions, tight runtime coupling to Kafka and the Config Adapter, and user-facing errors when downstream systems were temporarily unavailable.

Solution

The Transactional Outbox pattern (ADR 030) replaces the synchronous request-reply flow with an asynchronous, eventually consistent model.

Core Principle

Entity changes and their corresponding outbox events are persisted in a single database transaction. A separate publisher process reads the outbox table and publishes events to Kafka after the transaction has committed.
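The core principle can be sketched in a few lines. This is a minimal illustration using SQLite from the Python standard library, not the Portal Backend's actual implementation: the table layout, the column names, the `create_user` and `publish_pending` functions, and the `send` callback standing in for a Kafka producer are all assumptions made for the example.

```python
import json
import sqlite3
import uuid

# Hypothetical minimal schema; the real Portal Backend tables are not shown here.
SCHEMA = """
CREATE TABLE users (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    sync_state TEXT NOT NULL DEFAULT 'NOT_SYNCED'
);
CREATE TABLE outbox (
    id TEXT PRIMARY KEY,
    entity_id TEXT NOT NULL,
    payload TEXT NOT NULL,
    published_at TEXT
);
"""


def create_user(conn, name):
    """Persist the entity and its outbox event in one transaction."""
    user_id = str(uuid.uuid4())
    event = json.dumps({"type": "user.created", "id": user_id, "name": name})
    with conn:  # commits on success, rolls back both inserts on any exception
        conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name))
        conn.execute(
            "INSERT INTO outbox (id, entity_id, payload) VALUES (?, ?, ?)",
            (str(uuid.uuid4()), user_id, event),
        )
    return user_id


def publish_pending(conn, send):
    """Publisher step: hand committed, unpublished events to `send`."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published_at IS NULL ORDER BY rowid"
    ).fetchall()
    for event_id, payload in rows:
        send(payload)  # stand-in for a Kafka producer call (assumption)
        with conn:  # mark as published only after `send` returned
            conn.execute(
                "UPDATE outbox SET published_at = CURRENT_TIMESTAMP WHERE id = ?",
                (event_id,),
            )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
sent = []
create_user(conn, "alice")
published_count = publish_pending(conn, sent.append)
```

In this sketch, a row is marked as published only after `send` returns. If `send` raises, the row stays pending and is picked up on the next run, so an event may be delivered more than once and consumers should tolerate duplicates.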

Flow Comparison

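Before: Synchronous (replaced)

The API request publishes a ConfigEvent to Kafka and holds the database transaction open until the matching ConfigResultEvent arrives.

After: Asynchronous Outbox

The API request commits the entity change and its outbox event in one transaction and returns. The publisher sends the event to Kafka afterwards.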

Synchronization State

Each synchronized entity maintains an explicit state that reflects its progress in external system synchronization:

| State | Description | Action |
|---|---|---|
| NOT_SYNCED | Change persisted locally, synchronization pending | Automatic: outbox publisher sends event |
| SYNCED | Change successfully applied to external system | None |
| FAILED_RETRIABLE | Synchronization failed, will be retried | Automatic: retry with backoff |
| FAILED_PERMANENT | Synchronization failed, requires manual intervention | Operational: investigate and resolve |
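The states above could be modelled as an enumeration with a guarded transition function. The four state names come from the table. The transition rules in `ALLOWED_TRANSITIONS` are an assumption made for illustration, since this page does not specify which transitions are legal, and the helper names are hypothetical.

```python
from enum import Enum


class SyncState(Enum):
    NOT_SYNCED = "NOT_SYNCED"
    SYNCED = "SYNCED"
    FAILED_RETRIABLE = "FAILED_RETRIABLE"
    FAILED_PERMANENT = "FAILED_PERMANENT"


# States whose "Action" column is automatic: no operator is involved.
AUTOMATIC = {SyncState.NOT_SYNCED, SyncState.FAILED_RETRIABLE}

# ASSUMPTION: plausible transitions, not taken from the source document.
ALLOWED_TRANSITIONS = {
    SyncState.NOT_SYNCED: {
        SyncState.SYNCED,
        SyncState.FAILED_RETRIABLE,
        SyncState.FAILED_PERMANENT,
    },
    SyncState.FAILED_RETRIABLE: {
        SyncState.SYNCED,
        SyncState.FAILED_RETRIABLE,
        SyncState.FAILED_PERMANENT,
    },
    # After an operator resolves the cause, the change is queued again.
    SyncState.FAILED_PERMANENT: {SyncState.NOT_SYNCED},
    # A new local change to a synced entity starts the cycle over.
    SyncState.SYNCED: {SyncState.NOT_SYNCED},
}


def transition(current: SyncState, target: SyncState) -> SyncState:
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


def needs_operator(state: SyncState) -> bool:
    """Only FAILED_PERMANENT requires manual intervention."""
    return state is SyncState.FAILED_PERMANENT
```

Keeping the rules in one table makes it harder for a code path to write a state the rest of the system does not expect, for example moving an entity from SYNCED directly to FAILED_RETRIABLE.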

Key Characteristics

What changes compared to the synchronous model:

  • API requests complete immediately after local database commit
  • Database transactions are short-lived and decoupled from external systems
  • The Portal Backend is resilient to temporary Kafka or Config Adapter outages
  • Synchronization with external systems is eventually consistent

What stays the same:

  • Existing Kafka topics and event formats are preserved
  • CloudEvents envelope structure is unchanged
  • Config Adapter interface remains the same

Failure Handling

  • Temporary failures (network timeouts, service unavailability) are retried automatically until synchronization succeeds or a retry limit is reached
  • Permanent failures (invalid data, authorization errors) are marked explicitly and require operational attention
  • The original entity change is never rolled back -- the local state is authoritative
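The failure rules above might translate into a small decision function like the one below. The exception classes, `MAX_ATTEMPTS`, `BASE_DELAY_SECONDS`, and the exponential backoff formula are hypothetical values chosen for the example. Moving an entity to FAILED_PERMANENT once the retry limit is reached is also an assumption, since this page does not say what happens at that point.

```python
from typing import Optional, Tuple


class TemporaryFailure(Exception):
    """Network timeout, service unavailable, and similar transient errors."""


class PermanentFailure(Exception):
    """Invalid data, authorization errors, and similar non-retriable errors."""


MAX_ATTEMPTS = 5          # hypothetical retry limit
BASE_DELAY_SECONDS = 2.0  # hypothetical first backoff delay


def backoff_delay(attempt: int) -> float:
    """Exponential backoff: 2s, 4s, 8s, ... for attempt 1, 2, 3, ..."""
    return BASE_DELAY_SECONDS * 2 ** (attempt - 1)


def next_state(error: Exception, attempt: int) -> Tuple[str, Optional[float]]:
    """Map a failed attempt to (new sync state, seconds until retry or None).

    The local entity change is never touched here; only the sync state moves.
    """
    if isinstance(error, PermanentFailure):
        return ("FAILED_PERMANENT", None)
    if attempt >= MAX_ATTEMPTS:
        # ASSUMPTION: exhausted retries are escalated to operators.
        return ("FAILED_PERMANENT", None)
    return ("FAILED_RETRIABLE", backoff_delay(attempt))
```

For example, under these assumed values `next_state(TemporaryFailure(), 3)` yields `("FAILED_RETRIABLE", 8.0)`, while any `PermanentFailure` yields `("FAILED_PERMANENT", None)` regardless of the attempt number.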

Database Changes

The pattern requires two additions to the Portal Backend database:

  1. Sync state column on synchronized entities -- tracks NOT_SYNCED, SYNCED, FAILED_RETRIABLE, FAILED_PERMANENT
  2. Outbox table -- stores pending events for reliable publication to Kafka
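As a rough illustration of these two additions, the following sketch applies them to a pre-existing `users` table in an in-memory SQLite database. The column names, types, the CHECK constraint, and the choice of NOT_SYNCED as the default for existing rows are assumptions, not the actual Portal Backend migration.

```python
import sqlite3

# Hypothetical migration; the real schema and defaults are not given here.
MIGRATION = """
ALTER TABLE users ADD COLUMN sync_state TEXT NOT NULL DEFAULT 'NOT_SYNCED'
    CHECK (sync_state IN
        ('NOT_SYNCED', 'SYNCED', 'FAILED_RETRIABLE', 'FAILED_PERMANENT'));

CREATE TABLE outbox (
    id TEXT PRIMARY KEY,
    entity_id TEXT NOT NULL,
    payload TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    published_at TEXT  -- NULL until the publisher has sent the event
);
"""

conn = sqlite3.connect(":memory:")

# A table that exists before the migration, with one row already in it.
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES ('u1', 'alice')")
conn.commit()

conn.executescript(MIGRATION)

# Inspect the result: the new column exists and old rows received the default.
columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
existing_state = conn.execute(
    "SELECT sync_state FROM users WHERE id = 'u1'"
).fetchone()[0]
```

Which state pre-existing rows should receive is a deployment decision: defaulting to NOT_SYNCED, as assumed here, would cause every existing entity to be treated as pending.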