Theoretical Foundations
Module 11: Distributed Transactions & The Saga Pattern
PREREQUISITE STATEMENT: This module assumes familiarity with microservices architecture and bounded context boundaries from Module 8: Monoliths to Domain-Driven Microservices, and asynchronous event streams from Module 9: Event-Driven Architectures. If you have not completed those modules, do so before proceeding. The Saga Pattern is the dominant mechanism for managing distributed state integrity in production microservices systems.
Introduction: The Consistency Problem in Distributed Systems
A monolithic application handles business workflows with a simple, powerful guarantee: the ACID database transaction. When a user places an order, the application opens a single transaction, debits the inventory count, charges the payment instrument, and creates the order record — and if anything fails partway through, the entire operation is rolled back atomically. The database returns the system to its pre-operation state as if nothing happened.
This guarantee evaporates the moment you decompose the monolith into microservices.
In a microservices architecture following Domain-Driven Design principles, each service owns its data. The InventoryService manages its own PostgreSQL instance. The PaymentService manages its own separate database. The OrderService owns a third. These services communicate over a network. There is no shared transaction context across these three databases. No single BEGIN TRANSACTION / COMMIT statement can atomically span three independent database instances that are reachable only over a network.
This is not a theoretical concern or an edge case. It is the fundamental tension at the heart of distributed systems design: how do you maintain business-level consistency across multiple services, each with its own autonomous database, when any individual service call can fail at any point?
The Saga Pattern is the industry-standard answer.
Section 1: Why ACID Transactions Fail at Cloud Scale
A. The ACID Properties
Before examining why ACID fails in distributed contexts, it is important to be precise about what ACID guarantees in a single-node system:
- Atomicity: All operations within a transaction either succeed completely or fail completely. There is no partial success. If a COMMIT is issued and the power fails before it reaches disk, the database engine recovers to the pre-transaction state on restart.
- Consistency: A transaction brings the database from one valid state to another. All data integrity rules (foreign keys, check constraints, uniqueness constraints) are enforced at commit time.
- Isolation: Concurrent transactions executing simultaneously appear to each other as if they ran serially. One transaction cannot observe the intermediate state of another.
- Durability: Once a transaction commits, the data is permanently stored. It will survive process crashes, OS failures, and power outages.
These four properties, working together in a single-node relational database, provide an extraordinarily powerful consistency guarantee. Decades of enterprise software engineering have been built on them.
B. The Cross-Service Transaction Problem
Consider the following order placement workflow across three services:
[OrderService DB] [PaymentService DB] [InventoryService DB]
| | |
| 1. INSERT order | 2. CHARGE card | 3. RESERVE stock
| (status=PENDING) | ($99.00) | (qty -= 1)
| | |
If step 3 fails (item out of stock) after steps 1 and 2 have already committed to their respective databases, you are left with:
- An order record in PENDING state in the OrderService database.
- A $99.00 charge on the customer's credit card in the PaymentService database.
- No stock reservation in the InventoryService database.
The system is now in an inconsistent business state. The customer has been charged for an item that cannot be fulfilled. In a single-node ACID database, the ROLLBACK statement would have undone all three operations. In a distributed system, there is no ROLLBACK that spans service boundaries.
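The failure scenario above can be reproduced in a few lines. The following is a minimal sketch, not a model of any real service: the three arrays and objects are hypothetical in-memory stand-ins for the three service databases, and each function "commits" locally as soon as it returns.

```typescript
// Hypothetical in-memory stand-ins for the three independent service databases.
interface Order { id: string; status: "PENDING" }
interface Charge { orderId: string; amountCents: number }

const orderDb: Order[] = [];
const paymentDb: Charge[] = [];
const stockDb: Record<string, number> = { "sku-1": 0 }; // item is already out of stock

function insertOrder(id: string): void {
  orderDb.push({ id, status: "PENDING" }); // step 1 commits locally
}

function chargeCard(orderId: string, amountCents: number): void {
  paymentDb.push({ orderId, amountCents }); // step 2 commits locally
}

function reserveStock(sku: string): void {
  const available = stockDb[sku] || 0;
  if (available <= 0) throw new Error(`out of stock: ${sku}`);
  stockDb[sku] = available - 1; // step 3 would commit locally
}

function placeOrderNaively(orderId: string, sku: string): "ok" | "failed" {
  try {
    insertOrder(orderId);
    chargeCard(orderId, 9900);
    reserveStock(sku);
    return "ok";
  } catch (err) {
    // There is no cross-service ROLLBACK to call here:
    // the order row and the charge stay committed.
    return "failed";
  }
}

const outcome = placeOrderNaively("order-1", "sku-1");
console.log(outcome, orderDb.length, paymentDb.length);
```

After the call, the workflow reports failure while the order row and the charge both still exist, which is exactly the inconsistent business state described above.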
C. Two-Phase Commit (2PC) — The Classic Solution and Why It Fails
The database community developed a protocol called Two-Phase Commit (2PC) to solve exactly this problem. 2PC introduces a dedicated Coordinator node that orchestrates a distributed atomic commit across multiple participants.
Phase 1: The Prepare Phase
The coordinator sends a PREPARE message to all participating database nodes. Each participant must:
- Write the pending changes to a write-ahead log (WAL) on disk.
- Acquire all necessary locks on the rows being modified.
- Respond to the coordinator with VOTE_COMMIT (ready) or VOTE_ABORT (cannot proceed).
During this phase, every participant is in a prepared (in-doubt) state: it has durably logged its pending changes and holds its locks, but it has neither committed nor rolled back.
Phase 2: The Commit or Abort Phase
If the coordinator receives VOTE_COMMIT from all participants:
- It writes COMMIT to its own WAL.
- It sends COMMIT messages to all participants.
- Each participant commits its local transaction and releases locks.
If any participant votes VOTE_ABORT, or if a participant does not respond within a timeout:
- The coordinator sends ABORT to all participants.
- Each participant rolls back its pending changes and releases locks.
[Coordinator]
|
|--- PREPARE ---> [OrderService DB] --> VOTE_COMMIT
|--- PREPARE ---> [PaymentService DB] --> VOTE_COMMIT
|--- PREPARE ---> [InventoryService DB]--> VOTE_ABORT
|
|<-- All votes received (one ABORT)
|
|--- ABORT -----> [OrderService DB] --> Rollback
|--- ABORT -----> [PaymentService DB] --> Rollback
|--- ABORT -----> [InventoryService DB]--> Rollback
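The two phases in the diagram can be sketched as a small coordinator function. This is an illustrative, synchronous, in-memory sketch only: `makeParticipant`, `twoPhaseCommit`, and the `canPrepare` flag are hypothetical names, and the sketch deliberately omits the coordinator's own WAL, timeouts, and crash recovery.

```typescript
type Vote = "VOTE_COMMIT" | "VOTE_ABORT";

interface Participant {
  name: string;
  prepare(): Vote; // would write the WAL and take locks before voting
  commit(): void;
  abort(): void;
}

// Hypothetical in-memory participant; canPrepare simulates local success or failure.
function makeParticipant(name: string, canPrepare: boolean, protocolLog: string[]): Participant {
  return {
    name,
    prepare: (): Vote => {
      protocolLog.push(`${name}:PREPARE`);
      return canPrepare ? "VOTE_COMMIT" : "VOTE_ABORT";
    },
    commit: (): void => { protocolLog.push(`${name}:COMMIT`); },
    abort: (): void => { protocolLog.push(`${name}:ABORT`); },
  };
}

function twoPhaseCommit(participants: Participant[]): "COMMITTED" | "ABORTED" {
  // Phase 1: collect a vote from every participant.
  const votes = participants.map((p) => p.prepare());
  const allCommit = votes.every((v) => v === "VOTE_COMMIT");

  // Phase 2: broadcast the single global decision to everyone.
  participants.forEach((p) => {
    if (allCommit) {
      p.commit();
    } else {
      p.abort();
    }
  });
  return allCommit ? "COMMITTED" : "ABORTED";
}

const protocolLog: string[] = [];
const decision = twoPhaseCommit([
  makeParticipant("order", true, protocolLog),
  makeParticipant("payment", true, protocolLog),
  makeParticipant("inventory", false, protocolLog), // votes VOTE_ABORT
]);
console.log(decision, protocolLog);
```

A single VOTE_ABORT is enough to turn the global decision into ABORT for all three participants, mirroring the diagram.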
2PC Failure Modes
Despite its elegance, 2PC has three critical failure modes that render it unsuitable for high-scale cloud systems:
1. The Coordinator Failure (Blocking Protocol)
2PC is a blocking protocol: if the coordinator fails after sending PREPARE but before sending COMMIT or ABORT, all participants are stuck. They have:
- Written their changes to disk (cannot simply discard them without coordinator instruction).
- Acquired their row locks.
- No way to determine the coordinator's intent without coordinator recovery.
In this state, the participant databases hold locks indefinitely, blocking all other transactions that need those rows. The participants cannot proceed on their own: they cannot commit (which would violate atomicity if the coordinator intended to abort) and cannot abort (which would violate atomicity if the coordinator intended to commit). Every transaction that touches those rows stays blocked until the coordinator recovers.
2. Network Partitions During Phase 2
If a network partition occurs after the coordinator sends COMMIT to some participants but before others receive it, some databases commit the transaction while others remain prepared and keep holding their locks. The distributed system is inconsistent for as long as the partition lasts: some services reflect the new state, others do not, and the in-doubt participants cannot resolve the transaction until they can reach the coordinator again.
3. Availability Impact from Lock Holding
The prepare phase forces participant databases to hold locks on all modified rows until the commit/abort decision arrives. Under high transaction volume, this serializes access to hot rows. Mathematically, if the coordinator node has an availability of $A_c$, and there are $N$ participant nodes each with availability $A_p$, the availability of the entire 2PC transaction is:
$$A_{\text{2PC}} = A_c \times A_p^N$$
For a system with a coordinator at 99.9% uptime and three participants each at 99.9% uptime:
$$A_{\text{2PC}} = 0.999 \times 0.999^3 = 0.999^4 \approx 0.996 = 99.6\%$$
Each additional participant node multiplies the risk of a blocking failure. At cloud scale with dozens of microservices, 2PC reduces overall system availability to unacceptably low levels.
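To see how quickly the product shrinks, the formula can be evaluated directly. This is a small sketch; the function name `twoPcAvailability` and the 20-participant scenario are illustrative choices, not figures from a real system.

```typescript
// Availability of one 2PC transaction: the coordinator and all N participants must be up.
function twoPcAvailability(coordinator: number, participant: number, n: number): number {
  return coordinator * Math.pow(participant, n);
}

const threeParticipants = twoPcAvailability(0.999, 0.999, 3);   // 0.999^4, about 0.996
const twentyParticipants = twoPcAvailability(0.999, 0.999, 20); // 0.999^21, about 0.979

console.log(threeParticipants.toFixed(4), twentyParticipants.toFixed(4));
```

Under the same per-node assumption of 99.9%, going from three to twenty participants drops the transaction-level availability from roughly 99.6% to roughly 97.9%.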
The Fundamental Problem
2PC trades availability for atomicity. The CAP theorem (covered in Module 7: Distributed Computing Realities) tells us this trade-off is unavoidable. In cloud-scale systems where partition tolerance and availability are non-negotiable, 2PC is architecturally incompatible.
Section 2: The Saga Pattern — Definition and Origins
The Saga Pattern was first formally defined in a 1987 research paper by Hector Garcia-Molina and Kenneth Salem titled "Sagas," published in the proceedings of the ACM SIGMOD conference. The paper addressed long-lived transactions (LLTs) — database transactions that span seconds, minutes, or even hours — and the problem of holding database locks for the entire duration.
Their insight: a long-lived transaction can be broken down into a sequence of shorter, individual database transactions. If any individual transaction in the sequence fails, the system executes compensating transactions in reverse order to undo the previously completed steps. Critically, these compensating transactions are themselves short, local ACID transactions: ordinary database operations that hold locks only briefly and reverse the semantic effect of the original step.
A. Formal Definition
A Saga $S$ for a business transaction involving $N$ steps is defined as:
$$S = \{T_1, T_2, T_3, \ldots, T_N\}$$
Each $T_i$ is a local ACID transaction within a single service boundary. For every $T_i$, a compensating transaction $C_i$ must exist such that:
$$T_i \cdot C_i = \text{no-op}$$
The execution rules are:
- Success path: Execute $T_1, T_2, \ldots, T_N$ in sequence. If all complete, the Saga is done.
- Failure at step $k$: Execute $C_{k-1}, C_{k-2}, \ldots, C_1$ in reverse order to restore the prior consistent state.
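The system guarantees that it will eventually reach either the fully committed state (all $T_i$ completed) or the fully compensated state (every completed $T_i$ undone by its $C_i$), and that it never remains in a partial state permanently. Intermediate states are visible to other transactions while the saga runs, because a saga gives up the isolation that a single ACID transaction provides.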
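As a worked instance of these rules, take the three-step order workflow from Section 1 and assume the mapping $T_1$ = create order, $T_2$ = charge card, $T_3$ = reserve stock. A failure at $k = 3$ then produces the sequence:

$$T_1 \rightarrow T_2 \rightarrow T_3\ (\text{fails}) \rightarrow C_2 \rightarrow C_1$$

$C_3$ is not executed because $T_3$ never committed; its own local transaction already rolled back. The refund ($C_2$) runs before the order cancellation ($C_1$), which is the reverse of the order in which the forward steps completed.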
B. The Two Implementation Styles
Sagas can be implemented in two architecturally distinct ways:
| Property | Orchestration | Choreography |
|---|---|---|
| Coordination mechanism | Centralized orchestrator process | Decentralized event stream |
| State tracking | Orchestrator holds saga state | Reconstructed from event log |
| Coupling | Services coupled to orchestrator | Services coupled only to event schema |
| Visibility | Single process to inspect/debug | Requires distributed tracing |
| Single point of failure | Orchestrator must be HA | None — any service can fail independently |
| Business logic location | Concentrated in orchestrator | Distributed across services |
| Best suited for | Complex workflows with conditional paths | Simple linear chains of events |
Section 3: Orchestration Sagas
A. Architecture
In an orchestration-based Saga, a dedicated Saga Orchestrator service acts as the coordinator for the entire business workflow. The orchestrator knows the complete sequence of steps, the services responsible for each step, and the compensating actions to take on failure. Individual services expose command APIs; the orchestrator drives them in sequence.
[Saga Orchestrator]
|
+--------------------+--------------------+
| | |
[OrderService] [PaymentService] [InventoryService]
createOrder() chargeCard() reserveStock()
cancelOrder() refundCard() releaseStock()
The orchestrator maintains a saga state machine that tracks the current step, the outcome of each step, and the rollback path if needed.
B. Temporal.io and AWS Step Functions
Production orchestration sagas are implemented using dedicated workflow engines:
Temporal.io is an open-source workflow orchestration engine that provides durable execution of code: if the Temporal worker process crashes mid-workflow, the workflow resumes from the step where it left off once a worker is available again. Temporal records each workflow step in a persisted event history and rebuilds workflow state by replaying that history. Activities (steps) can still be retried and may therefore run more than once, which is why they should be idempotent.
AWS Step Functions provides a managed, serverless equivalent. Workflows are defined as JSON-based state machine definitions. Each state transitions to the next on success or to a compensating branch on failure. AWS handles durability, retries, and state persistence transparently.
C. TypeScript Implementation
The following example illustrates an orchestration-based Saga for an e-commerce order placement workflow. For clarity, service calls are abstracted as async functions.
```typescript
// Assumed to be defined elsewhere: OrderService, PaymentService, InventoryService,
// SagaStateRepository, and SagaFailureError.

// A saga step pairs a forward action with its compensating action.
interface SagaStep<TInput, TOutput> {
  execute(input: TInput): Promise<TOutput>;
  compensate(input: TInput, result: TOutput): Promise<void>;
}

interface OrderContext {
  orderId: string; // generated by the caller before the saga starts
  customerId: string;
  items: Array<{ productId: string; quantity: number; price: number }>;
  totalAmount: number;
  paymentTransactionId?: string;
  reservationId?: string;
}

// Step 1: Create Order
class CreateOrderStep implements SagaStep<OrderContext, { orderId: string }> {
  constructor(private orderService: OrderService) {}

  async execute(ctx: OrderContext) {
    const order = await this.orderService.createOrder({
      // Assumption: createOrder accepts a caller-supplied ID, so that later steps
      // (which key off ctx.orderId) refer to the same order.
      orderId: ctx.orderId,
      customerId: ctx.customerId,
      items: ctx.items,
      status: 'PENDING',
    });
    return { orderId: order.id };
  }

  async compensate(ctx: OrderContext, result: { orderId: string }) {
    await this.orderService.cancelOrder(result.orderId, 'SAGA_ROLLBACK');
  }
}

// Step 2: Charge Payment
class ChargePaymentStep implements SagaStep<OrderContext, { transactionId: string }> {
  constructor(private paymentService: PaymentService) {}

  async execute(ctx: OrderContext) {
    const tx = await this.paymentService.charge({
      customerId: ctx.customerId,
      amount: ctx.totalAmount,
      idempotencyKey: `order-${ctx.orderId}-charge`,
    });
    return { transactionId: tx.id };
  }

  async compensate(ctx: OrderContext, result: { transactionId: string }) {
    await this.paymentService.refund({
      transactionId: result.transactionId,
      reason: 'ORDER_CANCELLED',
      idempotencyKey: `order-${ctx.orderId}-refund`,
    });
  }
}

// Step 3: Reserve Inventory
class ReserveInventoryStep implements SagaStep<OrderContext, { reservationId: string }> {
  constructor(private inventoryService: InventoryService) {}

  async execute(ctx: OrderContext) {
    const reservation = await this.inventoryService.reserve({
      orderId: ctx.orderId,
      items: ctx.items,
      idempotencyKey: `order-${ctx.orderId}-reserve`,
    });
    return { reservationId: reservation.id };
  }

  async compensate(ctx: OrderContext, result: { reservationId: string }) {
    await this.inventoryService.release(result.reservationId);
  }
}

type CompletedStep = { step: SagaStep<OrderContext, any>; result: any };

// The Orchestrator
class OrderPlacementSaga {
  private steps: SagaStep<OrderContext, any>[];

  constructor(
    orderStep: CreateOrderStep,
    paymentStep: ChargePaymentStep,
    inventoryStep: ReserveInventoryStep,
    private sagaRepository: SagaStateRepository,
  ) {
    this.steps = [orderStep, paymentStep, inventoryStep];
  }

  async execute(ctx: OrderContext): Promise<void> {
    const sagaId = await this.sagaRepository.create(ctx.orderId, 'ORDER_PLACEMENT');
    // Tracked per execution, so one saga instance can safely process many orders.
    const completedSteps: CompletedStep[] = [];

    for (const step of this.steps) {
      try {
        const result = await step.execute(ctx);
        completedSteps.push({ step, result });
        await this.sagaRepository.recordStepSuccess(sagaId, step.constructor.name, result);
      } catch (error) {
        console.error(`Saga step failed: ${step.constructor.name}`, error);
        await this.sagaRepository.recordFailure(sagaId, step.constructor.name, error);
        await this.compensate(ctx, completedSteps);
        throw new SagaFailureError(`Order saga failed at ${step.constructor.name}`, error);
      }
    }
    await this.sagaRepository.markComplete(sagaId);
  }

  private async compensate(ctx: OrderContext, completedSteps: CompletedStep[]): Promise<void> {
    // Execute compensating transactions in reverse order of completion
    const stepsToCompensate = [...completedSteps].reverse();
    for (const { step, result } of stepsToCompensate) {
      try {
        await step.compensate(ctx, result);
      } catch (compensationError) {
        // Compensation failures are critical: alert the operations team,
        // then keep going so the remaining steps are still compensated.
        console.error(`CRITICAL: Compensation failed for ${step.constructor.name}`, compensationError);
        // In production: emit to dead-letter queue, page on-call engineer
      }
    }
  }
}
```
D. Saga State Persistence
A critical requirement for orchestration sagas is durable state storage. If the orchestrator process crashes mid-execution, it must be able to resume from the last successfully completed step rather than restarting the entire saga. The SagaStateRepository in the example above persists step results so that on restart, the orchestrator can reconstruct completedSteps from the database and continue without re-executing already-successful operations.
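The resume behavior described here can be sketched as follows. This is a simplified, synchronous illustration under stated assumptions: `PersistedSaga`, `ResumableStep`, and `resumeSaga` are hypothetical names, and pushing onto an in-memory array stands in for a durable write to the saga state store.

```typescript
interface PersistedSaga {
  sagaId: string;
  completedStepNames: string[]; // loaded from the saga state store after a restart
}

interface ResumableStep {
  name: string;
  run(): void;
}

// Skips every step already recorded as successful and runs only the remainder.
function resumeSaga(state: PersistedSaga, steps: ResumableStep[]): string[] {
  const executedNow: string[] = [];
  steps.forEach((step) => {
    if (state.completedStepNames.indexOf(step.name) !== -1) {
      return; // recorded as successful before the crash
    }
    step.run();
    state.completedStepNames.push(step.name); // stand-in for a durable write
    executedNow.push(step.name);
  });
  return executedNow;
}

const calls: string[] = [];
const sagaSteps: ResumableStep[] = ["CreateOrderStep", "ChargePaymentStep", "ReserveInventoryStep"].map(
  (stepName) => ({ name: stepName, run: (): void => { calls.push(stepName); } }),
);

// Simulated recovery: the first two steps were persisted before the orchestrator crashed.
const recovered: PersistedSaga = {
  sagaId: "saga-1",
  completedStepNames: ["CreateOrderStep", "ChargePaymentStep"],
};
const executedAfterRestart = resumeSaga(recovered, sagaSteps);
console.log(executedAfterRestart);
```

Note the remaining gap: if the process crashes after `run()` but before the step is recorded, the step runs again on restart. That is why the steps in the example above carry idempotency keys.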
Section 4: Choreography Sagas
A. Architecture
In a choreography-based Saga, there is no central orchestrator. Instead, services communicate exclusively through domain events published to a shared message broker (Kafka, RabbitMQ, AWS EventBridge). Each service listens for events it cares about, performs its local transaction, and emits new events for downstream services to consume.
[OrderService] [PaymentService] [InventoryService]
| | |
emit: OrderCreated | |
| | |
|--- OrderCreated event ---------->| |
| process payment |
| emit: PaymentProcessed |
| | |
| |--- PaymentProcessed -------->|
| | reserve stock
| | emit: StockReserved
| | |
|<---- StockReserved event --------|------------------------------|
update order | |
status: CONFIRMED | |
The saga's state is implicitly distributed across the event log. At any given moment, the system's position within the saga is represented by the most recently emitted event.
B. TypeScript Implementation
The following illustrates choreography across three services. Each service is an independent process.
```typescript
// Assumed to be defined elsewhere: OrderRepository, PaymentRepository, PaymentGateway,
// EventBus, PlaceOrderCommand, and generateId.

// Shared event types (published as schema registry entries)
interface DomainEvent {
  eventId: string;
  correlationId: string; // Tracks the saga across services
  occurredAt: Date;
  eventType: string;
}

interface OrderCreatedEvent extends DomainEvent {
  eventType: 'ORDER_CREATED';
  orderId: string;
  customerId: string;
  items: Array<{ productId: string; quantity: number; price: number }>;
  totalAmount: number;
}

interface PaymentProcessedEvent extends DomainEvent {
  eventType: 'PAYMENT_PROCESSED';
  orderId: string;
  transactionId: string;
  amount: number;
}

interface PaymentRefundedEvent extends DomainEvent {
  eventType: 'PAYMENT_REFUNDED';
  orderId: string;
  transactionId: string;
}

interface StockAllocationFailedEvent extends DomainEvent {
  eventType: 'STOCK_ALLOCATION_FAILED';
  orderId: string;
  reason: string;
  failedItems: string[];
}

// OrderService: initiates the saga
class OrderService {
  constructor(
    private orderRepository: OrderRepository,
    private eventBus: EventBus,
  ) {}

  async placeOrder(command: PlaceOrderCommand): Promise<string> {
    const order = await this.orderRepository.create({
      customerId: command.customerId,
      items: command.items,
      status: 'PENDING',
    });
    await this.eventBus.publish<OrderCreatedEvent>({
      eventId: generateId(),
      correlationId: order.id, // order ID serves as saga correlation ID
      occurredAt: new Date(),
      eventType: 'ORDER_CREATED',
      orderId: order.id,
      customerId: command.customerId,
      items: command.items,
      totalAmount: command.totalAmount,
    });
    return order.id;
  }

  // Compensating handler: listens for PaymentRefunded
  async onPaymentRefunded(event: PaymentRefundedEvent): Promise<void> {
    await this.orderRepository.updateStatus(event.orderId, 'CANCELLED');
  }
}

// PaymentService: listens for OrderCreated, emits PaymentProcessed or PaymentFailed
class PaymentService {
  constructor(
    private paymentGateway: PaymentGateway,
    private paymentRepository: PaymentRepository,
    private eventBus: EventBus,
  ) {}

  async onOrderCreated(event: OrderCreatedEvent): Promise<void> {
    // Idempotency check: prevent double-charging on event redelivery
    const existing = await this.paymentRepository.findByIdempotencyKey(
      `charge-${event.orderId}`,
    );
    if (existing) return;

    try {
      const transaction = await this.paymentGateway.charge({
        customerId: event.customerId,
        amount: event.totalAmount,
        idempotencyKey: `charge-${event.orderId}`,
      });
      await this.paymentRepository.save(transaction);
      await this.eventBus.publish<PaymentProcessedEvent>({
        eventId: generateId(),
        correlationId: event.correlationId,
        occurredAt: new Date(),
        eventType: 'PAYMENT_PROCESSED',
        orderId: event.orderId,
        transactionId: transaction.id,
        amount: event.totalAmount,
      });
    } catch (error) {
      // A caught value is not guaranteed to be an Error instance
      const reason = error instanceof Error ? error.message : String(error);
      await this.eventBus.publish({
        eventId: generateId(),
        correlationId: event.correlationId,
        occurredAt: new Date(),
        eventType: 'PAYMENT_FAILED',
        orderId: event.orderId,
        reason,
      });
    }
  }

  // Compensating handler: listens for StockAllocationFailed
  async onStockAllocationFailed(event: StockAllocationFailedEvent): Promise<void> {
    const payment = await this.paymentRepository.findByOrderId(event.orderId);
    if (!payment) return; // nothing was charged, so there is nothing to refund

    await this.paymentGateway.refund({
      transactionId: payment.transactionId,
      idempotencyKey: `refund-${event.orderId}`,
    });
    await this.eventBus.publish<PaymentRefundedEvent>({
      eventId: generateId(),
      correlationId: event.correlationId,
      occurredAt: new Date(),
      eventType: 'PAYMENT_REFUNDED',
      orderId: event.orderId,
      transactionId: payment.transactionId,
    });
  }
}
```
C. Event Subscription Wiring
The choreography pattern relies on the event broker correctly routing domain events to consuming services. In Kafka, this is achieved through topic subscriptions. Each service subscribes to one or more topics and processes incoming events through its handler pipeline.
Kafka Topic: order-events
-> Subscribers: PaymentService, NotificationService
Kafka Topic: payment-events
-> Subscribers: InventoryService, AnalyticsService
Kafka Topic: inventory-events
-> Subscribers: OrderService, WarehouseService, NotificationService
This topology means that adding a new downstream service (e.g., a fraud detection service that also needs to react to OrderCreated) requires zero changes to the OrderService — the new service simply subscribes to the order-events topic.
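The "zero changes to the publisher" property can be demonstrated with a toy broker. This is a sketch only: `InMemoryBroker` is a hypothetical in-process stand-in and does not reflect the Kafka client API; it exists purely to show that subscribing a new consumer leaves the publishing code untouched.

```typescript
interface BrokerEvent {
  eventType: string;
  orderId: string;
}
type Subscriber = (e: BrokerEvent) => void;

// Hypothetical in-process broker: routes each published event to every topic subscriber.
class InMemoryBroker {
  private subscribersByTopic: Record<string, Subscriber[]> = {};

  subscribe(topic: string, subscriber: Subscriber): void {
    const current = this.subscribersByTopic[topic] || [];
    this.subscribersByTopic[topic] = current.concat([subscriber]);
  }

  publish(topic: string, e: BrokerEvent): void {
    (this.subscribersByTopic[topic] || []).forEach((subscriber) => subscriber(e));
  }
}

const broker = new InMemoryBroker();
const paymentInbox: BrokerEvent[] = [];
const fraudInbox: BrokerEvent[] = [];

// Publisher side (OrderService): knows only the topic name, not who is listening.
function publishOrderCreated(orderId: string): void {
  broker.publish("order-events", { eventType: "ORDER_CREATED", orderId });
}

broker.subscribe("order-events", (e) => { paymentInbox.push(e); });
publishOrderCreated("order-1");

// Later, a fraud detection consumer is added; publishOrderCreated is not modified.
broker.subscribe("order-events", (e) => { fraudInbox.push(e); });
publishOrderCreated("order-2");

console.log(paymentInbox.length, fraudInbox.length);
```

The payment consumer sees both events, while the fraud consumer sees only the event published after it subscribed.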
Section 5: Designing Compensating Transactions
A. What Makes a Good Compensating Transaction
Not all compensating transactions are created equal. A well-designed compensating transaction has four properties:
1. Idempotency: The compensating transaction must be safe to execute multiple times. If a network failure causes the compensation message to be delivered twice, running the compensation a second time must have no additional effect. This is typically achieved through idempotency keys checked in the database before performing the compensating action.
2. Semantic Correctness: The compensation must reverse the business meaning of the original operation, not just its database mutation. Deleting a row that was inserted is a technical undo. Issuing a refund for a charge is a semantic undo. These are not always equivalent.
3. Independence from Original Transaction: The compensating transaction must not depend on the original transaction context being still open or available. It must be a standalone operation that succeeds or fails on its own.
4. Backward Compatibility: If the compensating transaction reads data written by the original transaction, it must handle the case where that data has since been modified by other operations.
B. Semantic Undo vs. Technical Undo
The distinction between semantic and technical undo is crucial for designing correct sagas.
| Original Operation | Technical Undo (Wrong) | Semantic Undo (Correct) |
|---|---|---|
| INSERT order record | DELETE order row | Update order status to CANCELLED, preserve audit record |
| Charge payment instrument | Delete payment record | Issue REFUND to customer's account |
| Send confirmation email | Cannot unsend email | Send ORDER_CANCELLED email to customer |
| Decrement inventory count | Increment inventory count | Issue stock reinstatement with reason code |
| Reserve a hotel room | Delete the reservation | Send cancellation notice, re-open date for booking |
Notice that "delete the payment record" is the technical inverse of "insert the payment record," but it is not the correct compensating transaction. The customer's bank has already processed the charge. Deleting the local record while the charge remains on the bank's side creates a financial discrepancy. The correct semantic compensation is a refund.
Similarly, you cannot un-send an email. Once a confirmation email has been delivered to the customer's inbox, it exists in their email client. The correct compensating transaction is to send a separate cancellation email.
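The first row of the table can be sketched in code to make the difference concrete. This is a minimal illustration over an in-memory array; `OrderRecord`, `technicalUndo`, and `semanticUndo` are hypothetical names chosen for the example.

```typescript
type OrderStatus = "PENDING" | "CONFIRMED" | "CANCELLED";

interface OrderRecord {
  id: string;
  status: OrderStatus;
  history: string[]; // audit trail of status changes
}

// Technical undo: the row disappears, and the audit trail disappears with it.
function technicalUndo(orders: OrderRecord[], orderId: string): OrderRecord[] {
  return orders.filter((o) => o.id !== orderId);
}

// Semantic undo: the order stays on record and is marked CANCELLED with a reason.
// Re-running it on an already cancelled order changes nothing (idempotent).
function semanticUndo(orders: OrderRecord[], orderId: string, reason: string): OrderRecord[] {
  return orders.map((o) =>
    o.id === orderId && o.status !== "CANCELLED"
      ? { ...o, status: "CANCELLED" as OrderStatus, history: o.history.concat(`CANCELLED:${reason}`) }
      : o,
  );
}

const orders: OrderRecord[] = [{ id: "order-1", status: "PENDING", history: ["PENDING"] }];

const afterDelete = technicalUndo(orders, "order-1");
const afterCancel = semanticUndo(orders, "order-1", "SAGA_ROLLBACK");
const afterSecondCancel = semanticUndo(afterCancel, "order-1", "SAGA_ROLLBACK");

console.log(afterDelete.length, afterCancel[0].history, afterSecondCancel[0].history.length);
```

The semantic version also satisfies the idempotency property from the previous subsection: applying it a second time leaves the audit trail unchanged.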
C. The Compensating Transaction Matrix
For every saga, architects must define a complete compensating transaction matrix before implementation begins:
| Step | Forward Transaction | Compensating Transaction | Idempotency Mechanism |
|---|---|---|---|
| 1 | createOrder(PENDING) | updateOrderStatus(CANCELLED) | Order ID uniqueness constraint |
| 2 | chargePayment($amount) | refundPayment(transactionId) | refund-{orderId} idempotency key |
| 3 | reserveInventory(items) | releaseReservation(reservationId) | Reservation ID idempotency |
| 4 | sendConfirmationEmail() | sendCancellationEmail() | Email delivery log deduplication |
| 5 | notifyWarehouse(orderId) | cancelWarehousePicklist(orderId) | Picklist ID idempotency key |
This matrix must be reviewed during architectural design review before any code is written.
Section 6: Idempotency in Sagas
A. Why Idempotency Is Non-Negotiable
In a choreography saga, messages are transmitted over a distributed message broker (Kafka, RabbitMQ, SQS). These brokers guarantee at-least-once delivery: every message will be delivered to its consumer, but it may be delivered more than once under network failure conditions.
This means every saga step handler must be idempotent — calling the handler twice with the same input must produce the same outcome as calling it once.
Without idempotency:
- A payment handler receives OrderCreated twice (due to broker retry after consumer crash).
- The handler charges the customer's credit card twice.
- The customer is double-charged.
B. Implementing Idempotent Handlers
The standard mechanism is an idempotency key stored in the database. Before executing the business operation, the handler checks whether a record with the same idempotency key already exists.
```typescript
// Assumed to be defined elsewhere: PaymentGateway, IdempotencyStore, EventBus,
// OrderCreatedEvent, and generateId.
class IdempotentPaymentHandler {
  constructor(
    private paymentGateway: PaymentGateway,
    private idempotencyStore: IdempotencyStore,
    private eventBus: EventBus,
  ) {}

  async handle(event: OrderCreatedEvent): Promise<void> {
    const idempotencyKey = `payment-charge-${event.orderId}`;

    // Fast path: check if we have already processed this event
    const existingResult = await this.idempotencyStore.get(idempotencyKey);
    if (existingResult) {
      console.log(`Duplicate event detected for key: ${idempotencyKey}. Skipping.`);
      return; // Safe to return: operation already completed
    }

    // Acquire a distributed lock (TTL in milliseconds) to prevent concurrent duplicate processing
    const lock = await this.idempotencyStore.acquireLock(idempotencyKey, 30_000);
    if (!lock) {
      throw new Error(`Could not acquire lock for ${idempotencyKey}. Another instance is processing.`);
    }

    try {
      // Re-check under the lock: another instance may have finished
      // between the first check and the lock acquisition.
      if (await this.idempotencyStore.get(idempotencyKey)) return;

      const transaction = await this.paymentGateway.charge({
        customerId: event.customerId,
        amount: event.totalAmount,
        externalReference: event.orderId,
      });

      // Store the result before releasing the lock
      await this.idempotencyStore.set(idempotencyKey, {
        transactionId: transaction.id,
        processedAt: new Date(),
        eventId: event.eventId,
      });

      await this.eventBus.publish({
        eventId: generateId(),
        correlationId: event.correlationId,
        occurredAt: new Date(),
        eventType: 'PAYMENT_PROCESSED',
        orderId: event.orderId,
        transactionId: transaction.id,
      });
    } finally {
      await this.idempotencyStore.releaseLock(idempotencyKey);
    }
  }
}
```
The idempotency store is typically implemented using a Redis SET NX (set if not exists) operation with a TTL, or a PostgreSQL INSERT ... ON CONFLICT DO NOTHING statement.
Section 7: Saga State Machine Diagrams
A. The Lifecycle State Machine
Every saga has a defined lifecycle. Visualizing this as a state machine is essential for implementation correctness and for generating the rollback playbook.
```mermaid
stateDiagram-v2
    [*] --> OrderPending : placeOrder()
    OrderPending --> PaymentProcessing : ORDER_CREATED emitted
    PaymentProcessing --> InventoryReserving : PAYMENT_PROCESSED emitted
    InventoryReserving --> OrderConfirmed : STOCK_RESERVED emitted
    OrderConfirmed --> [*] : Saga Complete
    PaymentProcessing --> CompensatingOrder : PAYMENT_FAILED
    InventoryReserving --> CompensatingPayment : STOCK_ALLOCATION_FAILED
    CompensatingPayment --> CompensatingOrder : PAYMENT_REFUNDED
    CompensatingOrder --> OrderCancelled : ORDER_CANCELLED_RECORD
    OrderCancelled --> [*] : Saga Rolled Back
```
B. Reading the State Machine
Every arc in this diagram corresponds to a domain event being published or consumed. The state machine makes the following guarantees explicit:
- The saga can only reach OrderConfirmed if all three forward steps succeeded.
- The saga can reach OrderCancelled from two failure paths (payment failure or inventory failure), but both paths terminate at the same final cancelled state.
- There is no path from OrderCancelled back to OrderPending; the saga is a one-way progression.
In production systems, this state machine is encoded in the orchestrator (for orchestration sagas) or reconstructed by querying the event log for all events with a given correlationId (for choreography sagas).
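One way to encode the diagram is a transition table plus a replay function. The sketch below assumes the state and event names from the diagram; `transitions`, `nextState`, and `replay` are hypothetical names. Replaying a list of events mirrors reconstructing a choreography saga's position from its event log.

```typescript
type SagaState =
  | "OrderPending" | "PaymentProcessing" | "InventoryReserving" | "OrderConfirmed"
  | "CompensatingPayment" | "CompensatingOrder" | "OrderCancelled";

type SagaEvent =
  | "ORDER_CREATED" | "PAYMENT_PROCESSED" | "STOCK_RESERVED" | "PAYMENT_FAILED"
  | "STOCK_ALLOCATION_FAILED" | "PAYMENT_REFUNDED" | "ORDER_CANCELLED_RECORD";

// One entry per arc in the diagram; terminal states have no outgoing arcs.
const transitions: Record<SagaState, Partial<Record<SagaEvent, SagaState>>> = {
  OrderPending: { ORDER_CREATED: "PaymentProcessing" },
  PaymentProcessing: { PAYMENT_PROCESSED: "InventoryReserving", PAYMENT_FAILED: "CompensatingOrder" },
  InventoryReserving: { STOCK_RESERVED: "OrderConfirmed", STOCK_ALLOCATION_FAILED: "CompensatingPayment" },
  CompensatingPayment: { PAYMENT_REFUNDED: "CompensatingOrder" },
  CompensatingOrder: { ORDER_CANCELLED_RECORD: "OrderCancelled" },
  OrderConfirmed: {},
  OrderCancelled: {},
};

function nextState(state: SagaState, sagaEvent: SagaEvent): SagaState {
  const target = transitions[state][sagaEvent];
  if (!target) {
    throw new Error(`Illegal transition: ${state} on ${sagaEvent}`);
  }
  return target;
}

// Folds an ordered list of events into the saga's current state.
function replay(sagaEvents: SagaEvent[]): SagaState {
  return sagaEvents.reduce<SagaState>((state, e) => nextState(state, e), "OrderPending");
}

const happyPath = replay(["ORDER_CREATED", "PAYMENT_PROCESSED", "STOCK_RESERVED"]);
const inventoryFailure = replay([
  "ORDER_CREATED", "PAYMENT_PROCESSED", "STOCK_ALLOCATION_FAILED",
  "PAYMENT_REFUNDED", "ORDER_CANCELLED_RECORD",
]);
console.log(happyPath, inventoryFailure);
```

Because terminal states have empty transition maps, any event arriving after OrderCancelled or OrderConfirmed is rejected, which makes the one-way progression explicit.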
Section 8: Forward Recovery vs. Backward Recovery
A. Two Recovery Strategies
When a saga step fails, there are two recovery strategies available. The correct choice depends on the nature of the failure.
Backward Recovery (Compensating Rollback)
Execute compensating transactions in reverse order for all previously completed steps. The saga ends in a fully-rolled-back state, semantically equivalent to the saga never having started. This is the most common strategy.
- When to use: The failure represents a genuine business constraint violation (e.g., insufficient inventory, declined credit card). The saga cannot logically proceed — the business rule says the order cannot be fulfilled.
Forward Recovery (Retry and Continue)
Retry the failed step, potentially with different parameters or after a waiting period. If the step eventually succeeds, the saga continues forward.
- When to use: The failure is transient (e.g., a temporary network partition, a downstream service momentarily overloaded). The operation is inherently retriable and will likely succeed on a subsequent attempt.
B. The Recovery Decision Matrix
| Failure Type | Failure Example | Recovery Strategy | Maximum Retry Attempts |
|---|---|---|---|
| Business constraint | Insufficient inventory | Backward (compensate) | 0 — fail immediately |
| Transient network | HTTP 503 Service Unavailable | Forward (retry with backoff) | 3–5 with exponential backoff |
| Transient resource | Database connection timeout | Forward (retry) | 3 with jitter |
| Data validation | Invalid card number | Backward (compensate) | 0 — fail immediately |
| Dependency outage | Payment gateway offline | Forward (retry, then backward) | 3 attempts, then compensate |
| Idempotency key conflict | Duplicate event delivery | No-op — already processed | Not applicable |
C. Exponential Backoff for Forward Recovery
When retrying a failed saga step, use exponential backoff with jitter to prevent thundering herd behavior (covered in depth in Module 0: Systems Thinking):
$$T_{\text{wait}} = \min(T_{\text{max}}, T_{\text{base}} \times 2^{\text{attempt}}) \times (0.5 + \text{rand}(0, 0.5))$$
For $T_{\text{base}} = 100\text{ms}$ and $T_{\text{max}} = 30\text{s}$:
| Attempt | Base Wait | With Jitter (range) |
|---|---|---|
| 1 | 200ms | 100ms – 200ms |
| 2 | 400ms | 200ms – 400ms |
| 3 | 800ms | 400ms – 800ms |
| 4 | 1,600ms | 800ms – 1,600ms |
| 5 | 3,200ms | 1,600ms – 3,200ms |
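The formula translates directly into a small helper. This is a sketch; `backoffDelayMs` is a hypothetical name, and the random source is injectable only so that the bounds of the jitter range can be checked deterministically.

```typescript
// T_wait = min(T_max, T_base * 2^attempt) * (0.5 + rand(0, 0.5))
function backoffDelayMs(
  attempt: number,
  baseMs: number,
  maxMs: number,
  rand: () => number = Math.random, // returns a value in [0, 1)
): number {
  const capped = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  const jitterFactor = 0.5 + rand() * 0.5; // in [0.5, 1.0)
  return capped * jitterFactor;
}

// Lower and upper bounds of the jitter range for attempt 1 (matches the first table row).
const lowest = backoffDelayMs(1, 100, 30_000, () => 0);
const highest = backoffDelayMs(1, 100, 30_000, () => 1);

// For large attempt numbers the cap takes over: the un-jittered wait never exceeds T_max.
const cappedLowest = backoffDelayMs(20, 100, 30_000, () => 0);

console.log(lowest, highest, cappedLowest);
```

With the default `Math.random` the upper bound is never quite reached, since the generator returns values strictly below 1.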
Section 9: Saga Anti-Patterns
A. Missing Compensating Transactions
The most dangerous anti-pattern is defining a saga step without a corresponding compensating transaction. If step $T_k$ has no compensation $C_k$, then any failure at step $T_{k+1}$ or later will leave the effects of $T_k$ permanently committed with no rollback path. This is equivalent to a resource leak: resources are consumed with no mechanism for release.
Detection: Every saga step in the implementation must be paired with a compensate() method. Add a static analysis rule or unit test that verifies the pairing.
B. Non-Idempotent Step Handlers
Implementing step handlers that are not idempotent is another common anti-pattern. Because most message brokers provide at-least-once delivery, the same message can arrive more than once, and a non-idempotent handler will apply its side effects again on each redelivery. This manifests as double charges, duplicate inventory reservations, or duplicate records.
Detection: Every step handler must begin with an idempotency key lookup. Automated testing must exercise the handler with duplicate inputs and assert that the second invocation produces no additional side effects.
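The lookup-first structure and the duplicate-input test can be sketched as follows. Everything here is an assumption made for illustration: the `PaymentHandler` class, the event fields, and the `saga_id:step` key format are hypothetical, and the in-memory dictionary stands in for a durable store. In a real service the key record and the side effect would need to be committed atomically, for example in the same database transaction.

```python
class PaymentHandler:
    def __init__(self) -> None:
        self.processed = {}  # idempotency key -> stored result
        self.charges = []    # side effects actually performed

    def handle_order_created(self, event: dict) -> str:
        # Hypothetical key format: "<saga_id>:<step>".
        key = f"{event['saga_id']}:{event['step']}"

        # 1. Idempotency lookup comes first.
        if key in self.processed:
            return self.processed[key]  # duplicate delivery: no new side effect

        # 2. Perform the side effect exactly once.
        self.charges.append((event["saga_id"], event["amount"]))
        result = f"charge-{len(self.charges)}"

        # 3. Record the result under the key for future duplicates.
        self.processed[key] = result
        return result


if __name__ == "__main__":
    handler = PaymentHandler()
    event = {"saga_id": "saga-42", "step": "charge", "amount": 1999}
    first = handler.handle_order_created(event)
    second = handler.handle_order_created(event)  # simulated redelivery
    assert first == second
    assert len(handler.charges) == 1
```

The second invocation returns the stored result and leaves `charges` unchanged, which is the assertion the detection rule asks for.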
C. Temporal Coupling in Choreography
In a choreography saga, services must not make assumptions about the timing of events. A service that expects PaymentProcessed to always arrive before StockReserved is making a temporal coupling assumption. Message brokers such as Kafka preserve ordering only within a single partition, not across partitions or topics. If a service consumes both PaymentProcessed and StockReserved, it must reach the same final state whichever of the two arrives first.
Detection: Test the choreography saga with messages delivered in all possible orderings and verify that the final state is consistent regardless of arrival order.
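An order-insensitive consumer and the all-orderings test can be sketched with `itertools.permutations`. The `OrderState` class and the rule "confirm once both events have been seen" are assumptions chosen for illustration; the point is that the handler records what it has seen rather than relying on arrival order.

```python
from itertools import permutations

# Events that must all have been observed before the order is confirmed.
REQUIRED = frozenset({"PaymentProcessed", "StockReserved"})


class OrderState:
    def __init__(self) -> None:
        self.seen = set()
        self.status = "PENDING"

    def on_event(self, name: str) -> None:
        # Record the event, then check the full set instead of the sequence.
        self.seen.add(name)
        if REQUIRED <= self.seen:
            self.status = "CONFIRMED"


def final_status(events) -> str:
    state = OrderState()
    for name in events:
        state.on_event(name)
    return state.status


def all_orderings_consistent(events) -> bool:
    """True if every delivery order of `events` ends in the same status."""
    outcomes = {final_status(order) for order in permutations(events)}
    return len(outcomes) == 1
```

Because the number of permutations grows factorially, exhaustive testing like this is only practical for the handful of events a single consumer handles.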
D. Saga Explosion (Too Many Steps)
If a saga has more than 7–10 steps, it is a signal that either:
- The bounded context decomposition is too fine-grained (services are too small).
- The saga is crossing multiple Domain-Driven Design bounded contexts in a way that violates strategic boundaries.
Long sagas have a correspondingly long list of compensating transactions, a higher probability of compensation failures, and more distributed state to track. Consider merging closely related services before extending a saga beyond the 7–10 step range.
Section 10: Real-World Case Studies
A. Uber's Trip Lifecycle as a Saga
Every Uber trip is a distributed saga spanning multiple microservices:
1. RiderService: Rider requests a trip → emits TripRequested
2. MatchingService: Matches rider to nearest driver → emits DriverMatched
3. PaymentService: Pre-authorizes payment instrument → emits PaymentPreAuthorized
4. NavigationService: Calculates route → emits RouteCalculated
5. TripService: Activates trip → emits TripStarted
6. PaymentService: Captures final payment at trip end → emits PaymentCaptured
7. RatingService: Opens rating window → emits RatingWindowOpened
If the driver cancels after step 3, the compensating transactions are:
- Release the payment pre-authorization.
- Remove the driver match.
- Update the trip status to DRIVER_CANCELLED.
- Emit DriverCancelledNotification to re-enter the matching queue.
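The compensations listed above undo the completed steps in reverse order. A generic way to express that is to walk the log of completed steps backwards and run each step's compensation. The sketch below is illustrative only and makes no claim about Uber's actual implementation: the function `compensate_completed` and the mapping of step events to compensation descriptions are assumptions.

```python
from typing import Callable, Dict, List


def compensate_completed(
    completed: List[str],
    compensations: Dict[str, Callable[[], str]],
) -> List[str]:
    """Run compensations for completed steps, most recent step first."""
    performed = []
    for step in reversed(completed):
        action = compensations.get(step)
        if action is not None:
            performed.append(action())
    return performed


# Hypothetical mapping from a completed step's event to its compensation.
COMPENSATIONS = {
    "PaymentPreAuthorized": lambda: "release payment pre-authorization",
    "DriverMatched": lambda: "remove driver match",
    "TripRequested": lambda: "set trip status to DRIVER_CANCELLED",
}

if __name__ == "__main__":
    # The driver cancels after step 3, so three steps have completed.
    done = ["TripRequested", "DriverMatched", "PaymentPreAuthorized"]
    for line in compensate_completed(done, COMPENSATIONS):
        print(line)
```

Steps that never ran (route calculation, trip activation, and so on) are absent from the completed log and therefore trigger no compensation.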
B. Airline Booking Systems
Airline booking sagas are among the oldest production implementations of distributed transaction compensation. A flight booking involves steps across multiple independent systems: the airline's seat inventory system, the Global Distribution System (GDS, such as Amadeus or Sabre), the payment processor, and the ticketing system.
Airlines pioneered the concept of "pending" booking states — the semantic equivalent of a saga in progress. A booking is not confirmed until all downstream steps have completed. If payment fails, the seat reservation is automatically released after a timeout (a time-based compensating transaction), preventing seats from being permanently locked by failed bookings.
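The pending-state-with-timeout idea can be sketched as a seat hold that expires unless it is confirmed in time. This is a simplified, single-process illustration: `SeatInventory`, `Hold`, and the 900-second default are hypothetical, and the current time is passed in explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass


@dataclass
class Hold:
    booking_id: str
    expires_at: float


class SeatInventory:
    def __init__(self) -> None:
        self.pending = {}    # seat -> Hold (saga in progress)
        self.confirmed = {}  # seat -> booking_id (saga completed)

    def release_expired(self, now: float) -> list:
        """Time-based compensation: drop holds whose deadline has passed."""
        expired = [s for s, h in self.pending.items() if h.expires_at <= now]
        for seat in expired:
            del self.pending[seat]
        return expired

    def hold(self, seat: str, booking_id: str, now: float, ttl: float = 900.0) -> bool:
        self.release_expired(now)
        if seat in self.pending or seat in self.confirmed:
            return False
        self.pending[seat] = Hold(booking_id, now + ttl)
        return True

    def confirm(self, seat: str, booking_id: str, now: float) -> bool:
        """Called once payment succeeds; fails if the hold has lapsed."""
        self.release_expired(now)
        hold = self.pending.get(seat)
        if hold is None or hold.booking_id != booking_id:
            return False
        del self.pending[seat]
        self.confirmed[seat] = booking_id
        return True
```

If payment for a booking never completes, no explicit rollback message is needed: the next call after the deadline releases the seat, and another booking can hold it.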
C. E-Commerce at Amazon Scale
Amazon's order management system processes millions of orders per day across warehouses on multiple continents. Each order traverses: inventory allocation (potentially across multiple fulfillment centers), payment authorization and capture, shipping carrier integration, tax calculation services, and loyalty points systems.
Amazon's internal infrastructure for managing these distributed workflows predates both AWS Step Functions and Temporal.io. Step Functions is a later AWS service, while Temporal grew out of Uber's Cadence project, whose creators had earlier worked on Amazon Simple Workflow Service. The core insight that drove their architecture is that compensations must be business-correct, not merely technically correct. Cancelling an order after a warehouse has already printed a pick list requires cancelling the pick list in the Warehouse Management System, not deleting database rows.
Section 11: Orchestration vs. Choreography — The Decision Framework
Neither pattern is universally superior. The correct choice depends on the complexity of the workflow, team topology, and operational observability requirements.
| Coupling Tolerance | Low Workflow Complexity | High Workflow Complexity |
|---|---|---|
| Low | Choreography (simple chain) | Choreography + Process Manager |
| High | Orchestration (clear owner) | Orchestration + Temporal.io / Step Functions |
Choose Choreography when:
- The workflow is a simple linear chain with no conditional branching.
- Services are owned by different teams with independent deployment schedules.
- You want maximum service autonomy and no shared orchestration dependency.
- Your event stream (Kafka) already serves as the integration backbone.
Choose Orchestration when:
- The workflow has conditional branches (e.g., if order value > $500, require additional fraud check).
- You need centralized visibility into saga state for operations/debugging.
- Business stakeholders need a visual workflow definition they can review and approve.
- The saga involves long-running waits (e.g., wait for warehouse confirmation before proceeding — potentially hours).
- Your organization's on-call engineers need to inspect and manually advance stuck sagas.
Section 12: Mermaid Architecture Challenge
Challenge Overview
The following challenge tests your ability to model a distributed choreography saga with both the success path and the compensating rollback path in a single sequence diagram.
Requirements
Model a choreography-based Saga for an e-commerce order placement. The saga spans three services: OrderService, PaymentService, and InventoryService, all communicating asynchronously via an EventBroker.
Success Flow:
1. OrderService creates an order record and emits OrderCreated to the EventBroker.
2. PaymentService subscribes to OrderCreated, processes the payment, and emits PaymentDeducted.
3. InventoryService subscribes to PaymentDeducted, reserves stock, and emits StockReserved.
4. OrderService subscribes to StockReserved and updates the order status to CONFIRMED.
Failure & Compensating Rollback Flow:
5. If InventoryService cannot reserve stock (out of stock), it emits StockAllocationFailed instead.
6. PaymentService subscribes to StockAllocationFailed, executes the refund compensating transaction, and emits PaymentRefunded.
7. OrderService subscribes to PaymentRefunded and updates the order record to CANCELLED.
Model both paths in a single sequenceDiagram block using alt / else / end to show the branching logic.
Reference Solution
sequenceDiagram
participant OS as OrderService
participant EB as EventBroker
participant PS as PaymentService
participant IS as InventoryService
OS->>OS: createOrder(status=PENDING)
OS->>EB: publish(OrderCreated)
EB->>PS: deliver(OrderCreated)
PS->>PS: chargeCard($amount)
PS->>EB: publish(PaymentDeducted)
EB->>IS: deliver(PaymentDeducted)
alt Stock Available
IS->>IS: decrementInventory(items)
IS->>EB: publish(StockReserved)
EB->>OS: deliver(StockReserved)
OS->>OS: updateOrder(status=CONFIRMED)
Note over OS,IS: Saga Complete — Success Path
else Out of Stock
IS->>EB: publish(StockAllocationFailed)
EB->>PS: deliver(StockAllocationFailed)
PS->>PS: refundCard(transactionId)
PS->>EB: publish(PaymentRefunded)
EB->>OS: deliver(PaymentRefunded)
OS->>OS: updateOrder(status=CANCELLED)
Note over OS,IS: Saga Complete — Compensated Rollback Path
end
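To check the diagram's logic, the same choreography can be simulated with an in-memory broker. This is a minimal sketch, assuming a synchronous, single-process `EventBroker` with hypothetical `subscribe`, `publish`, and `run` methods. It is not a real messaging client, and it omits idempotency and failure handling. Event names match the requirements above.

```python
from collections import defaultdict, deque


class EventBroker:
    """Toy in-memory broker: queues events and delivers them in FIFO order."""

    def __init__(self) -> None:
        self.subscribers = defaultdict(list)
        self.queue = deque()
        self.log = []  # event types in publish order

    def subscribe(self, event_type: str, handler) -> None:
        self.subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        self.log.append(event_type)
        self.queue.append((event_type, payload))

    def run(self) -> None:
        while self.queue:
            event_type, payload = self.queue.popleft()
            for handler in self.subscribers[event_type]:
                handler(payload)


def run_saga(stock_available: int, quantity: int):
    broker = EventBroker()
    order = {"status": "PENDING"}           # OrderService state
    stock = {"available": stock_available}  # InventoryService state

    def payment_on_order_created(p):        # PaymentService: charge
        broker.publish("PaymentDeducted", p)

    def inventory_on_payment_deducted(p):   # InventoryService: reserve or fail
        if stock["available"] >= p["quantity"]:
            stock["available"] -= p["quantity"]
            broker.publish("StockReserved", p)
        else:
            broker.publish("StockAllocationFailed", p)

    def payment_on_stock_failed(p):         # PaymentService: compensating refund
        broker.publish("PaymentRefunded", p)

    def order_on_stock_reserved(p):         # OrderService: success path
        order["status"] = "CONFIRMED"

    def order_on_payment_refunded(p):       # OrderService: rollback path
        order["status"] = "CANCELLED"

    broker.subscribe("OrderCreated", payment_on_order_created)
    broker.subscribe("PaymentDeducted", inventory_on_payment_deducted)
    broker.subscribe("StockAllocationFailed", payment_on_stock_failed)
    broker.subscribe("StockReserved", order_on_stock_reserved)
    broker.subscribe("PaymentRefunded", order_on_payment_refunded)

    broker.publish("OrderCreated", {"order_id": "o-1", "quantity": quantity})
    broker.run()
    return order["status"], broker.log
```

Calling `run_saga(5, 2)` follows the success branch and ends in `CONFIRMED`. Calling `run_saga(0, 1)` follows the `else` branch, publishing `StockAllocationFailed` and `PaymentRefunded` before the order ends in `CANCELLED`.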
Extension Challenges
1. Add Idempotency: Modify the sequence diagram to show each service performing a duplicate-detection check before processing each incoming event. Add a Note annotation showing the idempotency key format.
2. Add Notification Service: Extend the diagram to include a NotificationService that subscribes to both StockReserved (to send confirmation email) and PaymentRefunded (to send cancellation email). Observe how choreography allows adding this new participant without modifying any existing service.
3. Model the Compensating Transaction Failure: What happens if PaymentService fails to issue the refund during compensation? Add a CriticalCompensationFailure event path that routes to a DeadLetterQueue for manual operations team intervention. This represents a saga stuck state, an important operational scenario to design for explicitly.
Cross-References
- Module 7: Distributed Computing Realities & The CAP Theorem — foundational analysis of why distributed consistency is hard and the impossibility results that motivate sagas.
- Module 8: Monoliths to Domain-Driven Microservices — bounded context decomposition that determines where service boundaries are drawn and therefore where saga boundaries must be defined.
- Module 9: Event-Driven Architectures — Kafka, RabbitMQ, and the event streaming infrastructure that choreography sagas depend on. Consumer groups, partitioning, and idempotent consumers are all directly applicable to saga implementation.
- Module 12: CQRS & Event Sourcing — event sourcing provides a natural substrate for tracking saga state as an immutable event log. Process Managers in event-sourced systems often implement the orchestration saga pattern.
- Module 14: Fault Tolerance & Self-Healing Infrastructure — circuit breakers and retry policies interact directly with saga forward-recovery strategies. Understanding bulkhead isolation is essential when designing the failure containment boundaries of a multi-service saga.