SAGA Pattern in Microservices — A Complete Guide with Order Processing Example

The SAGA pattern is how you maintain data consistency across multiple microservices when a traditional database transaction (ACID, two-phase commit) is not an option. It breaks a single business transaction into a sequence of local transactions, each owned by one service. If any step fails, prior steps are reversed via compensating transactions.

This guide is the complete picture: why SAGA exists, the two implementation flavors, an order-processing walkthrough with real .NET code, the operational reality, and where this pattern fails.

Why SAGA exists

In a monolith, "place an order" looks like this:

BEGIN;
  INSERT INTO orders (...) VALUES (...);
  UPDATE inventory SET stock = stock - 1 WHERE sku = 'X';
  INSERT INTO payments (...) VALUES (...);
  UPDATE customers SET loyalty_points = loyalty_points + 10 WHERE id = ...;
COMMIT;

One database, one transaction. Atomic. Either all rows commit or none.

In microservices, every service owns its own database. Cross-service queries are forbidden. So "place an order" becomes:

Order Service        → INSERT INTO orders ...
Inventory Service    → UPDATE stock ...
Payment Service      → INSERT INTO payments ...
Customer Service     → UPDATE loyalty_points ...

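Four databases. No single transaction can span them. Two-phase commit (2PC) could coordinate them, but every participant holds its locks until the coordinator decides, and a coordinator failure can leave those participants blocked, which makes it too brittle at scale. SAGA replaces ACID guarantees with eventual consistency plus explicit compensation.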

Architecture in one diagram

                              SAGA — Order Processing
                              ───────────────────────

   ┌────────────┐ create   ┌────────────┐ reserve  ┌────────────┐ charge   ┌────────────┐
   │            │  order   │            │  stock   │            │  card    │            │
   │  Order     │─────────▶│ Inventory  │─────────▶│  Payment   │─────────▶│  Shipping  │
   │  Service   │          │  Service   │          │  Service   │          │  Service   │
   │            │          │            │          │            │          │            │
   └────────────┘          └────────────┘          └────────────┘          └────────────┘
         │                       │                       │                       │
         │                       │                       │                       │
         ▼ on failure            ▼ on failure            ▼ on failure            ▼ ✓ done
   ┌────────────┐          ┌────────────┐          ┌────────────┐
   │  cancel    │          │  release   │          │  refund    │     COMPENSATING
   │  order     │◀─────────│  stock     │◀─────────│  payment   │     TRANSACTIONS
   │            │          │            │          │            │
   └────────────┘          └────────────┘          └────────────┘
                                                                              
   forward path (left → right): each service does its local work, then triggers the next.
   reverse path (right → left): if any step fails, prior steps are undone in reverse order.

Three things to internalize:

Each step is a local ACID transaction in one database. No distributed lock.
Each forward step has a defined compensating step. "Reserved 1 unit" has a compensation: "release 1 unit".
Compensations run in reverse order. If payment fails, you release stock THEN cancel the order — same order the forward path took, reversed.
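
The three points above can be sketched as a tiny generic runner. This is a minimal illustration, not the orchestrator shown later in this guide: the names SagaStep and SagaRunner are hypothetical, and each step is assumed to be a local transaction that either commits fully or throws.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// One forward action paired with the action that semantically undoes it.
// (Hypothetical type, introduced only for this sketch.)
public sealed record SagaStep(string Name, Func<Task> Execute, Func<Task> Compensate);

public static class SagaRunner
{
    // Runs the steps in order. On the first failure, compensates the steps
    // that already completed, newest first, and returns false.
    public static async Task<bool> RunAsync(IReadOnlyList<SagaStep> steps)
    {
        var completed = new Stack<SagaStep>();
        foreach (var step in steps)
        {
            try
            {
                await step.Execute();
                completed.Push(step);
            }
            catch (Exception)
            {
                // The failed step is assumed to have committed nothing,
                // so only the earlier steps are undone.
                while (completed.Count > 0)
                    await completed.Pop().Compensate();
                return false;
            }
        }
        return true;
    }
}
```

The stack is what gives the reverse order: the last step that succeeded is the first one compensated. A production runner would also persist progress and retry failed compensations, which this sketch leaves out.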

The two flavors — Choreography vs Orchestration

Choreography (event-driven, decentralized)

Each service publishes events. Other services listen and react.

Order Service                    Event Bus                   Inventory Service
     │                                │                              │
     │ ─OrderPlaced───────────────────▶                              │
     │                                │ ──OrderPlaced──────────────▶ │
     │                                │                              │ reserves stock
     │                                │ ◀──StockReserved──────────── │
     │                                │ ──StockReserved────────────▶ │ (to Payment)
     │                                │

No central coordinator. Every service knows its part. Loose coupling, but the workflow is invisible — to understand "place an order" end-to-end you have to read N service codebases.
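
To make the event flow concrete, here is a minimal sketch of the Inventory Service's part in a choreographed saga. Everything in it is an assumption made for illustration: InMemoryEventBus stands in for a real message broker, and the event shapes (OrderPlaced with a single SKU, StockReservationFailed) are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public record OrderPlaced(Guid OrderId, string Sku, int Quantity);
public record StockReserved(Guid OrderId);
public record StockReservationFailed(Guid OrderId, string Reason);

// Hypothetical in-process bus; a real system would use a message broker.
public sealed class InMemoryEventBus
{
    private readonly Dictionary<Type, List<Action<object>>> _handlers = new();

    public void Subscribe<T>(Action<T> handler)
    {
        if (!_handlers.TryGetValue(typeof(T), out var list))
            _handlers[typeof(T)] = list = new List<Action<object>>();
        list.Add(e => handler((T)e));
    }

    public void Publish<T>(T evt) where T : notnull
    {
        if (_handlers.TryGetValue(typeof(T), out var list))
            foreach (var handler in list.ToArray())
                handler(evt);
    }
}

// The Inventory Service knows only its own part:
// react to OrderPlaced, then publish the outcome.
public sealed class InventoryHandler
{
    private readonly InMemoryEventBus _bus;
    private readonly Dictionary<string, int> _stock;

    public InventoryHandler(InMemoryEventBus bus, Dictionary<string, int> stock)
    {
        _bus = bus;
        _stock = stock;
        bus.Subscribe<OrderPlaced>(OnOrderPlaced);
    }

    private void OnOrderPlaced(OrderPlaced e)
    {
        if (_stock.TryGetValue(e.Sku, out var available) && available >= e.Quantity)
        {
            _stock[e.Sku] = available - e.Quantity;
            _bus.Publish(new StockReserved(e.OrderId));
        }
        else
        {
            // In choreography the failure is an event too; the Order Service
            // would subscribe to it and cancel the order.
            _bus.Publish(new StockReservationFailed(e.OrderId, "stock_unavailable"));
        }
    }
}
```

Nothing in InventoryHandler mentions payment or shipping. That is the loose coupling described above, and also the reason the end-to-end workflow is hard to see from any single codebase.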

Orchestration (central coordinator)

A dedicated SAGA orchestrator (often a state machine) sends commands and tracks state.

                ┌───────────────────┐
                │  Order Saga       │
                │  Orchestrator     │
                │  (state machine)  │
                └─┬───────┬───┬─────┘
                  │       │   │
       reserve───▶│       │   │◀───stockReserved
       stock      │       │   │
                  ▼       ▼   ▼
            ┌────────┐ ┌──────┐ ┌─────────┐
            │ Order  │ │ Inv. │ │ Payment │
            └────────┘ └──────┘ └─────────┘

One service is the source of truth for the workflow. Easier to reason about, easier to monitor. But it becomes a single point of complexity.

Which to pick

Decision factor	Choreography	Orchestration
Workflow visibility	Low — distributed	High — single state machine
Coupling	Loose	Tighter (orchestrator knows everyone)
Best when	2-3 steps, stable workflow	4+ steps, evolving workflow
Failure tracing	Hard (multi-service logs)	Easy (one log timeline)
Adding a new step	New event subscription per service	Edit the orchestrator state machine

Rule of thumb: start with Choreography for simple workflows. As complexity grows past ~3 steps, migrate to Orchestration.

Order processing — full code (Orchestration flavor, .NET)

Step 1 — Define the saga state and events

public enum OrderSagaState
{
    Started,
    StockReserved,
    PaymentCharged,
    ShippingScheduled,
    Completed,
    Failed,
    Compensating,
    Compensated
}

public record OrderSaga(
    Guid SagaId,
    Guid OrderId,
    Guid CustomerId,
    List<LineItem> Items,
    decimal Total,
    OrderSagaState State,
    DateTimeOffset StartedAt
);
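
One way to keep the state machine honest is to make the legal transitions explicit and reject everything else. The sketch below is an assumption layered on the enum above, not part of the original design: the transition table is inferred from the order-processing flow and the class name OrderSagaTransitions is hypothetical. The enum is repeated only so the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Repeated from the definition above so this sketch is self-contained.
public enum OrderSagaState
{
    Started,
    StockReserved,
    PaymentCharged,
    ShippingScheduled,
    Completed,
    Failed,
    Compensating,
    Compensated
}

public static class OrderSagaTransitions
{
    // Assumed transition table; adjust it to the real workflow.
    // Terminal states (Completed, Failed, Compensated) have no entry.
    private static readonly Dictionary<OrderSagaState, OrderSagaState[]> Allowed = new()
    {
        [OrderSagaState.Started] = new[] { OrderSagaState.StockReserved, OrderSagaState.Failed },
        [OrderSagaState.StockReserved] = new[] { OrderSagaState.PaymentCharged, OrderSagaState.Compensating },
        [OrderSagaState.PaymentCharged] = new[] { OrderSagaState.ShippingScheduled, OrderSagaState.Completed, OrderSagaState.Compensating },
        [OrderSagaState.ShippingScheduled] = new[] { OrderSagaState.Completed, OrderSagaState.Compensating },
        [OrderSagaState.Compensating] = new[] { OrderSagaState.Compensated },
    };

    public static bool CanTransition(OrderSagaState from, OrderSagaState to) =>
        Allowed.TryGetValue(from, out var next) && Array.IndexOf(next, to) >= 0;

    // Returns the new state, or throws if the move is not in the table.
    public static OrderSagaState Transition(OrderSagaState from, OrderSagaState to) =>
        CanTransition(from, to)
            ? to
            : throw new InvalidOperationException($"Illegal saga transition {from} -> {to}");
}
```

Routing every state change through Transition turns a forgotten or out-of-order step into an immediate exception instead of a silently corrupted saga record.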

Step 2 — The orchestrator

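public class OrderSagaOrchestrator(
    ISagaRepository repo,
    IInventoryClient inventory,
    IPaymentClient payments,
    IShippingClient shipping,
    IOrderClient orders,
    ILogger<OrderSagaOrchestrator> log)
{
    // Dependencies are injected through the primary constructor (C# 12).
    private readonly ISagaRepository _repo = repo;
    private readonly IInventoryClient _inventory = inventory;
    private readonly IPaymentClient _payments = payments;
    private readonly IShippingClient _shipping = shipping;
    private readonly IOrderClient _orders = orders;
    private readonly ILogger<OrderSagaOrchestrator> _log = log;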

    public async Task<OrderSaga> StartAsync(PlaceOrderCommand cmd, CancellationToken ct)
    {
        var saga = new OrderSaga(
            SagaId: Guid.NewGuid(),
            OrderId: Guid.NewGuid(),
            CustomerId: cmd.CustomerId,
            Items: cmd.Items,
            Total: cmd.Items.Sum(i => i.Price * i.Quantity),
            State: OrderSagaState.Started,
            StartedAt: DateTimeOffset.UtcNow);

        await _repo.SaveAsync(saga, ct);
        return await StepReserveStockAsync(saga, ct);
    }

    private async Task<OrderSaga> StepReserveStockAsync(OrderSaga saga, CancellationToken ct)
    {
        try
        {
            await _inventory.ReserveAsync(saga.OrderId, saga.Items, ct);
        }
        catch (Exception ex)
        {
            _log.LogWarning(ex, "stock reservation failed for saga {SagaId}", saga.SagaId);
            return await FailAsync(saga, "stock_unavailable", ct);
        }

        // Continue outside the try block: an exception from a later step (or from
        // its compensation) must not be handled a second time by this catch.
        saga = saga with { State = OrderSagaState.StockReserved };
        await _repo.SaveAsync(saga, ct);
        return await StepChargePaymentAsync(saga, ct);
    }

    private async Task<OrderSaga> StepChargePaymentAsync(OrderSaga saga, CancellationToken ct)
    {
        try
        {
            await _payments.ChargeAsync(saga.OrderId, saga.CustomerId, saga.Total, ct);
        }
        catch (Exception ex)
        {
            _log.LogWarning(ex, "payment failed for saga {SagaId}", saga.SagaId);
            return await CompensateAsync(saga, "payment_declined", ct);
        }

        // Same reasoning as above: the next step runs outside this try block.
        saga = saga with { State = OrderSagaState.PaymentCharged };
        await _repo.SaveAsync(saga, ct);
        return await StepScheduleShippingAsync(saga, ct);
    }

    private async Task<OrderSaga> StepScheduleShippingAsync(OrderSaga saga, CancellationToken ct)
    {
        try
        {
            await _shipping.ScheduleAsync(saga.OrderId, ct);
            saga = saga with { State = OrderSagaState.Completed };
            await _orders.MarkCompletedAsync(saga.OrderId, ct);
            await _repo.SaveAsync(saga, ct);
            return saga;
        }
        catch (Exception ex)
        {
            _log.LogWarning(ex, "shipping failed for saga {SagaId}", saga.SagaId);
            return await CompensateAsync(saga, "shipping_failed", ct);
        }
    }

    private async Task<OrderSaga> CompensateAsync(OrderSaga saga, string reason, CancellationToken ct)
    {
        // Capture how far the saga got BEFORE overwriting the state. Checking
        // saga.State after it becomes Compensating would make every check pass
        // and refund a charge that never succeeded.
        var reached = saga.State;
        saga = saga with { State = OrderSagaState.Compensating };
        await _repo.SaveAsync(saga, ct);

        // Reverse order: payment refund → stock release → order cancel.
        // The comparisons rely on the declaration order of OrderSagaState.
        if (reached >= OrderSagaState.PaymentCharged)
            await _payments.RefundAsync(saga.OrderId, saga.Total, ct);

        if (reached >= OrderSagaState.StockReserved)
            await _inventory.ReleaseAsync(saga.OrderId, saga.Items, ct);

        await _orders.CancelAsync(saga.OrderId, reason, ct);
        saga = saga with { State = OrderSagaState.Compensated };
        await _repo.SaveAsync(saga, ct);
        return saga;
    }

    private async Task<OrderSaga> FailAsync(OrderSaga saga, string reason, CancellationToken ct)
    {
        saga = saga with { State = OrderSagaState.Failed };
        await _orders.CancelAsync(saga.OrderId, reason, ct);
        await _repo.SaveAsync(saga, ct);
        return saga;
    }
}

Step 3 — Each service exposes both forward and compensating endpoints

// Inventory service
[ApiController]
[Route("inventory")]
public class InventoryController(InventoryService svc) : ControllerBase
{
    [HttpPost("reserve")]
    public Task ReserveAsync(ReserveStockRequest req) => svc.ReserveAsync(req.OrderId, req.Items);

    [HttpPost("release")]
    public Task ReleaseAsync(ReleaseStockRequest req) => svc.ReleaseAsync(req.OrderId, req.Items);
}

The key insight: every "do X" endpoint has a paired "undo X" endpoint. That pairing is the SAGA contract.

Compensating transactions — the hard part

A compensation is not a database ROLLBACK. The original transaction already committed. Compensation is a new transaction that semantically reverses the prior effect.

Forward	Compensation
Reserve 2 units of stock	Release 2 units of stock
Charge ₹1,500 to card	Refund ₹1,500
Send "order confirmed" email	Send "order cancelled" email
Increment loyalty points	Decrement loyalty points

Important properties of good compensating transactions:

Idempotent — running it twice has the same effect as running it once. (Retries are inevitable.)
Always succeeds (or has its own retry policy) — a failed compensation leaks "ghost" reservations.
Can be safely delayed — sometimes the compensation runs minutes after the failure.
Audit-logged — you must be able to explain "this order was refunded because shipping failed at 14:32".
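
The idempotency property is the one most often gotten wrong, so here is a minimal sketch of an idempotent reserve/release pair. It is an in-memory illustration under stated assumptions: InventoryLedger is a hypothetical class, one reservation per order is assumed, and a real service would keep the reservation table in its own database inside the same local transaction as the stock update.

```csharp
using System;
using System.Collections.Generic;

public sealed class InventoryLedger
{
    private readonly object _gate = new();
    private readonly Dictionary<string, int> _stock = new();
    private readonly Dictionary<Guid, (string Sku, int Quantity)> _reservations = new();

    public InventoryLedger(string sku, int quantity) => _stock[sku] = quantity;

    public int Available(string sku)
    {
        lock (_gate) return _stock.TryGetValue(sku, out var n) ? n : 0;
    }

    // Forward step: a repeated Reserve for the same order changes nothing.
    public bool Reserve(Guid orderId, string sku, int quantity)
    {
        lock (_gate)
        {
            if (_reservations.ContainsKey(orderId)) return true;
            if (!_stock.TryGetValue(sku, out var available) || available < quantity) return false;
            _stock[sku] = available - quantity;
            _reservations[orderId] = (sku, quantity);
            return true;
        }
    }

    // Compensation: units are returned only while the reservation still exists,
    // so a retried Release cannot add them back twice.
    public void Release(Guid orderId)
    {
        lock (_gate)
        {
            if (!_reservations.Remove(orderId, out var reservation)) return;
            _stock[reservation.Sku] += reservation.Quantity;
        }
    }
}
```

Keying the compensation on the order ID rather than on "add N units back" is the design choice that makes retries safe: the second Release finds no reservation and does nothing.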

When SAGA pays off

Workflows spanning 3+ services, where you cannot pessimistically lock all of them
Long-running business processes (multi-second or multi-minute)
E-commerce checkout, travel booking (flight + hotel + car), insurance claims processing
When 2PC would lock too many resources during peak load
When you accept eventual consistency for a short window

When SAGA is the wrong tool

Workflows fully inside one bounded context — use a database transaction
Read-heavy operations — there's nothing to compensate
Strict "everything atomic, no observer ever sees intermediate state" requirements (banking core, ATM transactions) — use 2PC or rearchitect to a monolith
Workflows where compensation is impossible (irreversible side effects like a physical shipment leaving the warehouse). If a saga is still required, order the steps so the irreversible one runs last, after every step that can fail has succeeded

Advantages

Scales horizontally — each service can be deployed, scaled, failed independently
No distributed locks, so no service sits blocked waiting on a failed 2PC coordinator
Resilient — a saga can resume mid-workflow after a crash (state is durable)
Audit trail by design — the saga state log IS the audit log
Polyglot friendly — each service can use its own database tech

Disadvantages

Eventual consistency window — readers may see intermediate states (order exists but not yet paid)
Compensation logic doubles your code — every "do X" needs an "undo X"
Some operations are not compensable — once a notification email is sent, you can't unsend it
Debugging is harder — failure trace spans many services and many minutes
Idempotency is non-negotiable — retries cause double-charges if not handled

Production checklist

Persist the saga state durably before each step. Crash recovery depends on it.
Idempotency keys on every external call — Payment.Charge(orderId, key=hash(saga, step)).
Timeouts on every step — a hung service should not pause the saga indefinitely.
Dead-letter queue for permanent failures — when even compensation fails, surface to ops.
Saga timeout — if a saga takes 24h, ops should know. Alert on stuck sagas.
Distributed tracing — every saga step must carry a saga_id in the trace context.
Observability — dashboards for saga durations, failure rates, compensation rates.
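
The key=hash(saga, step) idea from the checklist can be sketched as follows. This is one possible derivation, not a prescribed format: the class name IdempotencyKey and the "sagaId:step" input layout are assumptions, and the code needs .NET 5 or later for SHA256.HashData and Convert.ToHexString.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class IdempotencyKey
{
    // The same saga and step always yield the same key, so the receiver can
    // recognize a retried call; a different step yields a different key.
    public static string For(Guid sagaId, string step)
    {
        var bytes = Encoding.UTF8.GetBytes($"{sagaId:N}:{step}");
        return Convert.ToHexString(SHA256.HashData(bytes));
    }
}
```

Because the key is derived rather than randomly generated, an orchestrator that crashes and resumes recomputes the same value for the same step, and the payment provider can drop the duplicate charge request.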

Pitfalls to watch for

Forgotten compensations. Adding a new forward step without its compensating step. Code review must catch this.
Cascading retries. A retry storm on a single service brings down others. Add circuit breakers.
Compensating transactions that fail. Have a retry policy + manual review queue.
Choreography spaghetti. When the workflow grows past 3 steps, refactor to orchestration.
State machine drift. Orchestrator code and saga state schema fall out of sync. Add tests.
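
For the "compensating transactions that fail" pitfall, a retry policy with escalation might look like the sketch below. The names are hypothetical (CompensationRetry, sendToManualReview), the exponential backoff is one possible policy among several, and a library such as Polly could replace the hand-written loop.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CompensationRetry
{
    // Tries the compensation up to maxAttempts times with exponential backoff.
    // If every attempt fails, hands the item to a manual review callback and
    // returns false. Cancellation is not swallowed; it propagates to the caller.
    public static async Task<bool> RunAsync(
        string description,
        Func<CancellationToken, Task> compensate,
        Action<string> sendToManualReview,
        int maxAttempts,
        TimeSpan baseDelay,
        CancellationToken ct = default)
    {
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                await compensate(ct);
                return true;
            }
            catch (Exception) when (!ct.IsCancellationRequested)
            {
                if (attempt == maxAttempts) break;
                // Backoff: baseDelay, then 2x, 4x, ...
                await Task.Delay(baseDelay * Math.Pow(2, attempt - 1), ct);
            }
        }

        sendToManualReview(description);
        return false;
    }
}
```

A call such as RunAsync("refund order 42", ct => payments.RefundAsync(...), queue.Enqueue, 5, TimeSpan.FromSeconds(1)) would retry for roughly 15 seconds before escalating. The retried compensation must itself be idempotent, otherwise the retries create the very double refund they are meant to prevent.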

SAGA vs 2PC vs Eventual Consistency — quick decision

2PC (Two-Phase Commit): strong consistency, terrible scalability. Use only when business requires it AND you control all participants.
SAGA: eventual consistency with explicit recovery. Use for most distributed business workflows.
Naive eventual consistency (publish and pray): no recovery, no audit. Don't use for money or inventory.

Summary

The SAGA pattern is the practical answer to distributed transactions at microservices scale. You accept that the system passes through intermediate states, and you build explicit compensation paths to recover from any failure.

Start with Choreography if the workflow is small and stable. Move to Orchestration as it grows. Treat compensating transactions as first-class code — write them, test them, monitor them. Persist saga state durably and always carry an idempotency key.

When implemented well, sagas give you business outcomes comparable to monolith transactions (every order ends up either fully completed or fully compensated), though without isolation between concurrent sagas, together with the scaling, deployment, and fault-isolation benefits of microservices. When implemented poorly, you get partial orders, lost money, and customer support tickets.

