System Design · Medium
Synchronous vs asynchronous communication in microservices — REST, gRPC, or queues?
Choose by coupling tolerance and latency budget, not by what the team is most comfortable with.
| Pattern | Latency | Coupling | Failure model | Use when |
|---|---|---|---|---|
| REST (HTTP/JSON) | Higher than gRPC | Tight | Caller sees errors | Public APIs, low-frequency calls |
| gRPC (HTTP/2, protobuf) | Low | Tight | Caller sees errors | Internal service-to-service, schemas matter |
| Message queue (RabbitMQ, SQS) | Async | Loose | Producer fire-and-forget | Workflows, retries, smoothing spikes |
| Event bus (Kafka, EventBridge) | Async | Very loose | Pub/sub fan-out | One event many consumers, audit log |
Decision flow
- Does the caller need the answer to continue? Yes: sync (REST/gRPC). No: async (queue/event).
- Is the consumer always available? No: async; the queue holds messages until the consumer is back.
- Is the interaction 1-to-many? Yes: event bus. 1-to-1: a queue is enough.
- Strict ordering needed? Yes: route related messages to one Kafka partition (same key), or use an SQS FIFO queue with a shared message group ID.
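The decision flow above can be written down as a small function. This is only a sketch: the `Pattern` enum, the `CommunicationChooser` class, and the parameter names are hypothetical, and it assumes that an unreliable consumer overrides the caller's wish for an immediate answer (the caller then has to accept a deferred result). The ordering question is left out because it affects broker configuration rather than the choice of pattern.

```csharp
using System;

// Hypothetical names; each flag maps to one question in the decision flow.
public enum Pattern { SyncRpc, Queue, EventBus }

public static class CommunicationChooser
{
    public static Pattern Choose(bool callerNeedsAnswer, bool consumerAlwaysAvailable, bool oneToMany)
    {
        // Sync only makes sense if the caller must wait AND the callee is reliably up.
        // Assumption: when the consumer can be down, async wins even if the caller wants an answer.
        if (callerNeedsAnswer && consumerAlwaysAvailable)
            return Pattern.SyncRpc;

        // Async: fan-out to many consumers needs an event bus, otherwise a queue is enough.
        return oneToMany ? Pattern.EventBus : Pattern.Queue;
    }
}

public static class Program
{
    public static void Main()
    {
        // Price lookup during checkout: caller blocks, callee is always up.
        Console.WriteLine(CommunicationChooser.Choose(
            callerNeedsAnswer: true, consumerAlwaysAvailable: true, oneToMany: false));   // SyncRpc

        // "Order placed" notification consumed by billing, shipping, and analytics.
        Console.WriteLine(CommunicationChooser.Choose(
            callerNeedsAnswer: false, consumerAlwaysAvailable: true, oneToMany: true));   // EventBus

        // Email sending through a provider that is sometimes down.
        Console.WriteLine(CommunicationChooser.Choose(
            callerNeedsAnswer: false, consumerAlwaysAvailable: false, oneToMany: false)); // Queue
    }
}
```

The three scenarios in `Main` are invented for illustration; real decisions usually also weigh the latency budget and how much coupling the team can tolerate.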
Anti-patterns
- Sync chains 4 deep. A calls B calls C calls D. If any one is down, A fails. Each adds latency. Break it: A calls B, B publishes events, downstream reacts.
- Choreography with no overview. A web of events where no single place describes the whole workflow. Once you have 3+ steps, add a saga orchestrator (a Temporal workflow, a MassTransit saga state machine).
- Queue used as RPC. Producer puts a message + correlation ID, waits on a reply queue. You re-invented sync with worse error handling. Use gRPC.
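To put a number on the first anti-pattern: when every hop in a synchronous chain must succeed, the availabilities multiply. The sketch below assumes independent failures and a hypothetical 99.9% availability per service; neither figure comes from a real system, and `ChainAvailability` is an illustrative name.

```csharp
using System;

public static class ChainAvailability
{
    // Availability of a sync chain in which every hop must succeed.
    // Assumes failures are independent, which is an idealization.
    public static double Compose(double perServiceAvailability, int hops)
        => Math.Pow(perServiceAvailability, hops);

    public static void Main()
    {
        // Hypothetical figure: each of A, B, C, D is up 99.9% of the time.
        double single = 0.999;
        double chain = Compose(single, 4);

        // Compare unavailability of the chain with that of one service.
        double ratio = (1 - chain) / (1 - single);
        Console.WriteLine($"chain availability: {chain:F4}");
        Console.WriteLine($"downtime vs single service: {ratio:F1}x");
    }
}
```

With these inputs the chain comes out at roughly 0.996, about four times the downtime of any single service. Replacing one sync hop with a published event takes the services behind it out of that product.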
Code — gRPC sync
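```proto
// Service contract; ReserveRequest and ReserveResponse messages are defined elsewhere
service Inventory {
  rpc Reserve(ReserveRequest) returns (ReserveResponse);
}
```

```csharp
// The caller awaits the reply; any failure in Inventory surfaces here as an exception
var reply = await client.ReserveAsync(new ReserveRequest { Sku = "X", Qty = 2 });
```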
Code — async outbox + event bus
```csharp
// In the same DB transaction as the business write
await tx.ExecuteAsync(
    "INSERT INTO outbox (topic, payload) VALUES ('order.placed', @p)",
    new { p = JsonSerializer.Serialize(orderEvent) });
// A relay polls outbox and publishes to Kafka/SQS. Delivery is at-least-once:
// the relay can crash after publishing but before marking the row as sent,
// so consumers must be idempotent.
```
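The relay mentioned in the snippet above could look roughly like this. It is a minimal in-memory sketch, not a production relay: `OutboxRow`, `OutboxRelay`, and the `publish` delegate are hypothetical stand-ins for a database table and a Kafka or SQS client, and the `Sent` flag is an assumed column that the original `INSERT` does not show.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for a row in the outbox table. The Sent flag is an assumption.
public sealed class OutboxRow
{
    public long Id { get; set; }
    public string Topic { get; set; } = "";
    public string Payload { get; set; } = "";
    public bool Sent { get; set; }
}

public static class OutboxRelay
{
    // One polling pass: publish unsent rows in insertion order, then mark each as sent.
    // If the process dies between publish and marking, the row is published again
    // on the next pass, so downstream delivery is at-least-once.
    public static int RelayOnce(List<OutboxRow> outbox, Action<string, string> publish)
    {
        int published = 0;
        foreach (var row in outbox.Where(r => !r.Sent).OrderBy(r => r.Id))
        {
            publish(row.Topic, row.Payload); // if this throws, the row stays unsent and is retried
            row.Sent = true;
            published++;
        }
        return published;
    }
}

public static class Program
{
    public static void Main()
    {
        var outbox = new List<OutboxRow>
        {
            new OutboxRow { Id = 1, Topic = "order.placed", Payload = "{\"orderId\":42}" },
        };
        var delivered = new List<string>();
        Action<string, string> fakeBroker = (topic, payload) => delivered.Add(topic + ":" + payload);

        int first = OutboxRelay.RelayOnce(outbox, fakeBroker);  // publishes the pending row
        int second = OutboxRelay.RelayOnce(outbox, fakeBroker); // nothing left to publish
        Console.WriteLine($"{first} {second} {delivered.Count}"); // prints "1 0 1"
    }
}
```

Because a crash between `publish` and `row.Sent = true` leads to a repeat publish, consumers typically deduplicate on an event ID or make their handlers idempotent.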
Real-world rule
- New microservice? Start sync (gRPC) for clarity.
- Hit your first "everything cascaded down" outage? Migrate the highest-fanout edge to async events.