System Design · Medium
API Gateway vs Service Mesh — what is the difference?
Both are infrastructure layers around your services, but they sit at different boundaries and solve different problems.
API Gateway — north/south traffic (external clients ↔ your system).
Responsibilities:
- Single entry point for clients (browsers, mobile)
- Authentication / authorization at the edge
- Rate limiting per API key
- Request routing to back-end services
- Aggregation (one client request → multiple service calls)
- Response transformation / version mediation
Examples: Azure API Management, AWS API Gateway, Kong. General-purpose proxies such as Nginx and Envoy are also commonly configured to act as gateways.
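The aggregation and response-transformation responsibilities listed above can be sketched in a few lines. This is a minimal illustration, not how any particular gateway product implements it: the three `fetch_*` functions are hypothetical stand-ins for calls to back-end services, and the field names are made up for the example.

```python
import asyncio

# Hypothetical stand-ins for back-end calls. A real gateway would issue
# HTTP or gRPC requests to the services here.
async def fetch_order(order_id: str) -> dict:
    await asyncio.sleep(0.01)  # simulate network latency
    return {"id": order_id, "customer_id": "c-42", "total": 99.5}

async def fetch_customer(customer_id: str) -> dict:
    await asyncio.sleep(0.01)
    return {"id": customer_id, "name": "Ada"}

async def fetch_shipment(order_id: str) -> dict:
    await asyncio.sleep(0.01)
    return {"order_id": order_id, "status": "in_transit"}

async def get_order_summary(order_id: str) -> dict:
    """One client request fans out to three service calls."""
    order = await fetch_order(order_id)
    # The customer and shipment lookups are independent of each other,
    # so they run concurrently.
    customer, shipment = await asyncio.gather(
        fetch_customer(order["customer_id"]),
        fetch_shipment(order_id),
    )
    # Response transformation: expose only the fields the client needs.
    return {
        "order_id": order["id"],
        "total": order["total"],
        "customer_name": customer["name"],
        "shipping_status": shipment["status"],
    }

if __name__ == "__main__":
    print(asyncio.run(get_order_summary("o-1")))
```

The client makes one round trip over the public network, and the fan-out happens inside the cluster where latency is lower.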
# Typical config
routes:
  - path: /api/orders/**
    upstream: order-service
    auth: jwt
    rate_limit: 100/min
  - path: /api/customers/**
    upstream: customer-service
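To make the config above concrete, here is a rough Python sketch of what a gateway might do per request: match a route, enforce auth, apply the rate limit, then pick the upstream. Several things are assumptions for illustration only: the `/**` glob is treated as a simple prefix match, JWT verification is reduced to a boolean flag, and the limiter is a basic fixed-window counter. `Route`, `FixedWindowLimiter`, and `handle` are hypothetical names.

```python
import time
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Route:
    prefix: str                               # "/api/orders/**" treated as a prefix
    upstream: str                             # back-end service name
    auth: Optional[str] = None                # "jwt" or None
    rate_limit_per_min: Optional[int] = None  # None means unlimited

ROUTES = [
    Route("/api/orders/", "order-service", auth="jwt", rate_limit_per_min=100),
    Route("/api/customers/", "customer-service"),
]

class FixedWindowLimiter:
    """Counts requests per key in one-minute windows.

    Old windows are never evicted here; a real limiter would expire them.
    """
    def __init__(self, clock=time.time):
        self._clock = clock
        self._counts = {}

    def allow(self, key: str, limit: int) -> bool:
        window = int(self._clock() // 60)
        bucket = (key, window)
        self._counts[bucket] = self._counts.get(bucket, 0) + 1
        return self._counts[bucket] <= limit

def handle(path: str, api_key: str, has_valid_jwt: bool,
           limiter: FixedWindowLimiter) -> Tuple[int, str]:
    """Return (status, upstream name or error message)."""
    route = next((r for r in ROUTES if path.startswith(r.prefix)), None)
    if route is None:
        return 404, "no route"
    # has_valid_jwt stands in for real signature and expiry checks.
    if route.auth == "jwt" and not has_valid_jwt:
        return 401, "missing or invalid token"
    if route.rate_limit_per_min is not None:
        key = f"{api_key}:{route.prefix}"
        if not limiter.allow(key, route.rate_limit_per_min):
            return 429, "rate limit exceeded"
    return 200, route.upstream
```

Checking auth before the rate limit means rejected requests do not consume a client's quota. Whether that ordering is desirable depends on the threat model.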
Service Mesh — east/west traffic (service ↔ service inside the cluster).
Responsibilities:
- mTLS between every pod automatically
- Retry / circuit-breaker between services
- Traffic shifting for canary releases (10% to v2)
- Per-service observability (metrics, traces) without code changes
- Service discovery + load balancing
Examples: Istio, Linkerd, Cilium, Consul Connect.
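The retry and circuit-breaker behavior from the list above is normally configured declaratively in the mesh, not written by hand. The Python sketch below only illustrates the logic a proxy applies on a service's behalf. The class name, thresholds, and the choice of `ConnectionError` as the retryable failure are assumptions made for the example.

```python
import time

class CircuitOpenError(Exception):
    """Raised when calls are rejected without contacting the upstream."""

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, fn, retries=2):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("upstream marked unhealthy")
            # Cool-down elapsed: go half-open, so a single failure
            # re-opens the circuit immediately.
            self._opened_at = None
            self._failures = self.failure_threshold - 1

        last_error = None
        for _ in range(retries + 1):
            try:
                result = fn()
            except ConnectionError as exc:
                last_error = exc
                self._failures += 1
                if self._failures >= self.failure_threshold:
                    self._opened_at = self._clock()  # trip the breaker
                    break
            else:
                self._failures = 0  # any success resets the failure count
                return result
        raise last_error
```

Retries hide transient failures, while the breaker stops retries from piling load onto an upstream that is already down.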
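How it works: in the classic model, a sidecar proxy runs next to each service instance (Envoy in Istio and Consul, linkerd2-proxy in Linkerd). All in-cluster traffic flows through the proxies, so your service code doesn't know the mesh exists. Some meshes avoid per-pod sidecars: Cilium relies on eBPF with per-node proxies, and Istio's ambient mode uses a shared per-node proxy.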
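As a toy model of the sidecar idea, the sketch below puts a proxy object between application code and two versions of a service. The application only names the logical service `orders`, while the proxy does the weighted canary split and counts requests. Everything here is hypothetical and in-process. A real mesh intercepts network traffic and is driven by control-plane configuration, not Python objects.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Tuple

Backend = Tuple[Callable[[str], str], int]  # (handler, weight)

class SidecarProxy:
    """Toy proxy: weighted routing plus request metrics."""
    def __init__(self, routes: Dict[str, List[Backend]], rng=None):
        self._routes = routes
        self._rng = rng or random.Random()
        self.request_count = Counter()  # per-service metric

    def send(self, service: str, payload: str) -> str:
        backends = self._routes[service]
        handlers = [handler for handler, _ in backends]
        weights = [weight for _, weight in backends]
        # Weighted random choice implements the traffic split.
        handler = self._rng.choices(handlers, weights=weights, k=1)[0]
        self.request_count[service] += 1
        return handler(payload)

# Two deployed versions of the same (hypothetical) service.
def orders_v1(payload: str) -> str:
    return f"v1:{payload}"

def orders_v2(payload: str) -> str:
    return f"v2:{payload}"

def checkout(send: Callable[[str, str], str]) -> str:
    # Application code: knows the logical service name only,
    # not versions, weights, or metrics.
    return send("orders", "cart-1")

if __name__ == "__main__":
    proxy = SidecarProxy({"orders": [(orders_v1, 90), (orders_v2, 10)]})
    replies = [checkout(proxy.send) for _ in range(1000)]
    v2_share = sum(r.startswith("v2:") for r in replies) / len(replies)
    print(f"requests={proxy.request_count['orders']} v2_share={v2_share:.2f}")
```

Changing the split from 90/10 to 50/50 touches only the proxy's routing table, and `checkout` stays the same. That separation is the point of the pattern.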
Comparison table:
| Concern | API Gateway | Service Mesh |
|---|---|---|
| Traffic direction | External → internal | Internal → internal |
| Auth | API key / JWT | mTLS service identity |
| Rate limit | Per-client | Per-service |
| Retry / circuit-break | Optional | Built-in |
| Observability | Edge-only | All service hops |
| Deployment cost | Low | High (sidecar per pod) |
Most teams need only one at first:
- Start with an API gateway. Solves the immediate problem of exposing services to clients.
- Adopt a service mesh only when you have 10+ services with serious operational pain (mTLS rollout, canary deploys, cross-service tracing). Below that, the sidecar overhead is hard to justify.