System Design Article
Reverse Proxy & API Gateway
Difficulty: Medium
A reverse proxy sits at the edge of your infrastructure and terminates client connections so backends never see them directly. An API gateway is a reverse proxy with opinions: authentication, rate limiting, request transformation, and per-route policies. This lesson covers what each does, when one is enough and when you need the other, the canonical features (TLS termination, response caching, request shaping, JWT validation, circuit breaking), and the tools that implement them (NGINX, Envoy, Kong, AWS API Gateway, Apigee). By the end you can place either in a real architecture and articulate the boundary between them in an interview.
Forward Proxy vs Reverse Proxy
The word 'proxy' alone is ambiguous. The distinction is which side it represents.
Forward proxy
Sits in front of clients. Clients send requests through the proxy, which forwards them to the internet. Examples: corporate web filters, residential VPNs, a Squid cache for a school network.
Reverse proxy
Sits in front of servers. Clients connect to the proxy thinking it is the server. The proxy forwards the request to the actual backend. The client never sees the backend's IP.
---------- Forward vs reverse proxy ----------
Forward proxy (per-client):
[ user 1 ] -> [ corp proxy ] -> [ open internet ] -> [ external sites ]
[ user 2 ] -> [ corp proxy ]
Reverse proxy (per-server-fleet):
[ user ] -> [ reverse proxy ] -> [ b1 ] [ b2 ] [ b3 ]
The rest of this lesson is about reverse proxies.
What a Reverse Proxy Does
A reverse proxy is the right place for any concern that is the same across every backend.
1. TLS termination
The proxy holds the public TLS certificate; the client negotiates HTTPS with the proxy; the connection from proxy to backend is plain HTTP (or mTLS inside the cluster). With plain HTTP on the inner hop, backends do not need their own certificates and do not pay the TLS handshake cost; with mTLS they need only internally issued certificates. In both cases they are not exposed publicly.
---------- TLS termination at the edge ----------
client - HTTPS (TLS 1.3) --> [ reverse proxy ] - HTTP --> [ backend ]
2. Compression
The proxy gzips or brotlis responses before sending them to the client. Backends emit raw responses; the proxy compresses once. Saves backend CPU; centralizes the compression policy.
3. Response caching
For cacheable responses (per Cache-Control headers), the proxy stores them and serves subsequent requests from cache without touching the backend. NGINX, Varnish, and Cloudflare all do this.
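The caching rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how NGINX or Varnish implement it: `parseMaxAge`, `TinyResponseCache`, and the injected `now` clock are hypothetical names, and only the `max-age`, `no-store`, and `private` directives are honored.

```javascript
// Extract max-age (in seconds) from a Cache-Control header. Returns 0 when a
// shared cache must not store the response or no usable max-age is present.
function parseMaxAge(cacheControl) {
  if (!cacheControl) return 0;
  const directives = cacheControl.toLowerCase().split(',').map((d) => d.trim());
  if (directives.includes('no-store') || directives.includes('private')) return 0;
  const maxAge = directives.find((d) => d.startsWith('max-age='));
  if (!maxAge) return 0;
  const seconds = Number(maxAge.slice('max-age='.length));
  return Number.isFinite(seconds) && seconds > 0 ? seconds : 0;
}

class TinyResponseCache {
  constructor(now = () => Date.now()) {
    this.now = now;           // injectable clock (milliseconds) so expiry is testable
    this.entries = new Map(); // cache key -> { body, expiresAt }
  }

  // Returns the cached body, or undefined on a miss or an expired entry.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.body;
  }

  // Stores the body only if the backend marked it cacheable; returns whether it was stored.
  put(key, body, cacheControl) {
    const maxAge = parseMaxAge(cacheControl);
    if (maxAge === 0) return false;
    this.entries.set(key, { body, expiresAt: this.now() + maxAge * 1000 });
    return true;
  }
}
```

A proxy would call `get` before contacting the backend and `put` after a successful backend response, keyed by something like method plus URL.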
4. Request buffering
A slow client sending a 10 MB body byte-by-byte would tie up a backend worker for minutes. The proxy buffers the entire request in memory or on disk and only forwards it to the backend once complete. Same in reverse for slow clients reading responses.
5. IP allowlists / blocklists, geo-blocking
The proxy is the natural place to drop traffic from unwanted IP ranges, abusive ASNs, or sanctioned countries.
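As a rough sketch of what an allowlist/blocklist check involves, the following matches IPv4 addresses against CIDR ranges. The function names and the "blocklist wins" rule are assumptions made for illustration; real proxies use optimized lookup structures and also handle IPv6.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit number; null if malformed.
function ipv4ToInt(ip) {
  const parts = ip.split('.');
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when ip falls inside cidr, for example inCidr('10.1.2.3', '10.0.0.0/8').
function inCidr(ip, cidr) {
  const [base, bitsText = '32'] = cidr.split('/');
  if (!/^\d{1,2}$/.test(bitsText)) return false;
  const bits = Number(bitsText);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null || bits > 32) return false;
  // Compare the network part by dividing away the host part.
  const hostSize = 2 ** (32 - bits);
  return Math.floor(ipInt / hostSize) === Math.floor(baseInt / hostSize);
}

// Blocklist wins over allowlist; an empty allowlist admits everyone not blocked.
function isAllowed(ip, { allow = [], block = [] }) {
  if (block.some((cidr) => inCidr(ip, cidr))) return false;
  if (allow.length === 0) return true;
  return allow.some((cidr) => inCidr(ip, cidr));
}
```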
6. Header manipulation
Add X-Forwarded-For, X-Request-ID, security headers (Strict-Transport-Security, Content-Security-Policy); strip server-internal headers from responses.
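A small sketch of the header handling described above. The function names, the list of internal headers, and the `trustForwarded` flag are hypothetical; header names are assumed to be lower-cased object keys.

```javascript
// Response headers that should not leak to clients (illustrative list).
const INTERNAL_RESPONSE_HEADERS = ['server', 'x-powered-by', 'x-internal-trace'];

// Build the headers forwarded to the backend.
// At an untrusted edge the client-supplied X-Forwarded-For is discarded so it
// cannot be spoofed; behind another trusted proxy it is extended instead.
function buildUpstreamHeaders(clientHeaders, clientIp, newRequestId, trustForwarded = false) {
  const headers = { ...clientHeaders };
  const prior = trustForwarded ? headers['x-forwarded-for'] : undefined;
  headers['x-forwarded-for'] = prior ? `${prior}, ${clientIp}` : clientIp;
  // Keep an incoming request ID so traces join up; otherwise mint one.
  if (!headers['x-request-id']) headers['x-request-id'] = newRequestId();
  return headers;
}

// Strip server-internal headers and add a security header on the way out.
function buildClientResponseHeaders(backendHeaders) {
  const headers = {};
  for (const [name, value] of Object.entries(backendHeaders)) {
    const lower = name.toLowerCase();
    if (!INTERNAL_RESPONSE_HEADERS.includes(lower)) headers[lower] = value;
  }
  headers['strict-transport-security'] = 'max-age=31536000; includeSubDomains';
  return headers;
}
```

`newRequestId` is passed in rather than hard-coded so the caller can choose the ID scheme (a UUID generator, for instance).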
NGINX example: reverse proxy with TLS termination and gzip
---------- NGINX reverse proxy ----------
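# Pool of backend servers; the addresses are placeholders.
upstream backend_pool {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    listen 443 ssl http2;   # NGINX 1.25.1+ prefers a separate "http2 on;" directive
    server_name api.example.com;

    # Certificate and key used to terminate TLS at the proxy
    ssl_certificate     /etc/ssl/api.example.com.crt;
    ssl_certificate_key /etc/ssl/api.example.com.key;

    # Compress responses once, at the edge
    gzip on;
    gzip_types application/json;   # text/html is always compressed when gzip is on

    location / {
        proxy_pass http://backend_pool;
        # Tell the backend who the real client is and how it connected
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header X-Forwarded-Proto https;
        proxy_set_header Host $host;
    }
}
That is the canonical reverse-proxy config. Many production edge configurations start from something like it.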
API Gateway: Reverse Proxy with Opinions
An API gateway is a reverse proxy with first-class support for the cross-cutting concerns specific to APIs. Think of it as 'reverse proxy + auth + per-route policy + transformation'.
Capabilities a gateway adds
| Capability | What it does | Why centralize it |
|---|---|---|
| Authentication | Validate JWT, API keys, OAuth tokens before the request reaches the backend | Every backend would otherwise need the verification code; one bug = N backends vulnerable |
| Authorization | Check scopes/roles per route | Same as above; consistent enforcement |
| Rate limiting | Per-key, per-IP, per-route quotas | Backends do not need their own quota state; gateway is the natural choke point |
| Request transformation | Rewrite paths, add/strip headers, convert REST to gRPC | Lets you change backend interfaces without breaking clients |
| Response aggregation | Fan out one client request to N backends, merge responses | The backend-for-frontend (BFF) pattern; lets the client make one call instead of three |
| Versioning | Route /v1 to the legacy fleet, /v2 to the new one | Gradual migration with no client changes |
| Service discovery | Look up backend addresses dynamically (Consul, Kubernetes services) | Backends can scale and move without DNS or config changes |
| Circuit breaking | Stop sending requests to a backend that is failing | Prevents one slow backend from cascading into total outage |
| Observability | Centralized request logs, traces, metrics | Single source of truth across all services |
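To make the rate-limiting row concrete, here is a minimal token-bucket sketch keyed by API key. It is an in-memory illustration for a single gateway instance; the class name, option names, and injected clock are assumptions, and a multi-instance gateway would keep the counters in a shared store such as Redis.

```javascript
class TokenBucketLimiter {
  constructor({ capacity, refillPerSecond, now = () => Date.now() }) {
    this.capacity = capacity;               // maximum burst size
    this.refillPerSecond = refillPerSecond; // sustained request rate
    this.now = now;                         // injectable clock (milliseconds)
    this.buckets = new Map();               // key -> { tokens, updatedAt }
  }

  // Accounts for one request by `key`; returns { allowed, retryAfterSeconds }.
  take(key) {
    const t = this.now();
    const bucket = this.buckets.get(key) ?? { tokens: this.capacity, updatedAt: t };
    // Refill in proportion to the time elapsed since the last request.
    const elapsedSeconds = (t - bucket.updatedAt) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.updatedAt = t;
    this.buckets.set(key, bucket);
    if (bucket.tokens >= 1) {
      bucket.tokens -= 1;
      return { allowed: true, retryAfterSeconds: 0 };
    }
    // Time until one whole token is available, rounded up for a Retry-After header.
    const retryAfterSeconds = Math.ceil((1 - bucket.tokens) / this.refillPerSecond);
    return { allowed: false, retryAfterSeconds };
  }
}
```

A gateway would answer a rejected request with 429 and put `retryAfterSeconds` in the `Retry-After` header.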
Architecture diagram
---------- API gateway in front of microservices ----------
client - HTTPS --> [ API gateway ]
                          | - TLS termination
                          | - JWT validation
                          | - rate limiting
                          | - request rewriting
                          v
    +---------+-----------+----------+-----------+
    v         v           v          v           v
[ users ] [ orders ] [ catalog ] [ search ] [ payments ]
Each service is small, language-agnostic, and free of auth/rate-limit boilerplate. The gateway is the single declarative description of every public API.
Per-route policy example (Kong style YAML)
routes:
  - name: list-orders
    paths: ['/v1/orders']
    methods: ['GET']
    service: orders
    plugins:
      - name: jwt
        config:
          key_claim_name: kid
      - name: rate-limiting
        config:
          minute: 60
          policy: redis
      - name: proxy-cache
        config:
          strategy: memory
          cache_ttl: 30
Readable and reviewable; no application code changes are needed to add or change a policy.
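What a gateway does with such a route table can be sketched as "match the route, run its policies in order, then forward". Everything here is hypothetical: the route table shape, the policy names, and the stand-in `jwt` policy, which only checks that a bearer token is present and does no real validation.

```javascript
// Hypothetical route table: each route names a backend and an ordered list of policies.
const gatewayRoutes = [
  { method: 'GET', prefix: '/v1/orders', service: 'orders', policies: ['jwt', 'rate-limit'] },
  { method: 'GET', prefix: '/v1/catalog', service: 'catalog', policies: [] },
];

// Each policy returns null to continue or an HTTP status code to reject.
const gatewayPolicies = {
  // Stand-in only: checks for a bearer token, does not verify it.
  'jwt': (req) => (req.headers.authorization?.startsWith('Bearer ') ? null : 401),
  // Stand-in only: assumes a limiter already set req.overQuota.
  'rate-limit': (req) => (req.overQuota ? 429 : null),
};

// Decide what to do with a request: reject it or name the service to forward to.
function dispatch(req) {
  const route = gatewayRoutes.find(
    (r) => r.method === req.method && (req.path === r.prefix || req.path.startsWith(r.prefix + '/')),
  );
  if (!route) return { status: 404 };
  for (const name of route.policies) {
    const rejection = gatewayPolicies[name](req);
    if (rejection !== null) return { status: rejection, rejectedBy: name };
  }
  return { status: 200, forwardTo: route.service };
}
```

Adding or changing a policy is an edit to the table, not to `dispatch`, which is the property the declarative config is meant to provide.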
Backend-for-Frontend (BFF)
A BFF is an API gateway tailored to a specific client (web, mobile, partners). The same backends serve all clients, but each gateway transforms responses and aggregates requests differently.
---------- BFF pattern ----------
[ web app ] -> [ web BFF ] -> orders, users, catalog
[ iOS app ] -> [ mobile BFF ] -> orders, users, catalog (smaller payloads, image variants)
[ partner ] -> [ partner BFF ] -> orders only (rate-limited to 100/min)
Not every system needs BFFs - they add operational surface - but for products with multiple very different clients (Spotify, LinkedIn), it is a clean way to keep core services general while each client gets a tailored interface.
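A minimal sketch of the idea, assuming two core services reachable through stub functions. `coreServices`, `webHome`, `mobileHome`, and the field names are all made up for illustration; a real BFF would call the services over HTTP or gRPC.

```javascript
// Stand-ins for calls to the core services.
const coreServices = {
  getUser: async (id) => ({ id, name: 'Ada', email: 'ada@example.com', avatarUrl: '/img/ada-full.png' }),
  getOrders: async (userId) => [
    { id: 'o1', userId, total: 42.5, items: [{ sku: 'a', qty: 1 }] },
  ],
};

// Web BFF: one client call fans out to two services and returns the full payload.
async function webHome(userId, services = coreServices) {
  const [user, orders] = await Promise.all([services.getUser(userId), services.getOrders(userId)]);
  return { user, orders };
}

// Mobile BFF: same backends, trimmed fields and a smaller image variant.
async function mobileHome(userId, services = coreServices) {
  const [user, orders] = await Promise.all([services.getUser(userId), services.getOrders(userId)]);
  return {
    user: { id: user.id, name: user.name, avatarUrl: user.avatarUrl.replace('-full', '-thumb') },
    orders: orders.map(({ id, total }) => ({ id, total })),
  };
}
```

Both functions aggregate and reshape; neither decides anything about what an order means, which stays in the core services.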
Edge vs Service: Where Logic Belongs
The most important judgment call in any gateway design is the boundary: what lives at the edge, what stays in the service?
At the edge (gateway)
- Authentication (JWT validation, mTLS)
- Coarse authorization (scope check)
- Rate limiting per API key
- Request shaping (path rewrite, header rewrite)
- TLS, compression, response caching
- Cross-cutting observability
These concerns are the same regardless of which service handles the request. Centralizing them eliminates duplication.
In the service
- Business logic
- Fine-grained authorization ('can user 42 edit document 99?')
- Domain-specific validation
- Workflow orchestration (when complex enough)
- Per-tenant data partitioning
These concerns require domain knowledge; the gateway should not know what a document is.
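The split between coarse and fine-grained authorization can be illustrated with two small checks. The function names and the document shape (`ownerId`, `editors`) are assumptions; the space-delimited `scope` claim follows the usual OAuth convention.

```javascript
// Gateway side: coarse check using only what the verified token carries.
function gatewayAllows(claims, requiredScope) {
  const scopes = (claims.scope ?? '').split(' ').filter(Boolean);
  return scopes.includes(requiredScope);
}

// Service side: fine-grained check that needs domain data (who owns the document).
function serviceAllowsEdit(userId, document) {
  return document.ownerId === userId || document.editors.includes(userId);
}
```

The first function needs nothing but the token, so it can run at the edge. The second needs the document record, which only the owning service can load cheaply.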
The fat-gateway anti-pattern
When the gateway starts knowing about business rules ('the orders gateway aggregates inventory + pricing + cart + recommendations and applies a coupon'), you have re-created a monolith with a different name. Now every product change requires gateway changes; the gateway team becomes a release bottleneck; backend services lose autonomy.
The rule: the gateway transforms requests; it does not understand them. If you find yourself adding domain logic to the gateway, that logic belongs in a backend service or a dedicated 'aggregator' service that itself sits behind the gateway.
Tool Selection
| Tool | Best for | Notable features |
|---|---|---|
| NGINX | Edge reverse proxy, TLS termination, simple routing | Battle-tested, minimal memory footprint, declarative config |
| HAProxy | Layer 4/7 LB with deep tuning | Advanced ACLs, observability via stats endpoint |
| Envoy | Service mesh data plane, API gateway | Dynamic config via xDS, gRPC-native, rich observability |
| Kong | Open-source API gateway | Plugin ecosystem, YAML-driven, runs on top of NGINX |
| Traefik | Cloud-native gateway | Auto-discovery from Kubernetes, Docker, Consul |
| AWS API Gateway | Managed gateway for AWS services | Pay-per-request, integrates with Lambda, IAM auth |
| Apigee (Google) | Enterprise API platform | API products, developer portal, monetization |
| Cloudflare | Edge gateway with CDN/DDoS/WAF | Anycast network, no infra to run, programmable workers |
Default recommendations:
- Single small service: NGINX is enough.
- Microservices on Kubernetes: Envoy (typically with Istio as the mesh control plane) or Traefik. Linkerd is an alternative mesh that ships its own proxy rather than Envoy.
- Public API for paying customers: Kong (self-hosted) or AWS API Gateway / Apigee (managed).
- DDoS / WAF / global cache: put Cloudflare in front of any of the above.
Routing in Action: A Request's Journey
Follow a real request from browser to database through a typical gateway-fronted system.
---------- Request lifecycle ----------
T0  POST https://api.example.com/v1/orders (with JWT, body 5 KB)
T1  Cloudflare edge:
    - anycast routes the client to the nearest POP
    - terminates the client's TLS connection
    - WAF rule check
    - forwards to the origin
T2  Origin reverse proxy (NGINX):
    - terminates the TLS connection from Cloudflare
    - logs the request
    - forwards to the API gateway
T3  API gateway (Kong):
    - validates JWT (cached public key)
    - checks rate limit: 100/min per user-id
    - looks up route '/v1/orders' POST -> orders service
    - adds X-Request-ID, X-User-Id headers
    - passes to orders service
T4  Orders service:
    - validates body against schema
    - fine-grained auth: user can place order
    - writes order to Postgres
    - emits OrderPlaced event to Kafka
    - returns 201 Created
T5  Response travels back the same chain;
    gateway logs duration; proxy compresses; Cloudflare caches only cacheable responses (not this POST).
Each layer has one job and is configured independently. A failure at one layer surfaces to the layer above, which can retry; for a non-idempotent request such as this POST, a retry is safe only when it carries an idempotency key.
Calling a Gateway-Protected API from a Client
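async function placeOrder(token, order, { maxRetries = 3 } = {}) {
  // Generate the idempotency key once so every retry carries the same key
  // and the server can deduplicate the order.
  const idempotencyKey = crypto.randomUUID();
  for (let attempt = 0; ; attempt++) {
    const res = await fetch('https://api.example.com/v1/orders', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${token}`,
        'Content-Type': 'application/json',
        'X-Idempotency-Key': idempotencyKey,
      },
      body: JSON.stringify(order),
    });
    if (res.status === 429 && attempt < maxRetries) {
      // Retry-After may be seconds or an HTTP date; fall back to 1 s if it is not a number.
      const seconds = Number(res.headers.get('Retry-After'));
      const waitMs = Number.isFinite(seconds) && seconds > 0 ? seconds * 1000 : 1000;
      await new Promise((r) => setTimeout(r, waitMs));
      continue;
    }
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return await res.json();
  }
}
Notice the rate-limit handling: the gateway returns 429 with a Retry-After header, and the client waits before retrying, up to maxRetries times. The idempotency key is generated once per order, so a retried request cannot create a duplicate. Rate limiting is enforced at the edge precisely so that every backend does not need to implement it.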
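One detail worth handling on the client: per the HTTP specification, Retry-After may carry either a number of seconds or an HTTP date. A small helper along these lines covers both forms; `retryAfterMs` and its 1-second fallback are illustrative choices, not part of any gateway's API.

```javascript
// Returns the delay in milliseconds described by a Retry-After header value,
// or fallbackMs when the header is missing or unparseable.
function retryAfterMs(headerValue, nowMs = Date.now(), fallbackMs = 1000) {
  if (headerValue == null || headerValue.trim() === '') return fallbackMs;
  const value = headerValue.trim();
  if (/^\d+$/.test(value)) return Number(value) * 1000; // delay-seconds form
  const dateMs = Date.parse(value);                     // HTTP-date form
  if (Number.isNaN(dateMs)) return fallbackMs;
  return Math.max(0, dateMs - nowMs);                   // never wait a negative time
}
```

In a fetch-based client this would be called as `retryAfterMs(res.headers.get('Retry-After'))` before sleeping.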
Real-World Examples
How real systems implement this in production
Netflix built Zuul as a Java-based API gateway in front of thousands of microservices. It handles auth, dynamic routing, request shaping, and traffic shifting (canary deploys). Zuul 2 moved to a non-blocking, Netty-based architecture; Spring Cloud Gateway, a common choice for newer Spring-based services, is also Netty-based.
Trade-off: At thousands of services, a programmable gateway is essential for canary deploys, A/B tests, and emergency traffic shifting.
Stripe runs a multi-region API gateway that handles authentication, rate limiting, request validation, and idempotency-key deduplication before routing to backend services. They publish strict per-key rate limits (e.g., 100 reads/sec per API key) enforced at the gateway. Because validation also happens there, every backend service can assume its input is well-formed.
Trade-off: Gateway-enforced contracts (rate limits, idempotency) are how you make a public API robust.
A Kubernetes cluster typically has an Ingress controller (NGINX or Traefik) handling external traffic, and an Istio mesh handling east-west service-to-service traffic. Each Pod has an Envoy sidecar; the Istio control plane pushes routing and policy. The Ingress is the gateway; the sidecars are micro-gateways.
Trade-off: In modern cloud-native architectures, gateway functionality is distributed across an edge tier and a sidecar tier.
Cloudflare Workers lets you write JavaScript or Rust (compiled to WebAssembly) code that runs at every edge POP and can process requests before they reach the origin. A Worker can validate JWTs, rewrite URLs, do A/B routing, or short-circuit rate-limited requests - all without a separate gateway tier.
Trade-off: Edge compute is collapsing the gateway and the CDN into one programmable layer; for many use cases, you no longer need a dedicated gateway tier inside the datacenter.
Common Interview Questions
Questions you might be asked about this topic
What is the difference between a reverse proxy and an API gateway?
A reverse proxy is a generic HTTP middleman: TLS termination, compression, response caching, request buffering, basic routing. An API gateway adds API-aware policies: authentication (JWT, API keys), authorization, per-route rate limiting, request/response transformation, service discovery, circuit breaking. Every API gateway is a reverse proxy plus opinions; not every reverse proxy is a gateway. Reverse proxy: NGINX, HAProxy. Gateway: Kong, AWS API Gateway, Apigee, Envoy with xDS.
How would you design the gateway layer for a global public API?
Cloudflare or Fastly at the very edge: anycast routing, TLS termination, WAF, DDoS mitigation, basic caching of static content. Per-region API gateway (Kong or AWS API Gateway): JWT validation against a regional Redis-cached JWK, rate limiting per API key (Redis-backed sliding window), per-route plugins, observability. Behind the gateway: regional load balancer (ALB) routing to a Kubernetes cluster running the microservices. Each service scales independently. Mention monitoring: gateway p99 latency, 4xx/5xx rates per route, rate-limit-rejected counters, backend health.
How does a gateway validate a JWT?
The gateway holds the public key (JWK) for the token issuer, fetched from the issuer and cached. For each incoming request: parse the Authorization header, verify the signature in process (no remote call), check the exp, iat, iss, and aud claims, and optionally check a revocation cache (Redis). If valid, attach user context as headers (X-User-Id, X-Tenant) and pass the request to the service. The expensive part (signature crypto) is fast in gateways such as Envoy or Kong, which rely on native crypto libraries. For scopes/roles, the gateway can do a Redis lookup or trust the JWT's claims. Mention key rotation: refresh the cached JWK every few minutes so a key rotation does not break authentication.
When would you use the BFF pattern?
When you have multiple very different clients (web, mobile, partners) with different needs - response shape, payload size, aggregation. A BFF gives each client its own tailored gateway that aggregates from the same core services. Mobile BFF returns smaller payloads and stripped-down image variants; web BFF returns full payloads with richer metadata; partner BFF rate-limits aggressively and exposes a smaller subset of routes. The cost is operational - each BFF is another deploy unit. Worth it when client divergence is high; not worth it for a single client.
How do you make the gateway highly available?
Horizontal scaling: run many gateway instances behind a load balancer. Multi-AZ deployment so a zone failure leaves the gateway up. Multi-region with anycast or GeoDNS for region failure. Cache config locally so a control-plane outage does not break the data plane. Stateless gateway processes so any instance can serve any request. Health checks at every layer. Circuit breakers so a backend failure does not propagate. Monitoring with paging on gateway error rate, p99 latency, and config-sync lag. Mention chaos testing - kill gateway nodes regularly so failover paths are exercised.
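The claim checks from the JWT answer above can be sketched as follows. This covers only the step after signature verification, which is assumed to have already succeeded; `checkClaims`, the 30-second leeway, and the `tenant` claim are hypothetical choices for illustration.

```javascript
// Validate registered claims of an already signature-verified JWT payload.
// Times are in seconds since the epoch, as in the exp and iat claims.
function checkClaims(claims, { issuer, audience, nowSeconds, leewaySeconds = 30 }) {
  if (claims.iss !== issuer) return { ok: false, reason: 'wrong issuer' };
  // aud may be a single string or an array of strings.
  const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  if (!audiences.includes(audience)) return { ok: false, reason: 'wrong audience' };
  if (typeof claims.exp !== 'number' || nowSeconds >= claims.exp + leewaySeconds) {
    return { ok: false, reason: 'expired' };
  }
  if (typeof claims.iat === 'number' && claims.iat > nowSeconds + leewaySeconds) {
    return { ok: false, reason: 'issued in the future' };
  }
  // User context the gateway would attach for the backend.
  return {
    ok: true,
    headers: { 'X-User-Id': String(claims.sub), 'X-Tenant': String(claims.tenant ?? '') },
  };
}
```

The leeway tolerates small clock differences between the issuer and the gateway; a backend receiving the `X-User-Id` header should trust it only on connections that come from the gateway.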
Interview Tips
How to discuss this topic effectively
State the gateway boundary explicitly: 'auth, rate limit, and TLS at the gateway; business logic and fine-grained authorization in the service'. That clarity is the senior-level answer.
Name the tool by its strength: 'NGINX for the edge reverse proxy, Kong or Envoy as the API gateway in front of microservices, Cloudflare Workers for edge compute'. Concrete tool names beat abstract patterns.
Mention the BFF pattern when the question involves multiple very different clients (web/mobile/partners). It is a clean way to differentiate clients without forking core services.
Always pair gateway with circuit breaker and rate limiter - all three are normally configured together. Forgetting circuit breaking is the rookie miss.
Watch for the 'fat gateway' trap. The moment your design has the gateway making business decisions, move that logic into a backend service or aggregator behind the gateway.
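Since circuit breaking comes up in these tips, here is a minimal sketch of the state machine behind it. The class, its thresholds, and the injected clock are illustrative assumptions; production breakers (in Envoy, for example) track more signals and limit how many trial requests run while half-open.

```javascript
class CircuitBreaker {
  constructor({ failureThreshold = 3, openMs = 10000, now = () => Date.now() } = {}) {
    this.failureThreshold = failureThreshold; // consecutive failures before tripping
    this.openMs = openMs;                     // cool-down before a trial request is allowed
    this.now = now;                           // injectable clock (milliseconds)
    this.failures = 0;
    this.openedAt = null;                     // null means the circuit is closed
  }

  // 'closed': traffic flows; 'open': reject immediately;
  // 'half-open': cool-down elapsed, trial requests are let through.
  state() {
    if (this.openedAt === null) return 'closed';
    return this.now() - this.openedAt >= this.openMs ? 'half-open' : 'open';
  }

  // Ask before forwarding a request to the backend.
  allowRequest() {
    return this.state() !== 'open';
  }

  // A successful call closes the circuit and clears the failure count.
  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  // A failure trips the breaker at the threshold, or re-opens it after a failed trial.
  recordFailure() {
    const wasHalfOpen = this.state() === 'half-open';
    this.failures += 1;
    if (wasHalfOpen || this.failures >= this.failureThreshold) this.openedAt = this.now();
  }
}
```

The gateway keeps one breaker per backend, calls `allowRequest` before forwarding, and reports the outcome with `recordSuccess` or `recordFailure`.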
Common Mistakes
Pitfalls to avoid in interviews
Putting business logic in the API gateway
The gateway should transform and validate requests, not understand them. Once it knows about domain entities (orders, products, users), every product change requires gateway changes and the gateway team becomes a release bottleneck. Keep domain logic in services; use an aggregator service if you need to fan out and merge.
Treating the gateway as another service to deploy with code changes
A gateway should be config-driven, not code-driven. Routes, plugins, rate limits live in declarative YAML or a control-plane database. Application teams change routes by submitting config PRs, not by deploying gateway code.
Skipping TLS termination at the edge
Backends should not terminate public TLS themselves: doing so means each one pays the handshake cost, manages public certificates, and becomes part of the public attack surface. Terminate TLS at the reverse proxy or gateway and use plain HTTP (or mTLS) inside the cluster.
Forgetting the gateway is a single point of failure
If the gateway goes down, the entire API is down. Run it horizontally scaled behind a load balancer, deploy to multiple AZs, and add an outer DNS or anycast layer for region-level failover. Cache critical config locally so a control-plane outage does not take down the data plane.
Putting fine-grained authorization in the gateway
The gateway can check coarse claims (scope, role) but cannot answer 'does user 42 own document 99?' without loading domain data. Fine-grained authorization belongs in the service that owns the resource. Centralizing it in the gateway leads to either a performance disaster or business logic bleeding upward.
