System Design Article

Reverse Proxy & API Gateway

Difficulty: Medium

A reverse proxy sits at the edge of your infrastructure and terminates client connections so backends never see them directly. An API gateway is a reverse proxy with opinions: authentication, rate limiting, request transformation, and per-route policies. This lesson covers what each does, when one is enough and when you need the other, the canonical features (TLS termination, response caching, request shaping, JWT validation, circuit breaking), and the tools that implement them (NGINX, Envoy, Kong, AWS API Gateway, Apigee). By the end you can place either in a real architecture and articulate the boundary between them in an interview.

Forward Proxy vs Reverse Proxy

The word 'proxy' alone is ambiguous. The distinction is which side it represents.

Forward proxy

Sits in front of clients. Clients send requests through the proxy, which forwards them to the internet. Examples: corporate web filters, residential VPNs, a Squid cache for a school network.

Reverse proxy

Sits in front of servers. Clients connect to the proxy thinking it is the server. The proxy forwards the request to the actual backend. The client never sees the backend's IP.

Text

---------- Forward vs reverse proxy ----------
  Forward proxy (per-client):                                                  
    [ user 1 ] -> [ corp proxy ] -> [ open internet ] -> [ external sites ]    
    [ user 2 ] -> [ corp proxy ]                                               
                                                                               
  Reverse proxy (per-server-fleet):                                            
    [ user ] -> [ reverse proxy ] -> [ b1 ] [ b2 ] [ b3 ]

The rest of this lesson is about reverse proxies.

What a Reverse Proxy Does

A reverse proxy is the right place for any concern that is the same across every backend.

1. TLS termination

The proxy holds the public TLS certificate and the client negotiates HTTPS with it; the connection from proxy to backend is plain HTTP, or mTLS inside the cluster when internal traffic must also be encrypted. Backends do not need publicly trusted certificates, do not pay a handshake cost for every client connection, and are not exposed publicly.

Text

---------- TLS termination at the edge ----------
  client - HTTPS (TLS 1.3) --> [ reverse proxy ] - HTTP --> [ backend ]

2. Compression

The proxy gzips or brotlis responses before sending them to the client. Backends emit raw responses; the proxy compresses once. Saves backend CPU; centralizes the compression policy.

3. Response caching

For cacheable responses (per Cache-Control headers), the proxy stores them and serves subsequent requests from cache without touching the backend. NGINX, Varnish, and Cloudflare all do this.
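As a rough illustration of how a caching proxy might decide whether and for how long to store a response, the sketch below reads a Cache-Control header. The function name `cacheTtlSeconds` is hypothetical, and only a handful of directives are handled; real caches such as NGINX or Varnish apply many more rules.

```javascript
// Hypothetical helper: how long may a shared cache store this response?
// Returns 0 when this sketch decides the response should not be cached.
function cacheTtlSeconds(cacheControl) {
    if (!cacheControl) return 0; // no explicit policy: this sketch does not cache
    const directives = new Map();
    for (const part of cacheControl.split(',')) {
        const [name, value] = part.trim().toLowerCase().split('=');
        if (name) directives.set(name, value);
    }
    // no-store and private forbid shared caching. no-cache allows storing but
    // requires revalidation; this sketch simply skips caching it.
    if (directives.has('no-store') || directives.has('private') || directives.has('no-cache')) {
        return 0;
    }
    // s-maxage targets shared caches and takes precedence over max-age.
    const raw = directives.get('s-maxage') ?? directives.get('max-age');
    const ttl = Number.parseInt(raw, 10);
    return Number.isFinite(ttl) && ttl > 0 ? ttl : 0;
}
```

A proxy built this way would serve from cache only while the stored entry is younger than the returned TTL, and go to the backend otherwise.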

4. Request buffering

A slow client sending a 10 MB body byte-by-byte would tie up a backend worker for minutes. The proxy buffers the entire request in memory or on disk and only forwards it to the backend once complete. Same in reverse for slow clients reading responses.

5. IP allowlists / blocklists, geo-blocking

The proxy is the natural place to drop traffic from unwanted IP ranges, abusive ASNs, or sanctioned countries.

6. Header manipulation

Add X-Forwarded-For, X-Request-ID, security headers (Strict-Transport-Security, Content-Security-Policy); strip server-internal headers from responses.
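The header handling described in item 6 can be sketched as two small functions, one per direction. This is an illustration only: the function names and the list of "internal" headers are assumptions, and header names are assumed to be lowercased already (as Node's HTTP parser does).

```javascript
// Assumed examples of backend headers that should not leak to clients.
const INTERNAL_RESPONSE_HEADERS = ['server', 'x-powered-by'];

// Build the headers the proxy sends to the backend.
function buildUpstreamHeaders(incoming, clientIp, newRequestId) {
    const headers = { ...incoming };
    // Append the client address to any chain reported by earlier proxies.
    // Only trust an incoming chain when the previous hop is itself trusted.
    const prior = headers['x-forwarded-for'];
    headers['x-forwarded-for'] = prior ? `${prior}, ${clientIp}` : clientIp;
    // Reuse an existing request ID so one ID follows the request end to end.
    headers['x-request-id'] = headers['x-request-id'] || newRequestId;
    return headers;
}

// Clean up the backend's response before it goes back to the client.
function buildClientHeaders(backendHeaders) {
    const headers = { ...backendHeaders };
    for (const name of INTERNAL_RESPONSE_HEADERS) delete headers[name];
    headers['strict-transport-security'] = 'max-age=31536000; includeSubDomains';
    return headers;
}
```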

NGINX example: reverse proxy with TLS termination and gzip

Text

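---------- NGINX reverse proxy ----------
# Backend pool referenced by proxy_pass below.
# The addresses are placeholders; substitute your own backends.
upstream backend_pool {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    # On NGINX 1.25.1 and later, prefer "listen 443 ssl;" plus "http2 on;".
    listen 443 ssl http2;
    server_name api.example.com;
    ssl_certificate     /etc/ssl/api.example.com.crt;
    ssl_certificate_key /etc/ssl/api.example.com.key;

    gzip on;
    # text/html is always compressed, so only extra types are listed.
    gzip_types application/json;

    location / {
        proxy_pass http://backend_pool;
        # Overwrites any client-supplied value, which suits an outermost proxy.
        proxy_set_header X-Forwarded-For   $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Host              $host;
    }
}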

That is a typical starting point for a reverse-proxy config: TLS at the edge, compression, and forwarding headers so backends still know the original client and scheme.

API Gateway: Reverse Proxy with Opinions

An API gateway is a reverse proxy with first-class support for the cross-cutting concerns specific to APIs. Think of it as 'reverse proxy + auth + per-route policy + transformation'.

Capabilities a gateway adds

Capability	What it does	Why centralize it
Authentication	Validate JWT, API keys, OAuth tokens before the request reaches the backend	Every backend would otherwise need the verification code; one bug = N backends vulnerable
Authorization	Check scopes/roles per route	Same as above; consistent enforcement
Rate limiting	Per-key, per-IP, per-route quotas	Backends do not need their own quota state; gateway is the natural choke point
Request transformation	Rewrite paths, add/strip headers, convert REST to gRPC	Lets you change backend interfaces without breaking clients
Response aggregation	Fan out one client request to N backends, merge responses	The backend-for-frontend (BFF) pattern; lets the client make one call instead of three
Versioning	Route `/v1` to the legacy fleet, `/v2` to the new one	Gradual migration with no client changes
Service discovery	Look up backend addresses dynamically (Consul, Kubernetes services)	Backends can scale and move without DNS or config changes
Circuit breaking	Stop sending requests to a backend that is failing	Prevents one slow backend from cascading into total outage
Observability	Centralized request logs, traces, metrics	Single source of truth across all services
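Of the capabilities in the table, circuit breaking is the one most easily shown in a few lines. The sketch below is a minimal breaker with made-up thresholds; the class and option names are assumptions, and it simplifies the half-open state by letting any request through once the cool-down has elapsed.

```javascript
// Minimal circuit breaker sketch. Thresholds and names are assumptions.
class CircuitBreaker {
    constructor({ failureThreshold = 5, resetTimeoutMs = 10000, now = Date.now } = {}) {
        this.failureThreshold = failureThreshold;
        this.resetTimeoutMs = resetTimeoutMs;
        this.now = now;        // injectable clock, handy for tests
        this.failures = 0;     // consecutive failures seen so far
        this.openedAt = null;  // null means the circuit is closed
    }

    // Should the gateway forward a request to this backend right now?
    allowRequest() {
        if (this.openedAt === null) return true;
        // After the cool-down, let trial requests through (half-open).
        return this.now() - this.openedAt >= this.resetTimeoutMs;
    }

    // A successful call closes the circuit and clears the failure count.
    recordSuccess() {
        this.failures = 0;
        this.openedAt = null;
    }

    // Enough consecutive failures open (or re-open) the circuit.
    recordFailure() {
        this.failures += 1;
        if (this.failures >= this.failureThreshold) this.openedAt = this.now();
    }
}
```

A gateway would keep one breaker per backend, call `allowRequest()` before forwarding, and answer with a fast 503 while the circuit is open instead of queueing requests behind a failing service.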

Architecture diagram

Text

---------- API gateway in front of microservices ----------
  client - HTTPS --> [ API gateway ]
                          |  - TLS termination
                          |  - JWT validation
                          |  - rate limiting
                          |  - request rewriting
                          v
     +---------+---------+---------+----------+
     v         v         v         v          v
  [ users ] [ orders ] [ catalog ] [ search ] [ payments ]

Each service is small, language-agnostic, and free of auth/rate-limit boilerplate. The gateway is the single declarative description of every public API.
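To make the "JWT validation" step in the diagram concrete, here is a small sketch using Node's built-in crypto module. It assumes HS256 (a shared secret) purely to stay self-contained; gateways in production more often verify RS256 or ES256 signatures against cached public keys. The function names are hypothetical.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// Sign a payload as an HS256 JWT. Only used here to produce a token to verify.
function signJwt(payload, secret) {
    const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');
    const signingInput = `${encode({ alg: 'HS256', typ: 'JWT' })}.${encode(payload)}`;
    const signature = createHmac('sha256', secret).update(signingInput).digest('base64url');
    return `${signingInput}.${signature}`;
}

// Verify signature and expiry; return the claims, or null if the token is rejected.
function verifyJwt(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
    const parts = String(token).split('.');
    if (parts.length !== 3) return null;
    const [header, payload, signature] = parts;
    let claims;
    try {
        // Pin the algorithm: never let the token choose how it is verified.
        if (JSON.parse(Buffer.from(header, 'base64url').toString()).alg !== 'HS256') return null;
        claims = JSON.parse(Buffer.from(payload, 'base64url').toString());
    } catch {
        return null; // malformed JSON in the header or payload
    }
    const expected = createHmac('sha256', secret).update(`${header}.${payload}`).digest();
    const actual = Buffer.from(signature, 'base64url');
    // timingSafeEqual requires equal lengths, so check that first.
    if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;
    // Require an expiry claim and reject tokens that have reached it.
    if (!claims || typeof claims.exp !== 'number' || claims.exp <= nowSeconds) return null;
    return claims;
}
```

On success the gateway would typically copy selected claims (for example the subject) into request headers before forwarding, so services never parse the token themselves.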

Per-route policy example (Kong style YAML)

Yaml

routes:
    - name: list-orders
      paths: ['/v1/orders']
      methods: ['GET']
      service: orders
      plugins:
          - name: jwt
            config:
                key_claim_name: kid
          - name: rate-limiting
            config:
                minute: 60
                policy: redis
          - name: proxy-cache
            config:
                strategy: memory
                cache_ttl: 30

Readable and reviewable; no application code changes needed to add or change a policy.
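Conceptually, the gateway turns that configuration into a route lookup followed by a chain of policies. The sketch below illustrates the idea only; it is not how Kong is implemented, and the policy functions and request shape are hypothetical.

```javascript
// Each policy inspects the request and returns a rejection, or null to continue.
const requireBearerToken = (req) =>
    req.headers.authorization?.startsWith('Bearer ') ? null : { status: 401 };
const maxBodyBytes = (limit) => (req) =>
    (req.bodyBytes ?? 0) > limit ? { status: 413 } : null;

// Hypothetical route table: plain data that can be reviewed like config.
const routes = [
    { name: 'list-orders', method: 'GET', path: '/v1/orders', service: 'orders',
      policies: [requireBearerToken] },
    { name: 'create-order', method: 'POST', path: '/v1/orders', service: 'orders',
      policies: [requireBearerToken, maxBodyBytes(64 * 1024)] },
];

// Find the route, run its policies in order, and report where to forward.
function handle(req) {
    const route = routes.find((r) => r.method === req.method && r.path === req.path);
    if (!route) return { status: 404 };
    for (const policy of route.policies) {
        const rejection = policy(req);
        if (rejection) return rejection; // first failing policy short-circuits
    }
    return { forwardTo: route.service, route: route.name };
}
```

Adding or tightening a policy means editing the route table, not the services behind it.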

Backend-for-Frontend (BFF)

A BFF is an API gateway tailored to a specific client (web, mobile, partners). The same backends serve all clients, but each gateway transforms responses and aggregates requests differently.

Text

---------- BFF pattern ----------
  [ web app ] -> [ web BFF ]    -> orders, users, catalog
  [ iOS app ] -> [ mobile BFF ] -> orders, users, catalog (smaller payloads, image variants)
  [ partner ] -> [ partner BFF ] -> orders only (rate-limited to 100/min)

Not every system needs BFFs - they add operational surface - but for products with multiple very different clients (Spotify, LinkedIn), it is a clean way to keep core services general while each client gets a tailored interface.
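A toy version of the pattern makes the difference visible: both views below read from the same backends but return differently shaped payloads. The backends are synchronous stubs with invented data; a real BFF would call services over the network, usually concurrently.

```javascript
// Stub backends standing in for the orders and users services (hypothetical data).
const backends = {
    getOrder: (id) => ({
        id, userId: 'u1', total: 4200, currency: 'USD',
        items: [{ sku: 'A1', qty: 2, imageUrl: 'https://img.example.com/a1.png' }],
    }),
    getUser: (id) => ({ id, name: 'Ada', email: 'ada@example.com' }),
};

// Web BFF: one call returns the full order plus the buyer's profile.
function webOrderView(orderId) {
    const order = backends.getOrder(orderId);
    return { ...order, buyer: backends.getUser(order.userId) };
}

// Mobile BFF: same backends, trimmed payload and a smaller image variant.
function mobileOrderView(orderId) {
    const order = backends.getOrder(orderId);
    return {
        id: order.id,
        total: order.total,
        itemCount: order.items.reduce((n, item) => n + item.qty, 0),
        thumbnails: order.items.map((item) => `${item.imageUrl}?w=96`),
    };
}
```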

Edge vs Service: Where Logic Belongs

The most important judgment call in any gateway design is the boundary: what lives at the edge, what stays in the service?

At the edge (gateway)

Authentication (JWT validation, mTLS)
Coarse authorization (scope check)
Rate limiting per API key
Request shaping (path rewrite, header rewrite)
TLS, compression, response caching
Cross-cutting observability

These concerns are the same regardless of which service handles the request. Centralizing them eliminates duplication.

In the service

Business logic
Fine-grained authorization ('can user 42 edit document 99?')
Domain-specific validation
Workflow orchestration (when complex enough)
Per-tenant data partitioning

These concerns require domain knowledge; the gateway should not know what a document is.
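The split between coarse and fine-grained authorization can be sketched as two checks that need very different inputs. Both functions and the in-memory document store are hypothetical; the point is that the first needs only token claims while the second needs domain data.

```javascript
// Gateway side: coarse check using only claims carried in the token.
function gatewayAllows(claims, requiredScope) {
    const scopes = (claims.scope ?? '').split(' ');
    return scopes.includes(requiredScope);
}

// Service side: fine-grained check that needs domain data (who owns the document).
const documents = new Map([[99, { ownerId: 42, editors: [7] }]]); // hypothetical store

function serviceAllowsEdit(userId, documentId) {
    const doc = documents.get(documentId);
    if (!doc) return false; // unknown document: deny
    return doc.ownerId === userId || doc.editors.includes(userId);
}
```

A request has to pass both: the gateway rejects tokens without the right scope before any backend work happens, and the service still decides whether this particular user may touch this particular document.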

The fat-gateway anti-pattern

When the gateway starts knowing about business rules ('the orders gateway aggregates inventory + pricing + cart + recommendations and applies a coupon'), you have re-created a monolith with a different name. Now every product change requires gateway changes; the gateway team becomes a release bottleneck; backend services lose autonomy.

The rule: the gateway transforms requests; it does not understand them. If you find yourself adding domain logic to the gateway, that logic belongs in a backend service or a dedicated 'aggregator' service that itself sits behind the gateway.

Tool Selection

Tool	Best for	Notable features
NGINX	Edge reverse proxy, TLS termination, simple routing	Battle-tested, minimal memory footprint, declarative config
HAProxy	Layer 4/7 LB with deep tuning	Advanced ACLs, observability via stats endpoint
Envoy	Service mesh data plane, API gateway	Dynamic config via xDS, gRPC-native, rich observability
Kong	Open-source API gateway	Plugin ecosystem, YAML-driven, runs on top of NGINX
Traefik	Cloud-native gateway	Auto-discovery from Kubernetes, Docker, Consul
AWS API Gateway	Managed gateway for AWS services	Pay-per-request, integrates with Lambda, IAM auth
Apigee (Google)	Enterprise API platform	API products, developer portal, monetization
Cloudflare	Edge gateway with CDN/DDoS/WAF	Anycast network, no infra to run, programmable workers

Default recommendations:

Single small service: NGINX is enough.
Microservices on Kubernetes: Envoy (with Istio as the mesh control plane) or Traefik.
Public API for paying customers: Kong (self-hosted) or AWS API Gateway / Apigee (managed).
DDoS / WAF / global cache: put Cloudflare in front of any of the above.

Routing in Action: A Request's Journey

Follow a real request from browser to database through a typical gateway-fronted system.

Text

---------- Request lifecycle ----------
  T0   POST https://api.example.com/v1/orders  (with JWT, body 5 KB)

  T1   Cloudflare edge (anycast routes the client to the nearest POP):
         - terminates the client's TLS connection
         - WAF rule check
         - forwards to the origin over a new TLS connection

  T2   Origin reverse proxy (NGINX):
         - terminates the TLS connection from Cloudflare
         - logs the request
         - forwards to the API gateway

  T3   API gateway (Kong):                                                         
         - validates JWT (cached public key)                                       
         - checks rate limit: 100/min per user-id                                  
         - looks up route '/v1/orders' POST -> orders service                      
         - adds X-Request-ID, X-User-Id headers                                    
         - passes to orders service                                                

  T4   Orders service:                                                             
         - validates body against schema                                           
         - fine-grained auth: user can place order                                 
         - writes order to Postgres                                                
         - emits OrderPlaced event to Kafka                                        
         - returns 201 Created                                                     

  T5   Response travels back the same chain;                                       
       gateway logs duration; proxy compresses; Cloudflare caches if cacheable.

Each layer has one job and is configured independently. When a layer fails, the layer in front of it can retry, provided the request is idempotent or carries an idempotency key (as a POST like this one should).

Calling a Gateway-Protected API from a Client

JavaScript

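// Place an order through the gateway, retrying on 429 with a bounded budget.
// The idempotency key is generated once and reused on every attempt so a
// retried POST cannot create a duplicate order.
async function placeOrder(token, order, maxAttempts = 3) {
    const idempotencyKey = crypto.randomUUID();
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        const res = await fetch('https://api.example.com/v1/orders', {
            method: 'POST',
            headers: {
                'Authorization': `Bearer ${token}`,
                'Content-Type': 'application/json',
                'X-Idempotency-Key': idempotencyKey,
            },
            body: JSON.stringify(order),
        });
        if (res.status === 429 && attempt < maxAttempts) {
            // Retry-After may be seconds or an HTTP date; fall back to 1 s
            // when it is missing or not a plain positive number.
            const seconds = Number(res.headers.get('Retry-After'));
            const delayMs = Number.isFinite(seconds) && seconds > 0 ? seconds * 1000 : 1000;
            await new Promise((r) => setTimeout(r, delayMs));
            continue;
        }
        // Any other failure, or a 429 on the last attempt, surfaces as an error.
        if (!res.ok) throw new Error(`HTTP ${res.status}`);
        return await res.json();
    }
}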

Notice the rate-limit handling: the gateway returns 429 with a Retry-After header, and the client respects it. This pattern is enforced at the edge precisely so every backend does not need to implement it.
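For the gateway's side of that exchange, a fixed-window counter is about the simplest limiter that can produce a 429 with a meaningful Retry-After. The sketch below keeps state in process memory and uses hypothetical names; a multi-node gateway would need shared state (for example in Redis), and token-bucket or sliding-window algorithms smooth out bursts better.

```javascript
// Fixed-window limiter sketch: at most `limit` requests per key per window.
class FixedWindowLimiter {
    constructor(limit, windowMs, now = Date.now) {
        this.limit = limit;
        this.windowMs = windowMs;
        this.now = now;            // injectable clock, handy for tests
        this.windows = new Map();  // key -> { start, count }
    }

    // Returns what the gateway should do with this request.
    check(key) {
        const t = this.now();
        let w = this.windows.get(key);
        // Start a fresh window on first sight of the key or once the old one ends.
        if (!w || t - w.start >= this.windowMs) {
            w = { start: t, count: 0 };
            this.windows.set(key, w);
        }
        if (w.count < this.limit) {
            w.count += 1;
            return { allowed: true };
        }
        // Tell the client how many whole seconds remain in the current window.
        const retryAfter = Math.ceil((w.start + this.windowMs - t) / 1000);
        return { allowed: false, status: 429, headers: { 'Retry-After': String(retryAfter) } };
    }
}
```

Keyed by API key or user ID, a limiter like this rejects the excess request before any backend sees it.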

Real-World Examples

How real systems implement this in production

Netflix Zuul / Spring Cloud Gateway

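Netflix built Zuul as a Java-based API gateway in front of thousands of microservices. It handles auth, dynamic routing, request shaping, and traffic shifting (canary deploys). The original Zuul used a blocking servlet model; Zuul 2 moved to a non-blocking Netty core. In the Spring ecosystem, Spring Cloud Gateway (also Netty-based) has taken over the role the Zuul 1 integration used to play.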

Trade-off: At thousands of services, a programmable gateway is essential for canary deploys, A/B tests, and emergency traffic shifting.

Stripe API gateway

Stripe runs a multi-region API gateway that handles authentication, rate limiting, request validation, and idempotency-key deduplication before routing to backend services. It publishes per-key rate limits (e.g., 100 reads/sec per API key) and enforces them at the gateway. Because validation also happens there, backend services can assume requests are authenticated, within quota, and well-formed.

Trade-off: Gateway-enforced contracts (rate limits, idempotency) are how you make a public API robust.
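Idempotency-key deduplication, in its simplest form, is a lookup before the handler runs. The sketch below is an illustration of the idea and not a description of Stripe's implementation: the store is an in-memory map, the names are hypothetical, and it ignores expiry and two concurrent requests arriving with the same key.

```javascript
// In-memory idempotency store; a real deployment needs shared, expiring storage.
const seen = new Map(); // idempotency key -> stored response

// Run `handler` at most once per key; replays return the stored response.
function withIdempotency(key, handler) {
    if (!key) return handler(); // no key supplied: no deduplication
    if (seen.has(key)) return { ...seen.get(key), replayed: true };
    const response = handler();
    seen.set(key, response);
    return response;
}
```

A client that retries a timed-out POST with the same key gets the original result back instead of creating a second order.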

Kubernetes Ingress + Istio

A Kubernetes cluster often has an Ingress controller (NGINX or Traefik) handling external, north-south traffic and, where a service mesh is deployed, Istio handling east-west service-to-service traffic. In Istio's sidecar mode each Pod runs an Envoy sidecar, and the Istio control plane pushes routing and policy to it. The Ingress is the gateway; the sidecars are micro-gateways.

Trade-off: In modern cloud-native architectures, gateway functionality is distributed across an edge tier and a sidecar tier.

Cloudflare Workers as gateway logic

Cloudflare lets you write JavaScript or Rust code that runs at every edge POP and processes every request before it reaches the origin. A Worker can validate JWTs, rewrite URLs, do A/B routing, or short-circuit rate-limited requests - all without a separate gateway tier.

Trade-off: Edge compute is collapsing the gateway and the CDN into one programmable layer; for many use cases, you no longer need a dedicated gateway tier inside the datacenter.

Quick Interview Phrases

Key terms to use in your answer

TLS termination at the edge

API gateway pattern

backend-for-frontend (BFF)

rate limiting at the gateway

request transformation

circuit breaking

Common Interview Questions

Questions you might be asked about this topic

Explain the difference between a reverse proxy and an API gateway.

A reverse proxy is a generic, policy-light middleman in front of your servers: TLS termination, compression, response caching, request buffering, basic routing. An API gateway adds API-aware policies: authentication (JWT, API keys), authorization, per-route rate limiting, request/response transformation, service discovery, circuit breaking. Every API gateway is a reverse proxy plus opinions; not every reverse proxy is a gateway. Reverse proxy: NGINX, HAProxy. Gateway: Kong, AWS API Gateway, Apigee, Envoy with xDS.

Design the edge architecture for a public REST API serving 50K req/sec across three regions.

How does an API gateway handle the JWT validation flow without becoming a bottleneck?

When would you use a backend-for-frontend (BFF) pattern?

How do you keep a gateway from becoming a single point of failure?

Interview Tips

How to discuss this topic effectively

State the gateway boundary explicitly: 'auth, rate limit, and TLS at the gateway; business logic and fine-grained authorization in the service'. That clarity is the senior-level answer.

Name the tool by its strength: 'NGINX for the edge reverse proxy, Kong or Envoy as the API gateway in front of microservices, Cloudflare Workers for edge compute'. Concrete tool names beat abstract patterns.

Mention the BFF pattern when the question involves multiple very different clients (web/mobile/partners). It is a clean way to differentiate clients without forking core services.

Always pair gateway with circuit breaker and rate limiter - all three are normally configured together. Forgetting circuit breaking is the rookie miss.

Watch for the 'fat gateway' trap. The moment your design has the gateway making business decisions, move that logic into a backend service or aggregator behind the gateway.

Common Mistakes

Pitfalls to avoid in interviews

Putting business logic in the API gateway

The gateway should transform and validate requests, not understand them. Once it knows about domain entities (orders, products, users), every product change requires gateway changes and the gateway team becomes a release bottleneck. Keep domain logic in services; use an aggregator service if you need to fan out and merge.

Treating the gateway as another service to deploy with code changes

A gateway should be config-driven, not code-driven. Routes, plugins, rate limits live in declarative YAML or a control-plane database. Application teams change routes by submitting config PRs, not by deploying gateway code.

Skipping TLS termination at the edge

Backends should not terminate public TLS themselves. If they do, each one pays the handshake cost, manages its own certificates, and becomes part of the public attack surface. Terminate TLS at the reverse proxy or gateway and use plain HTTP (or mTLS) inside the cluster.

Forgetting the gateway is a single point of failure

If the gateway goes down, the entire API is down. Run it horizontally scaled behind a load balancer, deploy to multiple AZs, and add an outer DNS or anycast layer for region-level failover. Cache critical config locally so a control-plane outage does not take down the data plane.

Putting fine-grained authorization in the gateway

The gateway can check coarse claims (scope, role) but cannot answer 'does user 42 own document 99?' without loading domain data. Fine-grained authorization belongs in the service that owns the resource. Centralizing it in the gateway leads to either a performance disaster or business logic bleeding upward.
