Community Article

CDN 101: Edge Caches, Origin Shields, and Cache Keys

The cache key matters more than the TTL. Origin shield is a cheap config win. Most CDN incidents are key bugs, not capacity bugs.

cdn

caching

origin-shield

http

system-design

By @nadiaali

December 10, 2025

Updated May 18, 2026

The CDN bug that taught me to take cache keys seriously: a marketing team's logged-in user dashboard was being cached and served to other logged-in users. Different users were seeing each other's profile data on the home page for the first three seconds after navigation, then JavaScript would fetch the right data and overwrite the screen. Three seconds of data leakage between accounts is enough to be a security incident. The root cause was a single line of CDN configuration: the cache key did not include the session cookie. From the CDN's perspective, every request to /dashboard was the same request, so the response from the first user got served to the next thousand.

We fixed it in five minutes (add the session cookie to the cache key, set Cache-Control: private on logged-in pages, deploy). The conversation that came out of it took five weeks: how does a CDN actually work, what is in a cache key, what is an origin shield, and why are the defaults dangerous? This article is the version of that conversation I would write for the next team.

My stance: the cache key is the most consequential CDN configuration value, more consequential than TTL or geographic distribution. The TTL controls how long a wrong answer persists; the cache key controls whether the answer is right in the first place. Most CDN incidents I have seen trace back to a cache key that did not include enough request properties to disambiguate users.

What a CDN actually does

A CDN is a network of cache servers distributed close to users. When a user requests an asset (an image, a JavaScript bundle, an HTML page), the request hits the nearest CDN edge node. If the node has a fresh copy in its cache, it serves that copy directly without contacting the origin (your application servers). If not, it fetches from the origin, stores the response, and serves it. Subsequent requests for the same asset within the TTL get served from cache.

The wins are real and well-known: lower latency for users (the cache is geographically close), lower load on origin servers, lower bandwidth costs. The hidden cost is the cache itself: every cached response is correct only if the cache key uniquely identifies the response. A bad cache key means user A's response gets served to user B.

The anatomy of a cache key

A cache key is the string the CDN uses to look up a cached response. By default it is something like host + path + query string. So https://example.com/dashboard?id=42 and https://example.com/dashboard?id=43 get different cache entries because the query strings differ.

What is missing from the default key:

Things commonly NOT in the default cache key
  - request headers (Cookie, Authorization, User-Agent, Accept-Language, etc.)
  - request method (GET vs HEAD)
  - some query parameters (depends on configuration)
  - device class (mobile vs desktop variants)

If your response varies by any of these and they are not in the cache key, the CDN will serve the wrong response. The dashboard incident was exactly this: the response varied by session cookie, but the cookie was not in the cache key.
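To make the collision concrete, here is a minimal sketch of a key builder. The function name cache_key and the vary_on parameter are my own illustration, not any CDN's real API; real CDNs expose the same idea through cache policies or rules.

```python
from typing import Mapping, Sequence


def cache_key(host: str, path: str, query: str,
              headers: Mapping[str, str],
              vary_on: Sequence[str] = ()) -> str:
    """Build a lookup key from the URL plus any explicitly listed request headers."""
    parts = [host.lower(), path, query]
    for name in vary_on:
        # Header names are case-insensitive; a missing header contributes an empty value.
        value = next((v for k, v in headers.items() if k.lower() == name.lower()), "")
        parts.append(f"{name.lower()}={value}")
    return "|".join(parts)


# Two hypothetical logged-in users requesting the same URL.
alice = {"Cookie": "session=alice-token"}
bob = {"Cookie": "session=bob-token"}

# Default-style key (host + path + query): both users collide on one entry.
default_alice = cache_key("example.com", "/dashboard", "", alice)
default_bob = cache_key("example.com", "/dashboard", "", bob)

# Key that also includes the Cookie header: each user gets a separate entry.
keyed_alice = cache_key("example.com", "/dashboard", "", alice, vary_on=["Cookie"])
keyed_bob = cache_key("example.com", "/dashboard", "", bob, vary_on=["Cookie"])
```

With the default-style key the two users map to the same entry, which is the dashboard incident in miniature. Keying on the whole Cookie header is the bluntest possible fix; in practice you would key on just the session cookie or bypass the cache.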

The fix is to either include the cookie in the key (CloudFront's cache policies, Cloudflare's cache rules, Fastly's VCL) or to bypass the cache for cookied requests entirely (Cache-Control: private tells shared caches such as the CDN not to store the response).

A subtler form of the cache-key problem is query parameter normalization. /products?id=42&color=red and /products?color=red&id=42 are the same request semantically, but a naive cache key treats them as different entries. Worse, /products?id=42&utm_source=newsletter and /products?id=42&utm_source=ads are the same response (the marketing tracking parameter does not change the content), but separate cache entries again. Both forms hurt hit rate. The fix is query-parameter normalization in the CDN config: sort the parameters alphabetically, drop irrelevant tracking parameters before computing the key, and treat empty values consistently. Most CDNs support this with a few lines of configuration. Without it, your hit rate can be 30% lower than it should be on URL-heavy traffic.
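The normalization steps can be sketched in a few lines of Python. This is an illustration of the logic, not CDN configuration; the deny-list of tracking parameters (utm_*, fbclid, gclid) is an assumption and should be replaced with whatever your origin actually ignores.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical deny-list; the right one depends on which parameters your origin ignores.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"fbclid", "gclid"}


def normalize_query(query: str) -> str:
    """Return a canonical query string suitable for use in a cache key."""
    # keep_blank_values=True makes "a" and "a=" parse to the same ("a", "") pair.
    pairs = parse_qsl(query, keep_blank_values=True)
    kept = [
        (name, value) for name, value in pairs
        if not name.startswith(TRACKING_PREFIXES) and name not in TRACKING_NAMES
    ]
    kept.sort()  # parameter order no longer affects the key
    return urlencode(kept)
```

Only the cache key uses the normalized string. Whether the origin still receives the original query string is a separate decision.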

The Vary header: a weak gesture toward correctness

HTTP has a built-in mechanism for this: the Vary response header. Vary: Accept-Language tells caches that the response varies by the Accept-Language request header, so the cache should treat the same URL with different Accept-Language values as different entries.

HTTP/1.1 200 OK
Content-Type: text/html
Vary: Accept-Language
Cache-Control: public, max-age=300

Most CDNs respect Vary, but with caveats:

Vary: * (vary by everything) is treated as "do not cache" by most CDNs.
Vary: Cookie is technically valid but explodes the cache key space (every distinct cookie value becomes a separate entry); most CDNs explicitly do not honor it.
CDNs often have their own configuration that overrides Vary (CloudFront's "forward headers to origin" setting, Cloudflare's cache rules).

I treat Vary as a hint, not as a contract. The CDN-side configuration is what actually controls the cache key. Vary is useful for downstream caches (browsers, intermediate proxies) but not load-bearing for the CDN itself.
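For intuition, this is roughly how a cache that does honor Vary could derive a per-variant key. It is a simplified sketch under my own naming (variant_key is hypothetical); real caches also normalize header values in ways this does not attempt.

```python
from typing import Mapping, Optional


def variant_key(url: str, vary: str, request_headers: Mapping[str, str]) -> Optional[str]:
    """Return the key for this variant, or None if no stored response may be reused."""
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    if "*" in names:
        return None  # Vary: * never matches a stored response
    # Header names are case-insensitive, so compare them lowercased.
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    parts = [url] + [f"{n}={lowered.get(n, '')}" for n in sorted(names)]
    return "|".join(parts)
```

Note how Vary: Cookie would put the full cookie string into the key, which is the key-space explosion described above.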

TTL choices and what they actually mean

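The TTL (max-age) is how long a cached response counts as fresh; after that it is stale. Once an entry is stale, the CDN either revalidates it against the origin with a conditional request (If-None-Match, If-Modified-Since) or, where the configuration and response headers allow it, serves the stale copy (some CDNs flag this in a cache-status response header).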

TTL bands and rough use cases
  Static assets with hashed filenames    1 year (immutable)
  HTML for marketing pages               5-15 minutes
  Logged-in user pages                   not cached (Cache-Control: private)
  API responses                          0-30 seconds, often not cached

Hashed filenames are the trick that makes long TTLs safe for static assets. A bundle named app.a3f2c1b9.js is content-addressed: changing the content changes the filename, so a long TTL on the old filename is harmless because nobody requests the old filename anymore. Build tools (webpack, vite, esbuild) support this, in some cases out of the box and in others with a line of output-naming configuration, and it is a major reason single-page apps can ship aggressive caching.
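A rough sketch of what content-addressed naming amounts to. The SHA-256 digest and the 8-character truncation are assumptions for illustration; each build tool picks its own hash function and length.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content digest before the extension of a bare filename."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return f"{path.stem}.{digest}{path.suffix}"


# Same bytes give the same name; any change to the bytes gives a new name.
v1 = hashed_name("app.js", b"console.log('v1');")
v2 = hashed_name("app.js", b"console.log('v2');")
```

Because v1 and v2 differ, the HTML that references the bundle is the only thing that needs a short TTL.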

For HTML, TTL is a trade-off between freshness and origin load. Five minutes is a good default for marketing pages; longer than that and editorial changes feel slow to propagate. Less than that and you lose most of the CDN benefit.

For API responses, my default is to not cache them at all. The exceptions are public endpoints (a public catalog API) where the response is the same for every user and can tolerate seconds-of-staleness. Anything user-specific should be Cache-Control: private (cache in the user's browser only, not in shared caches).

Origin shields: the cache layer behind the cache

A common CDN feature is the origin shield: a designated cache layer that sits between the edge nodes and your origin. Every cache miss from any edge goes through the shield. The shield caches the origin's response and serves it back to the edge that asked for it; subsequent misses from other edges hit the shield instead of the origin.

With origin shield
  user -> nearest edge node -> origin shield (also a cache) -> origin
  if shield has a fresh copy, no origin hit

Without origin shield
  user -> nearest edge node -> origin
  every miss from every edge hits origin

The shield's job is to absorb cache miss traffic. Without a shield, a cold object is fetched from origin once per edge node. With twenty edge nodes and one cold object, that is twenty origin hits. With a shield, it is one origin hit. For high-traffic sites, this is the difference between origin handling 10,000 RPS and 200 RPS during a cache flush.
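The twenty-versus-one arithmetic can be checked with a toy model. The classes below are hypothetical stand-ins for edge nodes, a shield, and an origin; they ignore TTLs, eviction, and concurrency.

```python
class Origin:
    """Counts how many times it is asked for content."""

    def __init__(self):
        self.hits = 0

    def fetch(self, url):
        self.hits += 1
        return f"body of {url}"


class Cache:
    """A cache tier that forwards misses to its upstream (another Cache or the Origin)."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.store = {}

    def fetch(self, url):
        if url not in self.store:
            self.store[url] = self.upstream.fetch(url)
        return self.store[url]


def origin_hits_for_cold_object(edge_count, use_shield):
    """Request one cold object once from every edge and report the origin hit count."""
    origin = Origin()
    upstream = Cache(origin) if use_shield else origin
    edges = [Cache(upstream) for _ in range(edge_count)]
    for edge in edges:
        edge.fetch("/logo.png")
    return origin.hits
```

With twenty edges this returns 20 without the shield and 1 with it, matching the numbers above.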

The trade-off is that the shield adds a hop for cache-miss requests, increasing miss latency by a few milliseconds. For mostly-cache-hit traffic, that latency is invisible (the hit path does not go through the shield). For miss-heavy traffic, the shield is paying for itself by reducing origin load.

Most large CDNs offer this as a configuration option (CloudFront's Origin Shield, Cloudflare's Tiered Cache, Fastly's shielding). Enabling it is a two-line config change with a real win for any site that has more than a handful of edge regions. I would enable it by default unless I had a specific reason not to.

Five ways CDN configs break

Five failure modes I have seen:

Cache key missing a relevant request property. The dashboard incident is the canonical case. Audit your cache rules: for every endpoint that returns user-specific data, the cache key must include something user-identifying or the endpoint must be marked uncacheable.
Cache key including an irrelevant request property. If the cache key includes the User-Agent header, every browser version gets a separate cache entry. Hit rate plummets, origin load rises. The fix is to normalize User-Agent into broader buckets (mobile vs desktop, by major version) or omit it from the key.
TTL too long for the freshness requirement. A marketing page with a one-day TTL feels slow when content is updated; users see stale data for up to a day. The fix is shorter TTL plus an explicit cache invalidation API call on content publish.
Cache invalidation that does not actually purge. Most CDNs offer a purge API; some are eventually consistent and take minutes to propagate. If you publish content and immediately tell the CDN to purge, the purge may not be effective for a few minutes. Plan for it; do not assume purges are instant.
Cookies leaking through the cache. Default cache configurations often forward all cookies to the origin but do not include them in the cache key. The cached response can then carry one user's Set-Cookie header, which gets served to other users. The fix is to keep cookie-setting responses out of the cache (Cache-Control: no-store for any response that sets cookies, or stripping Set-Cookie before the response is stored) or to bypass the cache for cookied requests.
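For the second failure mode in the list above, the bucketing can be as coarse as one substring check. This is a deliberately crude sketch: the "Mobi" heuristic is an assumption that works for common browsers, and device_class is a hypothetical helper rather than a CDN feature.

```python
def device_class(user_agent: str) -> str:
    """Collapse a raw User-Agent string into a small, cache-friendly bucket."""
    # Most mobile browsers include "Mobile" or "Mobi" somewhere in the string.
    return "mobile" if "mobi" in user_agent.lower() else "desktop"


iphone = ("Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 "
          "(KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1")
chrome_120 = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")
chrome_121 = chrome_120.replace("Chrome/120", "Chrome/121")
```

Keying on device_class(ua) instead of the raw header means the two Chrome versions share one cache entry instead of getting one each.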

One more failure mode worth calling out: the cache stampede. A popular asset's TTL expires, and a thousand concurrent requests all miss the cache simultaneously. Without any protection, the CDN forwards a thousand parallel fetches to the origin. The origin sees a sudden 1000x spike and may fall over. The standard mitigation is request coalescing (some CDNs call it "request collapsing" or "single connection"): when many concurrent requests arrive for the same uncached URL, only one is forwarded to the origin and the rest wait for the response. Most modern CDNs do this automatically; self-managed setups may need it switched on explicitly (nginx, for example, only serializes cache fills when proxy_cache_lock is enabled). Verify your CDN's behavior under stampede before you find out the hard way.
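Request coalescing is easier to trust once you have seen it in miniature. The sketch below uses asyncio to share one in-flight fetch among concurrent callers. CoalescingCache and simulate_stampede are illustrative names, there is no TTL or error caching, and a real CDN does this across processes and machines rather than inside one event loop.

```python
import asyncio


class CoalescingCache:
    """Cache that lets concurrent misses for one URL share a single origin fetch."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._store = {}
        self._in_flight = {}

    async def get(self, url):
        if url in self._store:
            return self._store[url]
        task = self._in_flight.get(url)
        if task is None:
            # First miss for this URL: start the one fetch everyone else will wait on.
            task = asyncio.create_task(self._fill(url))
            self._in_flight[url] = task
        return await task

    async def _fill(self, url):
        try:
            body = await self._fetch(url)
            self._store[url] = body
            return body
        finally:
            # Clear the marker even on failure so a later request can retry.
            self._in_flight.pop(url, None)


async def simulate_stampede(concurrent_requests):
    """Fire many simultaneous requests at a cold URL; return (origin calls, bodies)."""
    origin_calls = 0

    async def slow_origin(url):
        nonlocal origin_calls
        origin_calls += 1
        await asyncio.sleep(0.01)  # stand-in for origin latency
        return f"body of {url}"

    cache = CoalescingCache(slow_origin)
    bodies = await asyncio.gather(
        *(cache.get("/popular") for _ in range(concurrent_requests))
    )
    return origin_calls, bodies
```

Running asyncio.run(simulate_stampede(1000)) sends a thousand concurrent requests through the cache while the origin function is called once.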

Cache-Control directives that actually matter

The Cache-Control response header is how the origin tells the CDN (and browsers) how to cache. The directives I use most:

Cache-Control directives
  public                cacheable by shared caches (CDNs, proxies)
  private               cacheable only by the user's browser
  no-store              do not cache anywhere
  max-age=N             cache for N seconds
  s-maxage=N            cache for N seconds in shared caches (overrides max-age)
  must-revalidate       revalidate with origin when stale (do not serve stale)
  stale-while-revalidate=N  serve stale for N seconds while fetching fresh
  immutable             content will never change for this URL (long TTL safe)

stale-while-revalidate is underused and worth highlighting. It tells the CDN: "if the cached entry is stale, serve it anyway and revalidate in the background." The user gets a fast response (no waiting for revalidation); the cache gets refreshed for the next user. This is the mechanism behind the snappy feel of well-tuned content sites.
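Here is a small sketch of the decision a shared cache makes for a stored response, given its Cache-Control header and its age. The parser is intentionally minimal (it ignores quoted values and unknown extensions), and the four state names are mine, not standard terminology.

```python
def parse_cache_control(header: str) -> dict:
    """Parse a Cache-Control value into {directive: int value, or True if valueless}."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name] = int(value) if value.isdigit() else True
    return directives


def classify(header: str, age_seconds: int) -> str:
    """Decide what a shared cache may do with a stored response of the given age."""
    d = parse_cache_control(header)
    if "no-store" in d or "private" in d:
        return "bypass"  # shared caches must not store or reuse this response
    # s-maxage overrides max-age for shared caches such as a CDN.
    ttl = d.get("s-maxage", d.get("max-age", 0))
    if age_seconds < ttl:
        return "fresh"
    swr = d.get("stale-while-revalidate", 0)
    if "must-revalidate" not in d and age_seconds < ttl + swr:
        return "stale-serve-and-revalidate"
    return "revalidate"
```

For public, max-age=300, stale-while-revalidate=86400, a ten-minute-old entry lands in the stale-serve-and-revalidate state: the user gets the cached copy immediately and the refresh happens off the request path.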

What I would set up for a fresh site

If I were configuring a CDN for a new site today:

Static assets with hashed filenames: Cache-Control: public, max-age=31536000, immutable. One year, cached aggressively.
HTML for marketing pages: Cache-Control: public, max-age=300, stale-while-revalidate=86400. Five-minute fresh window, day-long stale window during which the CDN serves stale and revalidates in the background.
API responses (public): Cache-Control: public, max-age=30, stale-while-revalidate=600 if the data tolerates 30-second staleness; otherwise no-store.
API responses (private): Cache-Control: private, no-store. Do not let shared caches near user-specific data.
Origin shield: enabled.
Cache key: host + path + normalized-query, with explicit per-route overrides where headers or cookies matter.
Purge API: integrated into the content publishing pipeline.

That configuration takes about an hour to set up on most CDNs and prevents most of the incidents I described above.
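On the origin side, the header choices from that list could be centralized in one function. This is a sketch under assumptions: the /api/ prefix, the list of asset extensions, and the authenticated flag are placeholders for however your application identifies routes and sessions.

```python
import re

# Hypothetical pattern: an 8+ character hex digest between the name and the extension.
HASHED_ASSET = re.compile(r"\.[0-9a-f]{8,}\.(js|css|png|jpg|svg|woff2)$")


def cache_control_for(path: str, authenticated: bool) -> str:
    """Pick a Cache-Control value following the defaults listed above."""
    if HASHED_ASSET.search(path):
        # Content-addressed assets are identical for every user, so cache for a year.
        return "public, max-age=31536000, immutable"
    if authenticated:
        # Keep user-specific responses away from shared caches.
        return "private, no-store"
    if path.startswith("/api/"):
        return "public, max-age=30, stale-while-revalidate=600"
    return "public, max-age=300, stale-while-revalidate=86400"
```

Keeping the policy in one place makes it reviewable: a change that widens the cacheable surface shows up as a diff to this function rather than as scattered header tweaks.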

A position to defend

CDNs are sold as performance products and they deliver on that, but the configuration surface that determines correctness (cache keys, Cache-Control, purge semantics) is the part teams skip. You can run a CDN with default settings and have it work well for static content; you cannot run a CDN with default settings and have it work safely for dynamic content with cookies or session data. Any team adding a CDN to an authenticated app should treat the cache configuration with the same review rigor as a database migration: incorrect changes have user-visible blast radius and "we'll fix it later" is not a real plan when "later" is after a data leak. Start with private, no-store for everything authenticated and grow the cacheable surface deliberately, not the other way around.
