A team I joined had a Redis layer in front of Postgres that nobody on the team could draw on a whiteboard. They knew it was "the cache". They did not know whether writes went through it, around it, or behind it. The result was the kind of bug that takes a Saturday: a write succeeded against the database, the cache held the old value for three minutes, and a customer saw their account balance bounce between two numbers depending on which API call they happened to make.
The fix wasn't more Redis. It was a five-minute conversation about which caching strategy we were actually running, and a one-line code change to make the answer match the diagram. That conversation kept happening on every team I have joined since, so I wrote it down.
My stance up front
There are four strategies the books talk about: cache-aside, read-through, write-through, and write-behind. What the books do not stress hard enough is this: most teams should use cache-aside on the read side plus write-through on the write side and stop there. Write-behind is a real tool, but it has a sharp edge. Read-through is mostly an architectural rebrand of cache-aside that moves the call out of your service.
This article walks through each, shows the code, and ends with the cases where I have actually picked write-behind and the cases where I would refuse to.
Cache-aside: the one most code is already running
Cache-aside, sometimes called lazy loading, is the pattern where the application code, not the cache, owns the read miss. On a read, you ask the cache; on a miss, you load from the source of truth, write the value into the cache, and return it. On a write, you write to the database and either invalidate or update the cache.
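A minimal sketch of the read and write paths described above, assuming an in-memory `TTLCache` standing in for Redis and a plain dict standing in for Postgres. The names `TTLCache`, `read_user`, `write_user`, and the 120-second TTL are illustrative, not taken from any real codebase.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for Redis: entries expire after `ttl` seconds."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries behave like a miss
            return None
        return value

    def set(self, key: str, value: str, ttl: float) -> None:
        self._data[key] = (value, self._clock() + ttl)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


DB: Dict[str, str] = {"user:42": "alice"}  # stand-in for Postgres
CACHE = TTLCache()
TTL_SECONDS = 120.0  # assumed value inside the 60-300 second range


def read_user(key: str) -> Optional[str]:
    cached = CACHE.get(key)
    if cached is not None:
        return cached  # hit
    value = DB.get(key)  # miss: the application loads from the source of truth
    if value is not None:
        CACHE.set(key, value, TTL_SECONDS)
    return value


def write_user(key: str, value: str) -> None:
    DB[key] = value    # write the database first
    CACHE.delete(key)  # then invalidate, so the next read repopulates
```

Because the application owns both paths, a failing cache call could be caught and treated as a miss without changing the shape of `read_user`.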
What I like about it: the cache is a pure performance layer. If Redis is down, the application still works (slower) because every read falls through to the database. Cache-aside is forgiving in the way you want a cache to be forgiving.
What bites: the read path and write path are independent, so you have to think about race conditions yourself. The classic one: thread A reads from DB, thread B updates the row and invalidates the cache, thread A writes the now-stale value into the cache. The mitigation I use most is short TTLs (60-300 seconds) so any race resolves itself within a tolerable window.
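The interleaving can be replayed deterministically. The sketch below assumes dict stand-ins for the cache and the database and a hand-advanced fake clock; the two "threads" are just sequential steps in the order the race requires, and all names are hypothetical.

```python
from typing import Dict, Optional, Tuple

clock = {"now": 0.0}                          # fake clock, advanced by hand
db: Dict[str, str] = {"balance:7": "100"}     # stand-in for Postgres
cache: Dict[str, Tuple[str, float]] = {}      # key -> (value, expires_at)
TTL_SECONDS = 60.0


def cache_get(key: str) -> Optional[str]:
    entry = cache.get(key)
    if entry is None or clock["now"] >= entry[1]:
        cache.pop(key, None)  # missing or expired: treat as a miss
        return None
    return entry[0]


def cache_set(key: str, value: str) -> None:
    cache[key] = (value, clock["now"] + TTL_SECONDS)


# Thread A: cache miss, so it reads the row from the database.
assert cache_get("balance:7") is None
a_value = db["balance:7"]

# Thread B: updates the row and invalidates the (still empty) cache entry.
db["balance:7"] = "250"
cache.pop("balance:7", None)

# Thread A: resumes and caches the value it read earlier, which is now stale.
cache_set("balance:7", a_value)
stale_read = cache_get("balance:7")

# Once the TTL elapses, the stale entry expires and the next read is a miss.
clock["now"] += TTL_SECONDS
after_ttl = cache_get("balance:7")
```

The stale value is served for at most one TTL window, which is the "tolerable window" the short TTL buys.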
The other pattern that works for hot writes: write a tombstone (or version number) to the cache before reading from the DB, so a concurrent writer can detect the stale-write race. This adds complexity I usually do not need.
A subtle behavior worth knowing: the cache-aside-with-update variant. Instead of invalidating on write, you update the cache to the new value while updating the database. The next read is a hit, not a miss. This is faster on read but introduces a stale-write race that pure invalidation does not have: if two writes overlap and the slower one finishes its cache update last, the cache holds older data than the database. The fix is to include a row version (or updatedAt timestamp) in the cache value and only write to the cache if the new version is strictly newer. I default to invalidate-on-write rather than update-on-write because the race is cheaper to reason about, but for read-very-heavy hot keys where the post-write read storm matters, update-on-write is the right choice.
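The version check can be sketched as follows, assuming the cached value carries an integer row version. `Versioned` and `cache_put_if_newer` are hypothetical names, and the dict stands in for Redis. Against a real Redis the compare-and-write would have to be atomic (for example inside a Lua script or a WATCH/MULTI transaction), which this single-threaded sketch does not show.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Versioned:
    value: str
    version: int  # row version; an updatedAt timestamp would play the same role


cache: Dict[str, Versioned] = {}  # stand-in for Redis


def cache_put_if_newer(key: str, candidate: Versioned) -> bool:
    """Store `candidate` only if it is strictly newer than the cached entry."""
    current = cache.get(key)
    if current is not None and candidate.version <= current.version:
        return False  # a slower, older write arrived late: drop it
    cache[key] = candidate
    return True
```

With this guard, the slower of two overlapping writers cannot overwrite the newer value, regardless of which cache update lands last.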
Read-through: same idea, different owner
Read-through looks identical from the application's perspective: ask the cache, get a value back. The difference is who handles the miss. With read-through, the cache itself loads the value from the source of truth on miss; the application never talks to the database directly for cached reads.
This is what some ORM-level caches and some Java caching libraries (Caffeine with a CacheLoader, for example) give you out of the box. From a behavior standpoint it is cache-aside; what changes is which component holds the loader function, not the read semantics.
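To make the ownership difference concrete, here is a minimal sketch in which the cache object holds the loader. `ReadThroughCache` is a hypothetical class written for illustration, not the API of Caffeine or any ORM, and the dict stands in for the database.

```python
from typing import Callable, Dict, Optional


class ReadThroughCache:
    """Cache that owns the loader; callers never touch the database."""

    def __init__(self, loader: Callable[[str], Optional[str]]) -> None:
        self._loader = loader
        self._data: Dict[str, str] = {}
        self.loads = 0  # counts loader calls, for illustration only

    def get(self, key: str) -> Optional[str]:
        if key in self._data:
            return self._data[key]  # hit
        self.loads += 1
        value = self._loader(key)   # the cache, not the caller, handles the miss
        if value is not None:
            self._data[key] = value
        return value


DB: Dict[str, str] = {"user:42": "alice"}  # stand-in for Postgres
users = ReadThroughCache(loader=DB.get)
```

The calling code shrinks to `users.get("user:42")`. The cost is that every key must be loadable by the one function passed in as `loader`.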
I mention it for completeness. In a service I owned in 2024 we tried to standardize on read-through to cut the boilerplate, and the win was real for ten endpoints and zero for the eleventh, where the loader had a non-trivial fallback. We backed out of read-through there and kept the manual cache-aside form. Read-through is a good fit when your loaders are uniform; it is friction when they are not.
Write-through: the safe write-side default
Write-through is what I want most write paths doing. On a write, the application writes to the cache and the database in the same operation, and the call only completes when both have succeeded. The cache is always consistent with the database (within ordering caveats discussed below).
Note: this looks like cache-aside-with-update rather than cache-aside-with-invalidate. The difference from the invalidating variant is real: write-through guarantees the cache holds a value, so the next read is a hit. Cache-aside-with-invalidate clears the entry, so the next read is a miss and pays the DB cost.
The trade-off: write-through is slightly slower on the write path (two round trips instead of one) and it is exposed to the same stale-write race as cache-aside-with-update. The race is: two concurrent writers, the older one finishes its cache update last, and the cache now holds older data than the database. Mitigation: include the row's version or updatedAt in the cache value, and only write to the cache if the new version is strictly newer.
The other gotcha: if the database write succeeds and the cache write fails, the cache is now stale until the TTL elapses. I always set a TTL on cached entries even when I am writing through, so any inconsistency self-heals within minutes rather than persisting forever.
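A minimal sketch of write-through with a TTL, assuming dict stand-ins for both stores and a hand-advanced clock. Writing the database before the cache is an assumption of this sketch, and the failed cache write is simulated by updating only the database. All names are hypothetical.

```python
from typing import Dict, Optional, Tuple

clock = {"now": 0.0}                       # fake clock, advanced by hand
db: Dict[str, str] = {}                    # stand-in for Postgres
cache: Dict[str, Tuple[str, float]] = {}   # key -> (value, expires_at)
TTL_SECONDS = 300.0


def write_through(key: str, value: str) -> None:
    db[key] = value                                    # 1. durable write
    cache[key] = (value, clock["now"] + TTL_SECONDS)   # 2. cache write, always with a TTL


def read(key: str) -> Optional[str]:
    entry = cache.get(key)
    if entry is not None and clock["now"] < entry[1]:
        return entry[0]                                # hit
    cache.pop(key, None)                               # missing or expired
    value = db.get(key)
    if value is not None:
        cache[key] = (value, clock["now"] + TTL_SECONDS)
    return value


write_through("plan:9", "free")
db["plan:9"] = "pro"           # simulates: database write succeeded, cache write failed
stale = read("plan:9")         # the cache still answers with the old value
clock["now"] += TTL_SECONDS    # the TTL elapses
healed = read("plan:9")        # the entry expired, so the read reloads from the database
```

The stale window is bounded by `TTL_SECONDS`, which is the self-healing behavior the TTL is there to provide.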
Write-behind: fast, scary, sometimes correct
Write-behind, also called write-back, decouples the database write from the application's request path. The application writes to the cache; a background worker drains pending changes from the cache into the database asynchronously. The application's response returns as soon as the cache write succeeds, which means the request latency is dominated by the (fast) cache call.
The sharp edge: between the cache write and the eventual database write, the durable state lags. If the cache crashes, the unflushed changes are lost. If the cache and the database disagree, the cache wins until the next flush. Most workloads cannot tolerate that. Some can.
A simplified write-behind shape:
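One way that shape could look, as a minimal single-threaded sketch: a counter buffer stands in for the cache, a dict stands in for the database, and `flush` plays the background worker. The names `record_view`, `flush`, `pending`, and `db_views` are hypothetical. A real worker would run on a timer and would need to drain the buffer atomically.

```python
from collections import defaultdict
from typing import DefaultDict, Dict

db_views: Dict[str, int] = {}                       # stand-in for Postgres
pending: DefaultDict[str, int] = defaultdict(int)   # stand-in for the cache-side buffer


def record_view(article_id: str) -> None:
    """Request path: touch only the cache, then return to the caller."""
    pending[article_id] += 1


def flush() -> int:
    """Background worker: drain buffered counts into the database.

    Returns the number of distinct keys written.
    """
    drained = dict(pending)
    pending.clear()
    for article_id, delta in drained.items():
        db_views[article_id] = db_views.get(article_id, 0) + delta
    return len(drained)
```

Everything sitting in `pending` between two `flush` calls exists only in the cache, which is exactly the data that disappears if the cache crashes.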
This pattern is right for view counts, like buttons, and similar high-volume telemetry where losing a few seconds of writes during a cache failure is acceptable. It is wrong for anything financial, anything legal, anything where "the user did the thing" needs to be durable the moment the API returns 200.
A decision table
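The shortest version of the trade-off:
- Cache-aside: the application handles the miss; writes go to the database and invalidate (or update) the cache. It survives a cache outage, but you own the read/write races. Default for reads.
- Read-through: the cache handles the miss through a loader. Same read semantics as cache-aside; a good fit when loaders are uniform.
- Write-through: the write goes to the cache and the database in the same operation. The next read is a hit; writes are slightly slower; it needs a version check and a TTL. Default for writes.
- Write-behind: the write goes to the cache and is flushed to the database asynchronously. Fastest writes, but unflushed changes are lost if the cache crashes. Only for loss-tolerant, high-volume telemetry.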
Most services should use cache-aside or read-through on the read path and write-through on the write path. That covers user-facing CRUD endpoints, profile reads, settings, almost anything that fits behind a key-value lookup. Write-behind is reserved for write-amplified telemetry where the latency win pays for the durability cost.
The mistake I keep seeing
The single most common cache bug I have debugged is not stale data. It is unbounded keys. A team adds a cache, sets a TTL on each entry, and assumes that is enough to keep memory in check. They do not realize that a TTL bounds an entry's lifetime, not the cache's size: a key with a 24-hour TTL written at 9 AM is still consuming RAM at 8:59 AM tomorrow even if no one reads it. Redis reclaims expired keys lazily on access and through a periodic background sweep, and it removes unexpired keys only through the configured eviction policy once maxmemory is reached. Run Redis with maxmemory set, an eviction policy chosen on purpose (allkeys-lru is my default), and key prefixes that let you reason about cardinality.
A small operational story. On one team I joined, the cache was Redis with no eviction policy set; the default, noeviction, means "return errors when out of memory". The cache hit its memory ceiling on a Friday afternoon, every cache write started failing, and because the application code was not handling the failure (it logged and moved on), the database started taking the full read load. Postgres was sized for cache-fronted traffic, not for the un-cached fallback, and went under within 20 minutes. The fix was a one-line config change to allkeys-lru. I now treat the eviction policy as something to verify on day one, not something to assume is fine by default.
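A day-one check along those lines could look like the sketch below. It assumes the configuration has already been fetched into a dict of setting name to string value (the shape CONFIG GET results are commonly exposed in by client libraries). `eviction_config_problems` is a hypothetical helper, not part of any Redis client.

```python
from typing import Dict, List


def eviction_config_problems(config: Dict[str, str]) -> List[str]:
    """Return human-readable problems with a Redis memory configuration."""
    problems: List[str] = []
    # maxmemory of 0 means Redis enforces no memory limit of its own.
    if config.get("maxmemory", "0") == "0":
        problems.append("maxmemory is unset (0): Redis enforces no memory ceiling")
    # noeviction is the default policy: writes error out once memory is full.
    if config.get("maxmemory-policy", "noeviction") == "noeviction":
        problems.append("maxmemory-policy is noeviction: writes fail when memory is full")
    return problems
```

An empty list means both settings were chosen on purpose; anything else is worth a conversation before the first Friday afternoon.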
The second most common bug is forgetting to namespace keys when two services share a Redis instance. user:42 from service A and user:42 from service B will collide. Always prefix: service-a:user:42. The handful of extra bytes per key is nothing compared to the cross-service bug it prevents.
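Centralizing key construction in one helper makes the prefix hard to forget. This is a minimal sketch; `cache_key` and its argument names are hypothetical.

```python
def cache_key(service: str, entity: str, entity_id: object) -> str:
    """Build a namespaced key such as 'service-a:user:42'."""
    for part in (service, entity):
        # Reject empty parts and embedded separators so namespaces cannot blur.
        if not part or ":" in part:
            raise ValueError(f"invalid key component: {part!r}")
    return f"{service}:{entity}:{entity_id}"
```

With this in place, `cache_key("service-a", "user", 42)` and `cache_key("service-b", "user", 42)` can never refer to the same Redis key.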
What I tell new engineers on the team
Three rules I write into onboarding docs:
- Default to cache-aside on reads and write-through on writes. Match the rest of the team's conventions if they exist.
- Always set a TTL. Even on write-through. The TTL is your insurance policy against the inconsistency you do not yet know about.
- Never reach for write-behind unless you can name what data loss you are willing to accept and put a hard upper bound on it ("we accept losing up to 30 seconds of view counts on a cache crash"). If you cannot name the loss, you cannot afford the strategy.
The interesting cases come up when a team has read-heavy and write-heavy paths in the same service. The right answer there is to pick a strategy per-endpoint, not per-service. The user-profile read can be cache-aside; the article-view recorder can be write-behind; both can live in the same Redis. What matters is that someone on the team can draw the diagram and the diagram matches the code.
Default to write-through; reach for write-behind only when the loss is named
Caching is the place where teams reach for cleverness too early and architecture too late. The cleverness is the eviction tweak, the warmer, the bloom filter, the multi-tier hierarchy. The architecture is the answer to: which strategy am I running on this path, and what failure mode is acceptable here. I will take a boring write-through with a five-minute TTL over a clever write-behind any day, because the boring one debugs in five minutes and the clever one debugs on a Saturday. Pick the boring one until the data tells you the boring one is too slow, and then pick the next-most-boring thing that fixes the actual measured problem.
