System Design Article
Consistency Models (Strong, Eventual, Causal)
Difficulty: Medium
Consistency models are the contract between a distributed data store and its clients about what they can and cannot observe. This lesson walks the spectrum from strict serializability at the strong end to eventual consistency at the relaxed end, with stops at linearizability, sequential, causal, read-your-writes, monotonic reads, and monotonic writes. We focus on what each model promises, what bugs it prevents, what it costs in latency and availability, and which production systems implement it. By the end you can name the model your system needs and explain why - the senior-level move that interviewers reward.
The Consistency Spectrum
Consistency models live on a spectrum. The strong end gives the application a single, coherent view of the world; the weak end gives it the messy reality of replication lag and concurrent writes, in exchange for lower latency and higher availability.
---------- The consistency spectrum ----------
STRONGER (slower, less available)
|
| Strict serializability
| Linearizability
| Sequential consistency
| Causal consistency
| Read-your-writes / Monotonic reads / Monotonic writes (session guarantees)
| Eventual consistency
|
WEAKER (faster, more available)
The further down you go, the cheaper it is to provide and the more anomalies the application has to deal with. The right model is the weakest one that the user can tolerate.
Strong Consistency: Linearizability
Linearizability (also called atomic consistency) is the gold standard. Every operation appears to take effect atomically at a single instant between its invocation and its completion, and all clients observe operations in an order consistent with real time.
If client A's write x = 1 completes at 10:00:00.000 and client B starts reading x at 10:00:00.001, B sees 1 (or a later value). Once a write has been acknowledged, no client can observe the state that preceded it.
Implementation cost: requires synchronous coordination on every write. Every write must be acknowledged by enough replicas to guarantee that any subsequent read sees it. This typically means a consensus round trip (Raft or Paxos) on every write; transactions that span shards usually add an atomic commit protocol such as 2PC on top.
Latency cost: roughly 10 ms within a region and on the order of 100 ms across regions, depending on replica placement. Spanner notably uses TrueTime to bound clock uncertainty and adds a 'commit wait' to give global linearizability.
Availability cost: a partition that prevents quorum stops writes entirely.
Examples: Google Spanner, etcd, ZooKeeper, single-instance PostgreSQL, MongoDB with w: majority and readConcern: linearizable, FoundationDB.
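The quorum requirement described above comes down to simple arithmetic. The sketch below is illustrative only: `majority`, `quorumsOverlap`, and `canAcceptWrites` are hypothetical helper names, not part of any database API. Overlapping quorums are a necessary ingredient of linearizability, not a sufficient one; a real system also needs an agreed order of writes, which is what the consensus protocol provides.

```javascript
// Smallest number of replicas that forms a majority of n.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

// A read quorum r and a write quorum w must share at least one replica
// for a read to be guaranteed to reach the latest acknowledged write.
function quorumsOverlap(n, r, w) {
  return r + w > n;
}

// During a partition, only a side that can still reach a majority
// of the n replicas may keep accepting writes.
function canAcceptWrites(n, reachable) {
  return reachable >= majority(n);
}

console.log(majority(3), majority(5));   // 2 3
console.log(quorumsOverlap(3, 2, 2));    // true
console.log(quorumsOverlap(3, 1, 1));    // false
console.log(canAcceptWrites(5, 2));      // false
```

With five replicas, a partition that leaves only two reachable stops writes on that side, which is the availability cost named above.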
Strict Serializability
Linearizability for single-key operations + serializability for multi-key transactions. A transaction's effects appear to take place atomically and in real-time order. This is what most people actually want when they say 'strong consistency'.
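Spanner provides strict serializability. Many other 'strongly consistent' systems provide only one half: per-key linearizability (etcd, Cassandra lightweight transactions) or serializable transactions on a single node (PostgreSQL at the SERIALIZABLE isolation level). Note that Cassandra QUORUM reads and writes overlap on at least one replica but are not linearizable on their own, because conflicting writes are resolved by last-write-wins timestamps.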
Sequential Consistency
Slightly weaker than linearizability. All clients agree on a single order of operations, and that order preserves each client's own program order, but it does not have to match real time. If A writes at 10:00:00 and B reads at 10:00:01, B might see the old value - as long as everyone agrees on the same global ordering.
Sequential consistency is rarely a deliberate choice in modern systems; it shows up implicitly in some replicated state machines. Useful to know it exists so you can recognize it in academic papers.
Causal Consistency
A pragmatic middle ground. The system guarantees that operations causally related to each other are observed in the right order, while operations with no causal relationship can be observed in any order.
Causal relationships:
- A reply must be observed after the message it replies to.
- A photo upload must be observed before the like that references it.
- An update to an account balance must be observed before the email that confirms the new balance.
---------- Causal consistency in a chat ----------
user A: 'I'm at the cafe' (write 1)
user B: 'On my way!' (write 2, depends on read of write 1)
Required: every observer sees write 1 BEFORE write 2.
Allowed: an unrelated write 3 ('weather: sunny') can appear in any order.
Implementation: clients track a vector of seen writes; the server only serves a read that includes all causally-prior writes the client has seen. Often implemented with vector clocks or Lamport timestamps.
Examples: COPS, Bayou, MongoDB with causal consistency sessions, Riak with causal context, AntidoteDB.
When to use: chat systems, social networks, collaborative editors, anywhere a user could observe an effect before its cause.
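To make the mechanism concrete, here is a minimal sketch of the two pieces: comparing vector clocks, and holding back a message until its causes are visible. The names `compareClocks` and `CausalInbox` are hypothetical, and the inbox tracks explicit dependency ids rather than full vector clocks to keep the example short.

```javascript
// Compare two vector clocks represented as { nodeId: counter } objects.
// Returns "before", "after", "equal", or "concurrent".
function compareClocks(a, b) {
  let aLess = false;
  let bLess = false;
  for (const node of new Set([...Object.keys(a), ...Object.keys(b)])) {
    const av = a[node] ?? 0;
    const bv = b[node] ?? 0;
    if (av < bv) aLess = true;
    if (av > bv) bLess = true;
  }
  if (aLess && bLess) return "concurrent";
  if (aLess) return "before";
  if (bLess) return "after";
  return "equal";
}

// A replica that delays a message until everything it depends on is delivered.
class CausalInbox {
  constructor() {
    this.delivered = []; // message ids, in delivery order
    this.pending = [];   // messages still waiting for a dependency
  }

  // msg = { id, deps: [ids of messages it causally depends on] }
  receive(msg) {
    this.pending.push(msg);
    let progressed = true;
    while (progressed) {
      progressed = false;
      for (const m of [...this.pending]) {
        if (m.deps.every((d) => this.delivered.includes(d))) {
          this.delivered.push(m.id);
          this.pending.splice(this.pending.indexOf(m), 1);
          progressed = true;
        }
      }
    }
  }
}

const inbox = new CausalInbox();
inbox.receive({ id: "on-my-way", deps: ["at-the-cafe"] }); // reply arrives first
inbox.receive({ id: "at-the-cafe", deps: [] });
console.log(inbox.delivered); // [ 'at-the-cafe', 'on-my-way' ]
console.log(compareClocks({ A: 1 }, { A: 1, B: 1 })); // before
console.log(compareClocks({ A: 2 }, { B: 1 }));       // concurrent
```

The reply is buffered until its parent arrives, so every observer sees the two in causal order even though the network delivered them reversed.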
Eventual Consistency
The weakest model in common use. The only guarantee: in the absence of new writes, all replicas eventually converge to the same value. 'Eventually' is unbounded by definition; in practice it is milliseconds to seconds.
Eventual consistency does not promise:
- That you can read what you just wrote.
- That successive reads return values in any sensible order.
- That related writes are observed together.
---------- Eventual consistency in action ----------
T0 client writes 'name=Bob' to replica A
T0 replica B still has 'name=Alice'
T1 client reads from B -> 'Alice' (stale)
T2 replica B receives the write
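T3 client reads from B -> 'Bob' (now consistent)
This is the default read behavior in DynamoDB, Cassandra (with low consistency levels), and Riak. Amazon S3 was also eventually consistent for overwrites and deletes until December 2020, when it switched to strong read-after-write consistency.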
When to use: anywhere the cost of an extra second of staleness is lower than the cost of a slower system. Like counts, view counts, recommendation feeds, IoT telemetry, click-stream ingestion, leaderboard updates.
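Convergence needs a deterministic rule for reconciling replicas that saw writes in different orders. One common rule is last-write-wins (LWW); the sketch below assumes each write carries a timestamp and a node id, and `lwwMerge` is a hypothetical helper name. LWW converges, but it silently discards one of two concurrent writes.

```javascript
// Last-write-wins register: keep the write with the highest (ts, node) pair.
function lwwMerge(a, b) {
  if (a.ts !== b.ts) return a.ts > b.ts ? a : b;
  return a.node >= b.node ? a : b; // deterministic tie-break on node id
}

const w1 = { value: "Alice", ts: 1, node: "A" };
const w2 = { value: "Bob", ts: 2, node: "A" };
const w3 = { value: "Carol", ts: 2, node: "B" };

// Two replicas receive the same writes in different orders.
const replicaA = [w1, w2, w3].reduce(lwwMerge);
const replicaB = [w3, w1, w2].reduce(lwwMerge);

console.log(replicaA.value, replicaB.value); // Carol Carol
```

Both replicas end on the same value regardless of delivery order, which is all that eventual consistency promises; the write of 'Bob' is lost without any error.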
Session Guarantees: The Useful Middle
For most user-facing applications, you do not actually need linearizability across all clients - you need each client's own session to behave sanely. The four session guarantees cover the most common needs.
Read-Your-Writes
Guarantee: after a client writes, that same client always reads its own write back.
Why you need it: a user updates their profile, refreshes the page, and expects to see the change. Without this guarantee, the refresh might hit a stale follower.
Implementation:
- Route reads of a session's own data to the leader for a short window after a write.
- Track the latest write timestamp per session and only read from a follower that has caught up.
- Keep a write-through cache that makes the new value visible immediately.
Monotonic Reads
Guarantee: if a client reads value v at time T, it never reads an older value at time T + 1. Reads only ever move forward in time.
Why you need it: a user keeps refreshing a stock price; the value goes 100 -> 101 -> 100 -> 102. The dip to 100 looks like a real price movement but is just the third read hitting a stale follower.
Implementation: pin a session to a single follower (sticky routing) or track the last-seen timestamp.
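The last-seen-timestamp option can be sketched as follows, assuming each replica exposes a monotonically increasing version for the data it holds. `MonotonicSession` and the replica shape `{ name, version, value }` are hypothetical, chosen only for illustration.

```javascript
// Tracks the newest version this session has observed and refuses
// to read from any replica that is behind it.
class MonotonicSession {
  constructor() {
    this.lastSeenVersion = 0;
  }

  // replicas: array of { name, version, value }, tried in order.
  read(replicas) {
    for (const r of replicas) {
      if (r.version >= this.lastSeenVersion) {
        this.lastSeenVersion = r.version;
        return r.value;
      }
    }
    throw new Error("no replica is as fresh as this session's last read");
  }
}

const fresh = { name: "f1", version: 2, value: 101 };
const stale = { name: "f2", version: 1, value: 100 };

const session = new MonotonicSession();
console.log(session.read([fresh]));        // 101
console.log(session.read([stale, fresh])); // 101 (stale follower skipped)
```

Compared with sticky routing, this approach survives the pinned follower failing, at the cost of carrying a version with every read.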
Monotonic Writes
Guarantee: writes from the same client are applied in the order issued.
Why you need it: a user clicks 'add to cart' for items A, B, C in order. Without monotonic writes, the cart might end up with A, C, B - or worse, missing one because writes were applied out of order and one was discarded.
Implementation: serialize per-session writes, often by routing all writes from a session through the same partition.
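One way to serialize per-session writes is to have the client number them and have the server apply them strictly in that order. The sketch below assumes sequence numbers start at 1; `OrderedApplier` is a hypothetical name, not an API of any particular store.

```javascript
// Applies each session's writes in sequence-number order, buffering
// any write that arrives before its predecessors.
class OrderedApplier {
  constructor() {
    this.nextSeq = new Map();  // sessionId -> next expected sequence number
    this.buffered = new Map(); // sessionId -> Map(seq -> payload)
    this.applied = [];         // payloads in the order they took effect
  }

  receive(sessionId, seq, payload) {
    let next = this.nextSeq.get(sessionId) ?? 1;
    if (seq < next) return; // duplicate of an already-applied write
    if (!this.buffered.has(sessionId)) this.buffered.set(sessionId, new Map());
    const buf = this.buffered.get(sessionId);
    buf.set(seq, payload);
    // Drain every write that is now contiguous with what was applied.
    while (buf.has(next)) {
      this.applied.push(buf.get(next));
      buf.delete(next);
      next += 1;
    }
    this.nextSeq.set(sessionId, next);
  }
}

const cart = new OrderedApplier();
cart.receive("session-1", 1, "add A");
cart.receive("session-1", 3, "add C"); // arrives early, held back
cart.receive("session-1", 2, "add B");
console.log(cart.applied); // [ 'add A', 'add B', 'add C' ]
```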
Writes-Follow-Reads
Guarantee: a write issued after a read is ordered after the write that produced the value read, so no replica applies it to a state older than the one the client saw.
Why you need it: a user reads an old comment, replies to it, and the reply must be visible to others as an answer to the original (not appear before it).
Decision Matrix
| Workload | Recommended model | Reason |
|---|---|---|
| Bank ledger | Strict serializability | Money cannot be temporarily wrong. |
| Distributed lock service | Linearizability | Authoritative answer required. |
| Inventory checkout | Linearizability + serializable transactions | No oversell. |
| Chat / messaging | Causal | Reply must follow original; ordering between unrelated chats does not matter. |
| Social feed timeline | Causal + read-your-writes | Your own post must show in your feed; others can see it slightly delayed. |
| Shopping cart | Read-your-writes + monotonic reads | The user must see their cart updates; cross-user consistency does not matter. |
| Like counts, view counts | Eventual | A few seconds of stale count is fine; throughput matters more. |
| Click-stream / telemetry ingestion | Eventual | Drop none, dedupe later. |
| Configuration / service discovery | Linearizability | Wrong config sent to thousands of nodes is catastrophic. |
| Real-time analytics dashboard | Eventual | Stale numbers are fine; a missing dashboard is not. |
Implementation Patterns
Pattern: read-your-writes via session token
The client holds a token (last-write-timestamp or vector clock) returned by the database. On the next read, the client sends the token; the database routes the read to a replica that has caught up to it.
async function updateProfile(userId, patch) {
  const result = await db.write({ table: 'profiles', id: userId, patch });
  sessionStorage.setItem('profileToken', result.commitToken);
  return result;
}
async function readProfile(userId) {
  const token = sessionStorage.getItem('profileToken');
  return await db.read({
    table: 'profiles',
    id: userId,
    atLeast: token, // server picks a replica caught up to this token
  });
}
Pattern: causal consistency via vector clocks
Each write is tagged with a vector clock. When a client reads, it observes the vector clock of the read. When the same client writes next, it includes the read clock as a 'happens-after' constraint. The server only commits if all causally-prior writes are present on the replica.
MongoDB provides the same guarantee through causal-consistency sessions, which track a logical cluster time rather than full vector clocks: client.startSession({causalConsistency: true}).
Pattern: tunable consistency in Cassandra
Cassandra exposes consistency as a per-request parameter, set through the driver or, in cqlsh, with the CONSISTENCY command (current CQL has no USING CONSISTENCY clause). With N=3 replicas:
-- Overlapping quorums: R + W > N, so a read reaches at least one replica holding the latest acknowledged write
CONSISTENCY QUORUM;
UPDATE users SET name = 'Bob' WHERE id = 1;
SELECT name FROM users WHERE id = 1;
-- Eventual (fast): R = W = ONE
CONSISTENCY ONE;
UPDATE users SET name = 'Bob' WHERE id = 1;
SELECT name FROM users WHERE id = 1;
How Real Systems Advertise Consistency
| System | Default | Strongest available | Notes |
|---|---|---|---|
| Google Spanner | Strict serializability | Strict serializability | TrueTime makes global linearizability practical. |
| FoundationDB | Strict serializability | Strict serializability | Multi-key ACID transactions; ordered KV model. |
| PostgreSQL (single node) | Read-committed | Serializable | Single-node ACID; replication is async by default. |
| PostgreSQL (with sync standby) | Read-committed (linear per key) | Serializable | Sync replication adds latency. |
| MongoDB | w:1, readConcern: local | w: majority + readConcern: linearizable | Causal-consistency sessions available. |
| Cassandra | Per-query (often ONE) | QUORUM/QUORUM for overlapping reads and writes; LWT for linearizable compare-and-set | LWT uses Paxos and is much slower. |
| DynamoDB | Eventually consistent reads | Strongly consistent reads (per-request) | Writes are always eventually replicated; transactions add atomicity. |
| Redis (single node) | Strong | Strong | Single-threaded executor; cluster mode is async, eventual. |
| etcd / ZooKeeper | Linearizable writes; reads linearizable in etcd, sequential in ZooKeeper | Linearizable reads (etcd default; sync before read in ZooKeeper) | Optimized for control plane, not high QPS. |
How to Talk About This in an Interview
- Pick the model first, then justify. 'For the chat system I would use causal consistency, because the requirement is that replies appear after their parent message; cross-conversation ordering does not matter.'
- Mention the cost. 'Strong consistency would solve the same problem but at higher latency and lower availability during partitions, which is overkill here.'
- Name the implementation primitive. 'I would use vector clocks or rely on MongoDB causal sessions.'
- Distinguish per-feature consistency. 'Within the same product, billing uses strict serializability, the user feed uses causal, and the recommendation widget uses eventual. Mixing models is the right answer.'
- Acknowledge session guarantees as the practical default. 'For most user-facing flows, read-your-writes plus monotonic reads is enough. Full linearizability is rarely needed and rarely worth the latency.'
Quick Review
- Linearizability: every op happens at a single moment; everyone agrees in real-time order.
- Strict serializability: linearizable + serializable transactions.
- Causal: causally-related ops are ordered; unrelated ops are not.
- Eventual: replicas converge in the absence of writes; no other guarantees.
- Session guarantees (read-your-writes, monotonic reads, monotonic writes, writes-follow-reads) cover most user-facing needs cheaply.
- Pick the weakest model the user can tolerate; you pay in latency and availability for stronger.
Real-World Examples
How real systems implement this in production
Spanner uses TrueTime (atomic-clock-synchronized timestamps with bounded uncertainty) plus Paxos to provide strict serializability for SQL transactions across continents. Every commit waits out the clock-uncertainty window before acknowledging, ensuring that any later transaction sees the new state.
Trade-off: You get the strongest possible consistency at global scale, but pay for it: TrueTime requires hardware (atomic clocks + GPS), and writes pay a 'commit wait' of about 7 ms plus a Paxos round trip. The latency is a non-starter for high-frequency workloads but cheap for transactional ones.
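The commit-wait rule can be illustrated with a toy simulation. This is a simplified model under stated assumptions, not Spanner's implementation: `ttNow` stands in for a TrueTime-style interval clock, the uncertainty bound `epsilonMs` is fixed, and the simulated clock advances in 1 ms steps.

```javascript
// Simulated interval clock: the true time is known only to within +/- epsilonMs.
function ttNow(trueTimeMs, epsilonMs) {
  return { earliest: trueTimeMs - epsilonMs, latest: trueTimeMs + epsilonMs };
}

// Pick the commit timestamp at the top of the uncertainty interval, then
// wait until even the earliest possible current time has passed it.
function commitWait(startMs, epsilonMs) {
  const commitTs = ttNow(startMs, epsilonMs).latest;
  let now = startMs;
  while (ttNow(now, epsilonMs).earliest <= commitTs) {
    now += 1; // advance the simulated clock by 1 ms
  }
  return { commitTs, waitedMs: now - startMs };
}

console.log(commitWait(1000, 4)); // { commitTs: 1004, waitedMs: 9 }
```

In this model the wait is roughly twice the uncertainty bound, which is why tightening clock uncertainty translates directly into lower write latency.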
MongoDB lets a client open a session with `causalConsistency: true`. The driver tracks a cluster time per session; subsequent reads include the cluster time, so the server only serves a read from a replica that has caught up. Paired with majority read and write concerns, this gives read-your-writes, monotonic reads, monotonic writes, and writes-follow-reads in one feature.
Trade-off: Causal sessions add a tiny per-request overhead (the cluster-time token) but eliminate the most common 'I just wrote it and it disappeared' bugs without the latency cost of full linearizability. That makes them a sensible default for modern MongoDB applications.
DynamoDB's default read returns whatever the nearest replica has, which may lag the latest write by milliseconds. Strongly consistent reads are an opt-in per-request flag that costs 2x read capacity units. Most workloads use the default - shopping carts, sessions, IoT data - because the cost of a 50 ms-stale read is zero.
Trade-off: Defaulting to eventual halves the cost and improves latency. Workloads that genuinely need strong consistency (compare-and-set, balance reads) opt in per request, paying only where it matters.
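The 2x cost difference is easy to work through. The sketch below assumes the provisioned-capacity rule that one read capacity unit covers one strongly consistent read of up to 4 KB, with eventually consistent reads billed at half that; `readCapacityUnits` is a hypothetical helper, and treating 4 KB as 4096 bytes is an assumption.

```javascript
// Read capacity units consumed by a single read of an item.
// Item size is rounded up to the next 4 KB block; eventually consistent
// reads cost half as much as strongly consistent ones.
function readCapacityUnits(itemSizeBytes, stronglyConsistent) {
  const blocks = Math.max(1, Math.ceil(itemSizeBytes / 4096));
  return stronglyConsistent ? blocks : blocks / 2;
}

console.log(readCapacityUnits(3000, true));  // 1
console.log(readCapacityUnits(3000, false)); // 0.5
console.log(readCapacityUnits(9000, true));  // 3
```

At a million reads per second of small items, that is the difference between provisioning 1,000,000 and 500,000 units, so opting in only where staleness is harmful pays off quickly.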
Riak (Dynamo-style KV store) uses vector clocks to track the causal history of every value. When a read returns multiple conflicting siblings (concurrent writes), the application receives all siblings and the vector clocks; it must merge them and write back the resolved value. This makes causal relationships explicit at the API level.
Trade-off: Pushing conflict resolution to the application is more code but lets the data store stay highly available. Riak deployments that handled this well (Spotify, Bet365) tended to use CRDTs - data types that merge automatically and never produce conflicts the application has to resolve manually.
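As an illustration of a data type that merges automatically, here is a minimal grow-only counter (G-Counter), one of the simplest CRDTs. This is a generic sketch, not Riak's data-type API; `gIncrement`, `gMerge`, and `gValue` are hypothetical names.

```javascript
// Grow-only counter: each node increments only its own slot.
function gIncrement(counter, node, by = 1) {
  return { ...counter, [node]: (counter[node] ?? 0) + by };
}

// Merging takes the per-node maximum, which is commutative, associative,
// and idempotent: siblings can be merged in any order, any number of times.
function gMerge(a, b) {
  const out = { ...a };
  for (const [node, n] of Object.entries(b)) {
    out[node] = Math.max(out[node] ?? 0, n);
  }
  return out;
}

// The counter's value is the sum of all per-node slots.
function gValue(counter) {
  return Object.values(counter).reduce((sum, n) => sum + n, 0);
}

// Two siblings produced by concurrent writes on different nodes.
const siblingA = gIncrement(gIncrement({}, "A"), "A"); // { A: 2 }
const siblingB = gIncrement({}, "B", 3);               // { B: 3 }

const merged = gMerge(siblingA, siblingB);
console.log(gValue(merged));                   // 5
console.log(gValue(gMerge(merged, siblingA))); // 5 (re-merging changes nothing)
```

Because no increment is ever lost in the merge, the application never has to choose between siblings, which is the property that made CRDTs attractive for these deployments.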
Common Interview Questions
Questions you might be asked about this topic
Causal consistency is the right default. The hard requirement is that messages within a conversation appear in causal order (reply after parent). Cross-conversation ordering does not matter. Add read-your-writes per session so a user always sees their own messages immediately. Implementation: causal-consistency sessions in MongoDB, or vector clocks in a custom data store. Avoid full linearizability because the cost (cross-region quorum round trip on every send) would tank latency for a feature that does not need it.
Linearizability is about single operations (typically single-key reads and writes) appearing instantaneous and in real-time order. Serializability is about transactions (multi-operation sequences) being equivalent to some serial execution; it does not require real-time order. Strict serializability combines both: transactions are serialized AND respect real time. Most production systems give you one or the other, not both - Spanner is one of the few that provides strict serializability globally.
Read-your-writes is missing. The reload hit a follower that had not received the update yet. Fixes: (1) route the user's own reads to the leader for ~30s after a write; (2) include a session token (last-write timestamp) in the read so the server picks a follower caught up to it; (3) write-through to a per-user cache that is checked before any follower; (4) for low-cost cases, simply read user-specific data from the leader always. Mention the cost trade-off - leader reads do not scale, so target only the user's own data.
Whenever the user can tolerate brief staleness in exchange for higher throughput, lower latency, or higher availability. Concrete examples: like and view counts, recommendation feeds, IoT telemetry ingestion, click-stream pipelines, leaderboards, search index updates, distributed caches. The key check: would a 5-second-old value cause harm? If no, eventual consistency is the right answer. Always pair it with monitoring of replication lag so you know when 'eventual' becomes 'never'.
Two dials: write concern (`w`) and read concern. Write concern: `w: 1` (acked by primary), `w: majority` (acked by majority of replica set), `w: <n>` (acked by n members, up to the whole replica set). Read concern: `local` (whatever the node has), `majority` (only data acked by majority), `linearizable` (always reflects the latest committed write). Combine them: `w: majority` + `readConcern: linearizable` gives true strong consistency; `w: 1` + `readConcern: local` gives eventual. Add causal-consistency sessions for read-your-writes and monotonic reads cheaply.
Interview Tips
How to discuss this topic effectively
Pick the weakest model that satisfies the user requirement, then justify it. 'For chat I would use causal consistency because the only ordering that matters is reply-after-parent' is a much stronger answer than 'strong consistency for safety'.
Distinguish per-feature consistency. A real system uses strong for billing, causal for chat, and eventual for like counts. Mixing models per use case is the senior-level move.
Always name the primitive: vector clocks for causal, Paxos/Raft for linearizable, LWW or CRDTs for eventual conflict resolution. Name-dropping the mechanism shows you have implemented this, not just read about it.
Mention session guarantees explicitly. Read-your-writes is what users actually expect from 'consistency'; full linearizability is overkill for most user-facing flows.
Acknowledge the cost in latency and availability. 'Linearizable global reads cost a quorum round trip - 100 ms cross-region. The product team will not accept that for a feed widget.'
Common Mistakes
Pitfalls to avoid in interviews
Using 'consistency' to mean only the C in ACID
ACID consistency means transaction-level integrity (constraints hold). CAP/distributed consistency is about what clients can observe across replicas. The two are different concepts that share a name. Always clarify which one you mean.
Treating eventual consistency as 'sometimes consistent, sometimes not'
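Eventual consistency is a precise model: once writes stop, replicas are guaranteed to converge. The model itself sets no time bound, but in practice convergence typically takes milliseconds to seconds. It is not random. Many real applications (DNS, caches, social feeds) have worked fine on eventual consistency for decades.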
Asking for linearizability when read-your-writes is enough
Most user-facing 'consistency' issues are actually session-guarantee issues. A user expects to see their own change reflected. They do not care if a different user sees a slightly older state. Implementing read-your-writes is much cheaper than full linearizability.
Assuming consistency is a per-database choice
Modern databases let you tune consistency per-request (Cassandra consistency levels, MongoDB read/write concerns, DynamoDB consistent reads). A single application can mix strong and eventual on different queries against the same database.
Ignoring causal anomalies in chat or social systems
Without causal consistency, a reply can appear before the message it replies to, or a like can appear before the photo it likes. Users notice and file bug reports. For chat, social, and collaborative apps, causal is the minimum sensible default.
