System Design Article

Consistency Models (Strong, Eventual, Causal)

Difficulty: Medium

Consistency models are the contract between a distributed data store and its clients about what they can and cannot observe. This lesson walks the spectrum from strict serializability at the strong end to eventual consistency at the relaxed end, with stops at linearizability, sequential, causal, read-your-writes, monotonic reads, and monotonic writes. We focus on what each model promises, what bugs it prevents, what it costs in latency and availability, and which production systems implement it. By the end you can name the model your system needs and explain why - the senior-level move that interviewers reward.

The Consistency Spectrum

Consistency models live on a spectrum. The strong end gives the application a single, coherent view of the world; the weak end gives it the messy reality of replication lag and concurrent writes, in exchange for lower latency and higher availability.

Text

---------- The consistency spectrum ----------
  STRONGER (slower, less available)
  |
  | Strict serializability
  | Linearizability
  | Sequential consistency
  | Causal consistency
  | Read-your-writes / Monotonic reads / Monotonic writes (session guarantees)
  | Eventual consistency
  |
  WEAKER (faster, more available)

The further down you go, the cheaper it is to provide and the more anomalies the application has to deal with. The right model is the weakest one that the user can tolerate.

Strong Consistency: Linearizability

Linearizability (also called atomic consistency) is the gold standard. Every operation appears to take effect atomically at a single instant between its invocation and its response, and the resulting order is consistent with real time.

If client A's write x = 1 completes at 10:00:00.000 and client B starts a read of x at 10:00:00.001, B sees 1 (or a later value). There is no way to observe a state that was not the most recent committed state.

Implementation cost: requires synchronous coordination on every write. Every write must be acknowledged by enough replicas to guarantee that any subsequent read sees it. This typically means a consensus round trip (Raft or Paxos) on every write. Two-phase commit is a different tool: it makes a multi-shard transaction atomic but does not by itself replicate or order writes.

Latency cost: one quorum round trip, roughly 10 ms within a region and 100 ms across regions. Spanner notably uses TrueTime to bound clock uncertainty and adds a 'commit wait' so that commit timestamps respect real-time order globally.

Availability cost: a partition that prevents quorum stops writes entirely.
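The availability cost follows from simple arithmetic on majorities. A minimal sketch, assuming a majority-quorum protocol such as Raft or Paxos; the function names are illustrative and not taken from any library.

```javascript
// Smallest number of replicas that forms a majority of a cluster of size n.
function majority(n) {
    return Math.floor(n / 2) + 1;
}

// One side of a partition can accept writes only if it still holds a majority.
function canAcceptWrites(reachableReplicas, clusterSize) {
    return reachableReplicas >= majority(clusterSize);
}

// A 5-node cluster split 3 / 2: only the 3-node side keeps accepting writes.
console.log(canAcceptWrites(3, 5)); // true
console.log(canAcceptWrites(2, 5)); // false
```

Note that an even split of a 4-node cluster (2 / 2) leaves neither side with a majority, which is one reason odd cluster sizes are the usual choice.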

Examples: Google Spanner, etcd, ZooKeeper, single-instance PostgreSQL, MongoDB with w: majority and readConcern: linearizable, FoundationDB.

Strict Serializability

Strict serializability is serializability for multi-key transactions plus linearizability's real-time ordering. A transaction's effects appear to take place atomically, and a transaction that starts after another has committed is ordered after it. This is what most people actually want when they say 'strong consistency'.

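Spanner provides strict serializability. Most other 'strongly consistent' systems provide one or the other: linearizability per key (etcd, Cassandra lightweight transactions) or serializable transactions on a single node (PostgreSQL at the SERIALIZABLE isolation level). Cassandra QUORUM reads and writes overlap on at least one replica, but they are not linearizable: conflicts are resolved by last-write-wins timestamps, and a write that failed to reach quorum can still become visible later.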

Sequential Consistency

Slightly weaker than linearizability. All clients agree on a single order of operations, and that order preserves each client's own program order, but it does not have to match real time. If A's write completes at 10:00:00 and B reads at 10:00:01, B might still see the old value, as long as everyone agrees on the same global ordering.

Sequential consistency is rarely a deliberate choice in modern systems; it shows up implicitly in some replicated state machines. Useful to know it exists so you can recognize it in academic papers.

Causal Consistency

A pragmatic middle ground. The system guarantees that operations causally related to each other are observed in the right order, while operations with no causal relationship can be observed in any order.

Causal relationships:

A reply must be observed after the message it replies to.
A photo upload must be observed before the like that references it.
An update to an account balance must be observed before the email that confirms the new balance.

Text

---------- Causal consistency in a chat ----------
  user A: 'I'm at the cafe'        (write 1)
  user B: 'On my way!'             (write 2, depends on read of write 1)

  Required: every observer sees write 1 BEFORE write 2.
  Allowed: an unrelated write 3 ('weather: sunny') can appear in any order.

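Implementation: clients track a vector of seen writes; the server only serves a read that includes all causally-prior writes the client has seen. Often implemented with vector clocks or explicit dependency lists. Lamport timestamps give an order that is consistent with causality but cannot tell concurrent writes apart.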
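To make the ordering rule concrete, here is a minimal vector clock sketch. It assumes clocks are plain objects mapping a node id to a counter; `compareClocks` and `mergeClocks` are illustrative names, not part of any particular database's API.

```javascript
// Compare two vector clocks of the form { nodeId: counter }.
// Returns 'before', 'after', 'equal', or 'concurrent'.
function compareClocks(a, b) {
    let aSmaller = false;
    let bSmaller = false;
    const nodes = new Set([...Object.keys(a), ...Object.keys(b)]);
    for (const node of nodes) {
        const x = a[node] ?? 0; // a missing entry counts as 0
        const y = b[node] ?? 0;
        if (x < y) aSmaller = true;
        if (x > y) bSmaller = true;
    }
    if (aSmaller && bSmaller) return 'concurrent';
    if (aSmaller) return 'before';
    if (bSmaller) return 'after';
    return 'equal';
}

// Element-wise maximum: the clock of a client that has seen both a and b.
function mergeClocks(a, b) {
    const merged = { ...a };
    for (const [node, counter] of Object.entries(b)) {
        merged[node] = Math.max(merged[node] ?? 0, counter);
    }
    return merged;
}

const write1 = { A: 1 };       // "I'm at the cafe"
const write2 = { A: 1, B: 1 }; // "On my way!" (B had already seen write 1)
const write3 = { C: 1 };       // "weather: sunny"

console.log(compareClocks(write1, write2)); // 'before'
console.log(compareClocks(write2, write3)); // 'concurrent'
```

Only the 'before' and 'after' results constrain delivery order; 'concurrent' writes may be shown in either order.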

Examples: COPS, Bayou, MongoDB with causal consistency sessions, Riak with causal context, AntidoteDB.

When to use: chat systems, social networks, collaborative editors, anywhere a user could observe an effect before its cause.

Eventual Consistency

The weakest model in common use. The only guarantee: in the absence of new writes, all replicas eventually converge to the same value. 'Eventually' is unbounded by definition; in practice it is milliseconds to seconds.

Eventual consistency does not promise:

That you can read what you just wrote.
That successive reads return values in any sensible order.
That related writes are observed together.

Text

---------- Eventual consistency in action ----------
  T0  client writes 'name=Bob' to replica A
  T0  replica B still has 'name=Alice'
  T1  client reads from B -> 'Alice' (stale)
  T2  replica B receives the write
  T3  client reads from B -> 'Bob' (now consistent)

This is the default read behavior in DynamoDB, in Cassandra at low consistency levels, and in Riak. Amazon S3 was eventually consistent for overwrites and deletes until December 2020, when it moved to strong read-after-write consistency.

When to use: anywhere the cost of an extra second of staleness is lower than the cost of a slower system. Like counts, view counts, recommendation feeds, IoT telemetry, click-stream ingestion, leaderboard updates.
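Convergence needs a deterministic merge rule that every replica applies the same way. One common rule is last-write-wins (LWW). The sketch below assumes each stored value carries a timestamp and the id of the replica that accepted it; the field names are hypothetical.

```javascript
// Merge two replica states: the higher timestamp wins; ties break on replica id
// so that every replica picks the same winner.
function mergeLww(a, b) {
    if (a.timestamp !== b.timestamp) return a.timestamp > b.timestamp ? a : b;
    return a.replicaId > b.replicaId ? a : b;
}

const replicaA = { value: 'Bob', timestamp: 2, replicaId: 'A' };
const replicaB = { value: 'Alice', timestamp: 1, replicaId: 'B' };

// Before the replicas exchange state, a read from B is stale.
console.log(replicaB.value); // 'Alice'

// After exchanging state in either direction, both hold the same value.
console.log(mergeLww(replicaA, replicaB).value); // 'Bob'
console.log(mergeLww(replicaB, replicaA).value); // 'Bob'
```

LWW is simple but silently discards one of two concurrent writes, and it depends on reasonably synchronized clocks.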

Session Guarantees: The Useful Middle

For most user-facing applications, you do not actually need linearizability across all clients - you need each client's own session to behave sanely. The four session guarantees cover the most common needs.

Read-Your-Writes

Guarantee: after a client writes, that same client always reads its own write back.

Why you need it: a user updates their profile, refreshes the page, and expects to see the change. Without this guarantee, the refresh might hit a stale follower.

Implementation:

Route reads of a session's own data to the leader for a short window after a write.
Track the latest write timestamp per session and only read from a follower that has caught up.
Keep a write-through cache that makes the new value visible immediately.
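The first option in the list above can be sketched in a few lines. This is a simplified illustration, assuming a router that sees every request of a session and a replication lag that stays below the chosen window; `LEADER_WINDOW_MS` and the function names are hypothetical.

```javascript
const LEADER_WINDOW_MS = 5000; // assumed upper bound on replication lag

// Session id -> time (ms) of that session's most recent write.
const lastWriteAt = new Map();

function recordWrite(sessionId, nowMs) {
    lastWriteAt.set(sessionId, nowMs);
}

// Reads go to the leader while the session's last write may not have replicated yet.
function chooseReadTarget(sessionId, nowMs) {
    const wroteAt = lastWriteAt.get(sessionId);
    if (wroteAt !== undefined && nowMs - wroteAt < LEADER_WINDOW_MS) {
        return 'leader';
    }
    return 'follower';
}

recordWrite('session-1', 1000);
console.log(chooseReadTarget('session-1', 2000)); // 'leader'
console.log(chooseReadTarget('session-1', 7000)); // 'follower'
console.log(chooseReadTarget('session-2', 2000)); // 'follower'
```

The guarantee holds only while replication lag is shorter than the window, which is why the token-based approach is the more robust of the options.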

Monotonic Reads

Guarantee: if a client reads value v at time T, it never reads an older value at time T + 1. Reads only ever move forward in time.

Why you need it: a user keeps refreshing a stock price; the value goes 100 -> 101 -> 100 -> 102. The dip to 100 looks like a real price movement but is just the third read hitting a stale follower.

Implementation: pin a session to a single follower (sticky routing) or track the last-seen timestamp.
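A minimal sketch of the last-seen approach, assuming each replica exposes the highest log position it has applied (called `appliedVersion` here) and the session remembers the highest version it has read. All names are illustrative.

```javascript
// Serve the read from the first replica that is at least as fresh as
// anything this session has already seen.
function readMonotonic(session, replicas) {
    const replica = replicas.find((r) => r.appliedVersion >= session.lastSeenVersion);
    if (!replica) throw new Error('no replica has caught up to this session yet');
    session.lastSeenVersion = replica.appliedVersion;
    return replica.value;
}

const session = { lastSeenVersion: 0 };
const fresh = { appliedVersion: 2, value: 101 };
const stale = { appliedVersion: 1, value: 100 };

console.log(readMonotonic(session, [fresh, stale])); // 101
// The stale replica is now skipped even when it is offered first.
console.log(readMonotonic(session, [stale, fresh])); // 101
```

When no replica qualifies, a real system would wait or fall back to the leader rather than fail; the sketch throws to keep that decision visible.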

Monotonic Writes

Guarantee: writes from the same client are applied in the order issued.

Why you need it: a user clicks 'add to cart' for items A, B, C in order. Without monotonic writes, the cart might end up with A, C, B - or worse, missing one because writes were applied out of order and one was discarded.

Implementation: serialize per-session writes, often by routing all writes from a session through the same partition.
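One way to serialize per-session writes is to have the client number them and have the server apply them strictly in that order. A sketch under that assumption; `SessionWriteQueue` is a hypothetical helper, not an API of any store named in this article.

```javascript
// Applies a session's writes strictly in sequence-number order, buffering
// any write that arrives before its predecessor.
class SessionWriteQueue {
    constructor(apply) {
        this.apply = apply;        // callback that performs the actual write
        this.nextSeq = 1;          // next sequence number allowed to apply
        this.pending = new Map();  // seq -> write waiting for its predecessors
    }

    receive(seq, write) {
        if (seq < this.nextSeq) return; // duplicate of an already-applied write
        this.pending.set(seq, write);
        while (this.pending.has(this.nextSeq)) {
            this.apply(this.pending.get(this.nextSeq));
            this.pending.delete(this.nextSeq);
            this.nextSeq += 1;
        }
    }
}

const cart = [];
const queue = new SessionWriteQueue((item) => cart.push(item));
queue.receive(1, 'A');
queue.receive(3, 'C'); // arrives early, buffered
queue.receive(2, 'B'); // applies 2, then releases 3
console.log(cart); // [ 'A', 'B', 'C' ]
```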

Writes-Follow-Reads

Guarantee: a write issued after a read is ordered after the write that produced the value read, on every replica.

Why you need it: a user reads a comment and replies to it; no replica may show the reply without the comment it answers.

Decision Matrix

Workload	Recommended model	Reason
Bank ledger	Strict serializability	Money cannot be temporarily wrong.
Distributed lock service	Linearizability	Authoritative answer required.
Inventory checkout	Linearizability + serializable transactions	No oversell.
Chat / messaging	Causal	Reply must follow original; ordering between unrelated chats does not matter.
Social feed timeline	Causal + read-your-writes	Your own post must show in your feed; others can see it slightly delayed.
Shopping cart	Read-your-writes + monotonic reads	The user must see their cart updates; cross-user consistency does not matter.
Like counts, view counts	Eventual	A few seconds of stale count is fine; throughput matters more.
Click-stream / telemetry ingestion	Eventual	Drop none, dedupe later.
Configuration / service discovery	Linearizability	Wrong config sent to thousands of nodes is catastrophic.
Real-time analytics dashboard	Eventual	Stale numbers are fine; a missing dashboard is not.

Implementation Patterns

Pattern: read-your-writes via session token

The client holds a token (last-write-timestamp or vector clock) returned by the database. On the next read, the client sends the token; the database routes the read to a replica that has caught up to it.

JavaScript

async function updateProfile(userId, patch) {
    const result = await db.write({ table: 'profiles', id: userId, patch });
    sessionStorage.setItem('profileToken', result.commitToken);
    return result;
}

async function readProfile(userId) {
    const token = sessionStorage.getItem('profileToken');
    return await db.read({
        table: 'profiles',
        id: userId,
        atLeast: token, // server picks a replica caught up to this token
    });
}
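The `atLeast` option above leaves replica selection to the server. One way the server side might work is sketched below, assuming the token is a monotonically increasing commit position and each replica reports the position it has applied. `pickReplica` and `appliedCommit` are hypothetical names.

```javascript
// Pick a replica whose applied commit position has reached the token;
// fall back to the leader, which holds every committed write.
function pickReplica(replicas, leader, atLeast) {
    const caughtUp = replicas.filter((r) => r.appliedCommit >= atLeast);
    return caughtUp.length > 0 ? caughtUp[0] : leader;
}

const leader = { name: 'leader', appliedCommit: 42 };
const followers = [
    { name: 'follower-1', appliedCommit: 40 },
    { name: 'follower-2', appliedCommit: 42 },
];

console.log(pickReplica(followers, leader, 42).name); // 'follower-2'
console.log(pickReplica(followers, leader, 43).name); // 'leader'
console.log(pickReplica(followers, leader, 10).name); // 'follower-1'
```

Falling back to the leader trades a little leader load for never blocking the read; waiting for a follower to catch up is the other common choice.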

Pattern: causal consistency via vector clocks

Each write is tagged with a vector clock. When a client reads, it observes the vector clock of the read. When the same client writes next, it includes the read clock as a 'happens-after' constraint. The server only commits if all causally-prior writes are present on the replica.

MongoDB does this automatically via causal-consistency sessions: client.startSession({causalConsistency: true}).
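Outside a database that handles this for you, the commit rule can be sketched by hand. For brevity the example below uses explicit dependency ids instead of comparing vector clocks: each write lists the writes it causally depends on, and a replica makes a write visible only once all of them are visible. `CausalReplica` is a hypothetical class for illustration.

```javascript
class CausalReplica {
    constructor() {
        this.applied = new Set(); // ids of writes already visible here
        this.visible = [];        // write ids in the order they became visible
        this.waiting = [];        // writes whose dependencies are not all applied
    }

    // write: { id, deps: [ids of writes this one causally depends on] }
    receive(write) {
        this.waiting.push(write);
        let progressed = true;
        while (progressed) {
            progressed = false;
            for (const w of [...this.waiting]) {
                if (w.deps.every((d) => this.applied.has(d))) {
                    this.applied.add(w.id);
                    this.visible.push(w.id);
                    this.waiting.splice(this.waiting.indexOf(w), 1);
                    progressed = true;
                }
            }
        }
    }
}

const replica = new CausalReplica();
replica.receive({ id: 'reply', deps: ['message'] }); // arrives first, held back
replica.receive({ id: 'weather', deps: [] });        // unrelated, visible at once
replica.receive({ id: 'message', deps: [] });        // releases the reply
console.log(replica.visible); // [ 'weather', 'message', 'reply' ]
```

The reply never becomes visible before its parent message, while the unrelated write is free to appear at any point.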

Pattern: tunable consistency in Cassandra

Cassandra exposes consistency as a per-request parameter. With N=3 replicas:

SQL

-- In cqlsh, CONSISTENCY sets the level for the statements that follow;
-- drivers set it per statement. CQL 3 has no USING CONSISTENCY clause.

-- Overlapping quorums: R + W > N, so a read reaches at least one replica holding the latest acknowledged write
CONSISTENCY QUORUM
UPDATE users SET name = 'Bob' WHERE id = 1;
SELECT name FROM users WHERE id = 1;

-- Eventual (fast): R = W = ONE
CONSISTENCY ONE
UPDATE users SET name = 'Bob' WHERE id = 1;
SELECT name FROM users WHERE id = 1;
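The R + W > N rule is plain arithmetic, and checking a few combinations shows why QUORUM on both sides overlaps while ONE on both sides does not. A small sketch; `quorumsOverlap` is an illustrative helper, not a Cassandra API.

```javascript
// True when every read quorum must intersect every write quorum.
function quorumsOverlap(n, r, w) {
    return r + w > n;
}

const N = 3;
const QUORUM = Math.floor(N / 2) + 1; // 2 when N = 3

console.log(quorumsOverlap(N, QUORUM, QUORUM)); // true:  2 + 2 > 3
console.log(quorumsOverlap(N, 1, 1));           // false: 1 + 1 <= 3
console.log(quorumsOverlap(N, 1, N));           // true:  ONE reads with ALL writes
```

The last combination gives fast reads at the price of writes that fail whenever any replica is down.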

How Real Systems Advertise Consistency

System	Default	Strongest available	Notes
Google Spanner	Strict serializability	Strict serializability	TrueTime makes global linearizability practical.
FoundationDB	Strict serializability	Strict serializability	Multi-key ACID transactions across the cluster; ordered KV model.
PostgreSQL (single node)	Read-committed	Serializable	Single-node ACID; replication is async by default.
PostgreSQL (with sync standby)	Read-committed	Serializable	Sync replication adds commit latency; reads from a standby can still lag unless synchronous_commit is set to remote_apply.
MongoDB	`w: majority` since 5.0 (`w: 1` before), `readConcern: local`	`w: majority` + `readConcern: linearizable`	Causal-consistency sessions available.
Cassandra	Per-query (often ONE)	QUORUM/QUORUM (overlapping quorums, not linearizable); LWT for linearizable compare-and-set	LWT uses Paxos and is much slower.
DynamoDB	Eventually consistent reads	Strongly consistent reads (per-request)	Writes are always eventually replicated; transactions add atomicity.
Redis (single node)	Strong	Strong	Single-threaded executor; cluster mode is async, eventual.
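etcd	Linearizable reads and writes	Linearizable	Serializable (possibly stale) reads are opt-in. Optimized for control plane, not high QPS.
ZooKeeper	Linearizable writes, sequentially consistent reads	Read preceded by sync	A sync brings the serving node up to date before the read. Optimized for control plane, not high QPS.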

How to Talk About This in an Interview

Pick the model first, then justify. 'For the chat system I would use causal consistency, because the requirement is that replies appear after their parent message; cross-conversation ordering does not matter.'
Mention the cost. 'Strong consistency would solve the same problem but at higher latency and lower availability during partitions, which is overkill here.'
Name the implementation primitive. 'I would use vector clocks or rely on MongoDB causal sessions.'
Distinguish per-feature consistency. 'Within the same product, billing uses strict serializability, the user feed uses causal, and the recommendation widget uses eventual. Mixing models is the right answer.'
Acknowledge session guarantees as the practical default. 'For most user-facing flows, read-your-writes plus monotonic reads is enough. Full linearizability is rarely needed and rarely worth the latency.'

Quick Review

Linearizability: every op happens at a single moment; everyone agrees in real-time order.
Strict serializability: linearizable + serializable transactions.
Causal: causally-related ops are ordered; unrelated ops are not.
Eventual: replicas converge in the absence of writes; no other guarantees.
Session guarantees (read-your-writes, monotonic reads, monotonic writes, writes-follow-reads) cover most user-facing needs cheaply.
Pick the weakest model the user can tolerate; you pay in latency and availability for stronger.

Real-World Examples

How real systems implement this in production

Google Spanner (strict serializability)

Spanner uses TrueTime (atomic-clock-synchronized timestamps with bounded uncertainty) plus Paxos to provide strict serializability for SQL transactions across continents. Every commit waits out the clock-uncertainty window before acknowledging, ensuring that any later transaction sees the new state.

Trade-off: You get the strongest possible consistency at global scale, but pay for it: TrueTime requires hardware (atomic clocks + GPS), and writes pay a 'commit wait' on the order of the clock uncertainty (single-digit milliseconds in Google's published figures) plus a Paxos round trip. The latency is a non-starter for high-frequency workloads but cheap for transactional ones.

MongoDB causal-consistency sessions

MongoDB lets a client open a session with `causalConsistency: true`. The driver tracks a cluster time per session; subsequent reads include the cluster time, so the server only serves a read from a replica that has caught up. This gives read-your-writes, monotonic reads, monotonic writes, and writes-follow-reads in one feature, provided the operations use majority read and write concerns.

Trade-off: Causal sessions add a tiny per-request overhead (the cluster-time token) but eliminate the most common 'I just wrote it and it disappeared' bugs without the latency cost of full linearizability. Explicitly started driver sessions enable causal consistency by default.

DynamoDB eventually consistent reads

DynamoDB's default read returns whatever the nearest replica has, which may lag the latest write by milliseconds. Strongly consistent reads are an opt-in per-request flag that costs 2x read capacity units. Most workloads use the default - shopping carts, sessions, IoT data - because the cost of a 50 ms-stale read is zero.

Trade-off: Defaulting to eventual halves the cost and improves latency. Workloads that genuinely need strong consistency (compare-and-set, balance reads) opt in per request, paying only where it matters.
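The 2x figure comes from how read capacity is metered. The sketch below assumes the published DynamoDB rule that item size rounds up to 4 KB blocks, that a strongly consistent read costs one unit per block, and that an eventually consistent read costs half that. `readCapacityUnits` is an illustrative helper; the real per-request switch is the `ConsistentRead` parameter on read operations.

```javascript
// Read capacity consumed by one read of an item of the given size.
function readCapacityUnits(itemBytes, stronglyConsistent) {
    const blocks = Math.max(1, Math.ceil(itemBytes / 4096)); // round up to 4 KB blocks
    return stronglyConsistent ? blocks : blocks / 2;
}

console.log(readCapacityUnits(3000, true));  // 1
console.log(readCapacityUnits(3000, false)); // 0.5
console.log(readCapacityUnits(9000, true));  // 3
```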

Riak with vector clocks

Riak (Dynamo-style KV store) uses vector clocks to track the causal history of every value. When a read returns multiple conflicting siblings (concurrent writes), the application receives all siblings and the vector clocks; it must merge them and write back the resolved value. This makes causal relationships explicit at the API level.

Trade-off: Pushing conflict resolution to the application is more code but lets the data store stay highly available. Riak deployments that handled this well (Spotify, Bet365) tended to use CRDTs - data types that merge automatically and never produce conflicts the application has to resolve manually.
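To show what "merge automatically" means, here is a grow-only counter (G-Counter), one of the simplest CRDTs. This is a generic sketch of the data type, not Riak's API; the function names are illustrative.

```javascript
// Grow-only counter: each replica increments only its own slot.
function increment(counter, replicaId, amount = 1) {
    return { ...counter, [replicaId]: (counter[replicaId] ?? 0) + amount };
}

// Merge takes the per-replica maximum, so it is commutative, associative,
// and idempotent: replicas converge regardless of delivery order.
function merge(a, b) {
    const merged = { ...a };
    for (const [id, count] of Object.entries(b)) {
        merged[id] = Math.max(merged[id] ?? 0, count);
    }
    return merged;
}

function value(counter) {
    return Object.values(counter).reduce((sum, n) => sum + n, 0);
}

// Two replicas accept likes concurrently, then exchange state.
const onA = increment(increment({}, 'A'), 'A'); // { A: 2 }
const onB = increment({}, 'B');                 // { B: 1 }
console.log(value(merge(onA, onB))); // 3
console.log(value(merge(onB, onA))); // 3
```

Because merging the same state twice changes nothing, replicas can gossip as often as they like without double counting.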

Quick Interview Phrases

Key terms to use in your answer

linearizability

causal consistency

read-your-writes

monotonic reads

vector clocks

session guarantees

Common Interview Questions

Questions you might be asked about this topic

Walk through the consistency model you would pick for a global chat application like Slack or WhatsApp.

Causal consistency is the right default. The hard requirement is that messages within a conversation appear in causal order (reply after parent). Cross-conversation ordering does not matter. Add read-your-writes per session so a user always sees their own messages immediately. Implementation: causal-consistency sessions in MongoDB, or vector clocks in a custom data store. Avoid full linearizability because the cost (cross-region quorum round trip on every send) would tank latency for a feature that does not need it.

What is the difference between linearizability and serializability?

A user updates their email and immediately reloads the page, but sees the old email. Which consistency model is missing? How do you fix it?

When would you use eventual consistency in a system you design?

Explain how MongoDB lets you tune consistency.

Interview Tips

How to discuss this topic effectively

Pick the weakest model that satisfies the user requirement, then justify it. 'For chat I would use causal consistency because the only ordering that matters is reply-after-parent' is a much stronger answer than 'strong consistency for safety'.

Distinguish per-feature consistency. A real system uses strong for billing, causal for chat, and eventual for like counts. Mixing models per use case is the senior-level move.

Always name the primitive: vector clocks for causal, Paxos/Raft for linearizable, LWW or CRDTs for eventual conflict resolution. Name-dropping the mechanism shows you have implemented this, not just read about it.

Mention session guarantees explicitly. Read-your-writes is what users actually expect from 'consistency'; full linearizability is overkill for most user-facing flows.

Acknowledge the cost in latency and availability. 'Linearizable global reads cost a quorum round trip, roughly 100 ms cross-region. The product team will not accept that for a feed widget.'

Common Mistakes

Pitfalls to avoid in interviews

Using 'consistency' to mean only the C in ACID

ACID consistency means transaction-level integrity (constraints hold). CAP/distributed consistency is about what clients can observe across replicas. The two are different concepts that share a name. Always clarify which one you mean.

Treating eventual consistency as 'sometimes consistent, sometimes not'

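Eventual consistency is a precise model: once writes stop, all replicas converge to the same value. The definition puts no bound on how long that takes, but in practice it is typically milliseconds to seconds. It is not random. Many real systems (DNS, caches, social feeds) have run on eventual consistency for decades.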

Asking for linearizability when read-your-writes is enough

Most user-facing 'consistency' issues are actually session-guarantee issues. A user expects to see their own change reflected. They do not care if a different user sees a slightly older state. Implementing read-your-writes is much cheaper than full linearizability.

Assuming consistency is a per-database choice

Modern databases let you tune consistency per-request (Cassandra consistency levels, MongoDB read/write concerns, DynamoDB consistent reads). A single application can mix strong and eventual on different queries against the same database.

Ignoring causal anomalies in chat or social systems

Without causal consistency, a reply can appear before the message it replies to, or a like can appear before the photo it likes. Users notice and file bug reports. For chat, social, and collaborative apps, causal is the minimum sensible default.
