Community Article

Consistent Hashing Explained with a 200-Line Toy

A working Python toy of the ring, with virtual nodes, the bounded-movement test that proves the algorithm earns its complexity, and the cases where I would not reach for it.

Consistent Hashing Explained with a 200-Line Toy

A working Python toy of the ring, with virtual nodes, the bounded-movement test that proves the algorithm earns its complexity, and the cases where I would not reach for it.

consistent-hashing

hashing

distributed-systems

system-design

partitioning

By @ryanjoshi

January 8, 2026

Updated May 18, 2026

299 views

4.4 (10)

When I first read about consistent hashing, I read it as: "hash the keys, hash the nodes, put both on a circle, route a key to the next node clockwise." That sentence is correct. It is also useless, because it does not tell you why the algorithm is in every distributed system you have ever used and a plain hash table is not.

The reason is movement. With hash(key) % N over N shards, adding one shard remaps roughly (N-1)/N of all keys. With consistent hashing, adding one shard remaps roughly 1/(N+1) of the keys. That difference (move every key vs move one key in N+1) is the entire point. Everything else, the ring, the virtual nodes, the hash function choice, exists in service of that one property: bounded movement on resize.

This article walks through a 200-line toy implementation in Python and uses it to make three claims I will defend:

The naive ring (one hash per node) is wrong in production because load is uneven.
Virtual nodes fix the load problem at a cost (more memory, slower lookups).
The win that matters is not the ring shape but the bounded-movement property, which is what makes consistent hashing the right substrate for caches, sharded databases, and distributed object stores.

What I am building

A ConsistentHash class that supports add(node), remove(node), and route(key) -> node. I want to be able to answer: "if I add node X, how many keys move?" and "is the load balanced across nodes?" empirically, by running ten thousand keys through it.

Before I show the code, one piece of intuition. The reason the textbook draws a circle is that a circle has no boundary problem. If you put N points on a line of length L and ask "which point is to the right of position p", the rightmost point has nothing to its right; you have to special-case wrapping. A circle wraps for free. That is the only reason. There is no deep geometric truth about circles, no rotational symmetry that matters, no number-theoretic property of 2*pi. The circle is a wraparound trick. The actual algorithm is a sorted array of integers and a binary search. If anyone tells you they understand consistent hashing because they understand the circle, ask them to explain the bounded-movement property without the circle. If they can, they understand it. If they can't, they have memorized a metaphor.

import hashlib
from bisect import insort, bisect_right

def _h(s: str) -> int:
    # 64-bit slice of MD5 is fine for a toy; do not use this in production crypto contexts.
    return int(hashlib.md5(s.encode()).hexdigest()[:16], 16)

class NaiveRing:
    """One position per node on the ring."""
    def __init__(self):
        self._ring: list[int] = []   # sorted hashes
        self._owner: dict[int, str] = {}  # hash -> node

    def add(self, node: str) -> None:
        h = _h(node)
        if h not in self._owner:
            insort(self._ring, h)
            self._owner[h] = node

    def remove(self, node: str) -> None:
        h = _h(node)
        if h in self._owner:
            del self._owner[h]
            self._ring.remove(h)

    def route(self, key: str) -> str | None:
        if not self._ring:
            return None
        h = _h(key)
        i = bisect_right(self._ring, h) % len(self._ring)
        return self._owner[self._ring[i]]

This is forty lines. It works. The route call is O(log N) thanks to bisect, which is the practical reason the ring is implemented as a sorted array and not as a linked list.

Showing it actually does the bounded-movement thing

def reassignment_count(before: NaiveRing, after: NaiveRing, keys: list[str]) -> int:
    moved = 0
    for k in keys:
        if before.route(k) != after.route(k):
            moved += 1
    return moved

ring = NaiveRing()
for n in ["node-a", "node-b", "node-c", "node-d"]:
    ring.add(n)

keys = [f"key-{i}" for i in range(10_000)]
before_state = {k: ring.route(k) for k in keys}

import copy
after = copy.deepcopy(ring)
after.add("node-e")
moved = sum(1 for k in keys if before_state[k] != after.route(k))
print(f"adding node-e moved {moved}/10000 keys ({moved/100:.1f}%)")

When I run this with four nodes and add a fifth, I see roughly 18 to 22 percent of keys move. Adding a node to a four-node ring should move about one fifth, which is 1/5 = 20%. That matches the theoretical expectation, with normal random variance.

Compare to hash(key) % N, where adding the fifth shard moves all the keys whose hash % 4 != hash % 5, which is around 80 percent. The difference (20 percent vs 80 percent) is what makes consistent hashing usable in a system where moving a key costs real work (cache invalidation, data shipment, rebalancing).

The naive ring is unfair

Here is where the textbook description starts to lie. In theory, N randomly placed points partition a circle into N evenly sized arcs. In practice, with N = 4, you get arcs of wildly different sizes. One node can own 40 percent of the keyspace; another owns 10 percent.

from collections import Counter
counts = Counter(ring.route(k) for k in keys)
for node, n in sorted(counts.items()):
    print(f"{node}: {n:5d} keys ({n/100:.1f}%)")

A typical run with four nodes: node-a: 38.4%, node-b: 26.1%, node-c: 13.7%, node-d: 21.8%. That is a 2.8x spread between the most-loaded and least-loaded node. Production teams cannot accept that. A node owning 38 percent of the load is going to fail first under traffic.

Why is the naive variant so bad at load? Variance. When you place four points uniformly at random on a circle of circumference one, the expected arc length is 0.25 for each, but the standard deviation of the arc lengths is also significant relative to the mean. With only four samples, the law of large numbers has not had time to do its work. The arc lengths follow a distribution where the largest arc can easily be three or four times the smallest. Add more nodes (or more vnodes per node, which is the same idea), and the variance of the arc lengths shrinks because you are averaging over more samples. The mathematics is the same as why a single coin flip has high variance and a thousand coin flips have low variance. Consistent hashing with one position per node has four samples; with two hundred positions per node times four nodes, it has eight hundred. The eight-hundred-sample version is much closer to its mean.

Virtual nodes: the fix and its cost

The fix is to give each physical node multiple positions on the ring. With 200 virtual nodes per physical node, four physical nodes occupy 800 ring positions, the arcs are much more uniform, and the load spread shrinks.

class Ring:
    """Ring with V virtual nodes per physical node."""
    def __init__(self, vnodes: int = 200):
        self._vnodes = vnodes
        self._ring: list[int] = []
        self._owner: dict[int, str] = {}

    def _vkey(self, node: str, i: int) -> str:
        return f"{node}::{i}"

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            h = _h(self._vkey(node, i))
            if h not in self._owner:
                insort(self._ring, h)
                self._owner[h] = node

    def remove(self, node: str) -> None:
        to_remove = [self._vkey(node, i) for i in range(self._vnodes)]
        for vkey in to_remove:
            h = _h(vkey)
            if h in self._owner:
                del self._owner[h]
                self._ring.remove(h)

    def route(self, key: str) -> str | None:
        if not self._ring:
            return None
        h = _h(key)
        i = bisect_right(self._ring, h) % len(self._ring)
        return self._owner[self._ring[i]]

Same loop with Ring(vnodes=200): node-a: 25.7%, node-b: 24.5%, node-c: 24.8%, node-d: 25.0%. The spread is now about 1.05x. That is what production systems use.

The cost is real:

Memory. Four physical nodes with 200 virtual nodes each is 800 entries in the sorted array and the owner dict. For 100 nodes with 200 vnodes that is 20,000 entries. Still fine, but visible.
Lookup. O(log V*N) instead of O(log N). For 20,000 entries log_2(20000) ~= 14.3 instead of log_2(100) ~= 6.6. About twice the comparisons per route. In practice the binary search runs in microseconds either way.
Add/remove. Adding a node now means inserting 200 ring entries instead of one. With insort that is O(V * log(V*N)). Still fast for the scale where consistent hashing applies (hundreds of nodes), painful at the scale of millions of physical owners (but nothing else uses millions of physical owners either).

The number 200 is not magical. The standard advice is between 100 and 500 virtual nodes per physical node. Ketama (the consistent-hash library that came out of the memcached community) historically used 160. A figure I have seen quoted is that with 100 vnodes per node and 10 nodes, the load is within about 10 percent across nodes; with 200 vnodes, within about 5 percent. The trade-off is the usual one: more vnodes, more memory, smoother load.

Bounded movement, with virtual nodes

I want to verify the same property holds with virtual nodes. Add a fifth node to the four-node ring, count moved keys.

ring = Ring(vnodes=200)
for n in ["node-a", "node-b", "node-c", "node-d"]:
    ring.add(n)
before = {k: ring.route(k) for k in keys}
ring.add("node-e")
moved = sum(1 for k in keys if before[k] != ring.route(k))
print(f"with 200 vnodes, adding node-e moved {moved/100:.1f}% of keys")

Roughly 19 to 21 percent. Same as the naive case. Virtual nodes do not change the bounded-movement property, only the load uniformity. That is the lesson: the two properties are independent. Naive consistent hashing is bounded-movement but uneven; virtual-node consistent hashing is bounded-movement and even.

Where this gets used in practice

Three real-world systems where this lives, with the caveats I would put on each:

In a sharded cache like memcached or Redis, consistent hashing lives in the client. Each client computes the same ring, so each client routes the same key to the same shard without coordination. When a node fails, only the keys mapped to that node's vnodes are remapped; the other 80 percent of keys stay where they are, unaffected by the failure. This is the canonical use case and the reason most caches with more than three nodes do not use modulo sharding.

In a sharded database (the Cassandra family of designs), the ring is the data placement strategy. Each row has a partition key, the key hashes into a position on the ring, and the node owning that position holds the data. Moving a node into the cluster moves only the keys near that node's vnodes; the rest of the cluster keeps serving reads. The cost is that range queries are awkward (logically adjacent keys end up on different shards).

In an object store like a custom S3-style backend, consistent hashing decides which storage node holds which object. Often combined with replication: instead of routing a key to one node, route it to the next K nodes clockwise. That is also bounded-movement-friendly, because the K-replica set for a key changes only slightly when one node joins or leaves.

I want to be careful about citing specific real-system internals. Some real systems (Cassandra and DynamoDB lineage) are documented as using consistent-hashing-style placement. Others have more bespoke schemes. The point of this article is the algorithm and its properties, not a tour of which company uses it. If you are working in one of those systems, read its actual sharding doc.

What virtual nodes do not fix

Two things consistent hashing does not solve, that I have seen people assume it does:

Hot keys. If one key is read 100x more than the average key, all the consistent-hashing-and-vnodes work in the world will not help, because that key still routes to one node. The fix is replication or read-path caching at a different layer, not ring-shape tweaks.
Heterogeneous capacity. If node-a has twice the CPU of node-b, you want node-a to own twice as many keys. The fix is to give node-a twice as many virtual nodes (Ring(vnodes=400) for the big node, Ring(vnodes=200) for the small node). Most production consistent-hash libraries support a per-node weight for exactly this reason.

If you implement a toy without weights, write a comment in the README that the toy assumes uniform capacity. Future-you will thank present-you when someone tries to mix machine sizes and the load is suddenly off.

A third operational pain I learned about the hard way: the ring is shared state. Every client (in a client-side hashing setup) needs to compute the same ring to route the same key to the same node. If two clients have different views of the cluster (one knows about the new node, one doesn't), they will route the same key to different shards during the gap. For a cache that is invalidation chaos. For a sharded database that is data going to two places. The standard fix is a coordinator (a service registry, a Zookeeper-style watcher) that all clients consult, and a bounded propagation delay during which traffic is allowed to be slightly inconsistent. Designing the coordinator is more work than designing the ring; if I were doing this from scratch in 2026 I would lean on a managed service mesh or an existing service discovery layer rather than building it. The algorithm is the easy part.

A tighter implementation idea

The naive Ring.remove is O(V) per removed vnode because list.remove scans linearly. For 200 vnodes that is 200 * len(ring) operations. A real implementation either uses a sorted structure that supports O(log N) removal (a balanced BST or a treap) or rebuilds the ring after a batch of removals. For a toy, the linear scan is fine; for a production library, it is not.

The other production concern is hash quality. MD5 and SHA-1 are fine for this use case (we are not relying on cryptographic security). The choice of which prefix bits to use matters more for the load uniformity than for correctness. Most libraries use 32-bit ring positions, which is plenty.

The mental model that makes this click

The thing I would want a junior engineer to take away is this: consistent hashing is not about the circle. The circle is a visualization. The actual property is "a hash function maps keys to a key-space, and node ownership of regions of that key-space is robust to small numbers of nodes joining or leaving." The circle is one way to draw that. The arc is one way to draw a node's ownership region. The hash function is the only thing doing real work.

I have run into engineers who can recite the ring metaphor but cannot answer "why move only 1/(N+1) of the keys?". The answer is geometric: a new node's vnode positions partition existing arcs into smaller arcs; only the keys that fell in the boundary regions need to move. The arc-partitioning intuition is the load-bearing one.

What I would build next, after the toy

If I were turning the toy into a library, three things would go in next:

Production extensions, in priority order
  1. Per-node weights for heterogeneous capacity (more vnodes for bigger nodes).
  2. Replication: route(key, k) returns the next k unique physical nodes clockwise, for fault tolerance.
  3. Hash quality: use a faster non-cryptographic hash (xxhash, MurmurHash3) for routing-only paths.

The first two affect correctness (or at least durability and balance); the third affects throughput. None of them change the bounded-movement guarantee.

Where I have actually used this

The case where I reached for consistent hashing and was glad I did: sharding a write-heavy event ingestion service across eight Kafka topics with custom routing. The default Kafka producer's round-robin partition selection meant we had no co-location of related events; switching to consistent hashing on the user ID gave us all-events-from-one-user-on-one-partition ordering, and adding a partition (going from 8 to 12) only moved one third of the user load instead of all of it.

The case where I would not reach for it: any time the cluster size is fixed, the keyspace is small, and rebalance latency is not a concern. Three-node Redis with a fixed list of 100 keys does not need consistent hashing; modulo sharding is fine. The whole point of the algorithm is to handle resize cheaply, and if you never resize, the algorithm is ceremony.

A claim to take home

Consistent hashing is the right algorithm when the cost of moving a key is significant and the cluster size is changeable. That is most distributed caches, most sharded databases, most distributed object stores. It is the wrong algorithm when the cluster is fixed; modulo is simpler. The ring drawing in the textbook is a teaching aid, not a load-balancing strategy. The load uniformity comes from virtual nodes, the resize-friendliness comes from the hash, and the algorithm is two ideas glued together rather than one. Most failures I see in implementations are from teams who copied the ring drawing without reading the section on virtual nodes, then wondered why one shard always melted under load.

If I were teaching this in an interview-prep workshop, I would tell people to write the toy in twenty minutes, run the bounded-movement test, run the load-distribution test, and then read the resulting numbers back to the room. The numbers are the lesson. Twenty percent moved on add, five percent load spread with two hundred vnodes, an order of magnitude better than modulo sharding on the relevant axis. That is the whole pitch. The thirty pages of textbook prose around the ring metaphor are a teaching scaffold; the actual algorithm fits on a napkin and the actual reason it earns its complexity fits in two numbers.

Back to Articles