intermediate12 min readUpdated 2026-06-08

Consistent Hashing

Consistent hashing is the backbone of distributed caching (Memcached), distributed databases (DynamoDB, Cassandra), load balancing, and CDNs.

Consistent Hashing

Consistent hashing distributes data across nodes so that adding or removing a node only moves a small fraction of keys, not all of them. Traditional hash-mod-N remaps almost every key when N changes. Consistent hashing maps both keys and nodes to a virtual ring, and each key goes to the nearest node clockwise — making it essential for distributed caches, databases, and load balancers.

Aspect	Details
What it is	Hash-based distribution that minimizes data movement when nodes are added or removed
When to use	Distributed caches (Redis cluster), partitioned databases, CDN routing, session-aware load balancers
When NOT to use	Small fixed number of servers; data requiring ordered access; when uniform distribution is unnecessary
Real-world example	Amazon DynamoDB partitions data with consistent hashing; Akamai CDN routes to edge servers with it
Interview tip	Always mention virtual nodes — they solve the uneven distribution problem that basic consistent hashing has
Common mistake	Forgetting virtual nodes, leading to unbalanced shards when a physical node joins or leaves
Key tradeoff	Near-zero data movement when scaling vs. slightly more complex routing logic

The Core Idea

Consistent hashing is a distributed hashing technique that minimizes key redistribution when nodes are added or removed from a system. In traditional hashing (key % N), adding one server reshuffles almost all keys. With consistent hashing, only K/N keys are remapped (where K is total keys and N is total nodes).

Step-by-Step Walkthrough

Hash ring with three physical nodes each having multiple virtual nodes, showing even distribution of keys across the ring and minimal key movement when a node is added — System architecture for Consistent Hashing

Imagine a clock face numbered 0 to 360. Hash each server's name to get its position: Server A → 90°, Server B → 210°, Server C → 330°. Hash each key to get its position: key1 → 120°. Walk clockwise from 120° — the first server you hit is Server B (210°), so key1 is stored on Server B.

When Server B is removed, only the keys between Server A (90°) and Server B (210°) need to move — they shift to Server C. Keys on Server A and Server C are unaffected. This is the magic of consistent hashing: minimal disruption.

Virtual nodes solve hot spots. Instead of 3 points on the ring, you place 300 (100 per server). This distributes load evenly even when physical servers have different capacities.

Why This Approach Wins

Adding a node to a consistent hash ring: new node takes position on ring, only keys between new node and its predecessor are remapped, all other keys stay on their current nodes — How Consistent Hashing works step by step

Hash ring: Both keys and servers are mapped onto a circular hash space (0 to 2^32). A key is assigned to the first server encountered clockwise on the ring.
Minimal disruption: When a server is added or removed, only the keys between it and its predecessor need to be reassigned — approximately K/N keys.
Virtual nodes: To solve uneven distribution, each physical server is mapped to multiple positions on the ring (100-200 virtual nodes). This ensures balanced load even with few physical servers.
Replication: In databases like Cassandra, data is replicated to the next N nodes clockwise on the ring for fault tolerance.

In Production

Amazon DynamoDB uses consistent hashing to partition data across storage nodes. Adding capacity just redistributes a fraction of the data.

Akamai CDN pioneered consistent hashing to route requests to the nearest cache server holding the requested content.

Comparison table for Consistent Hashing contrasting approaches, tradeoffs, and when to use each — Comparing key aspects of Consistent Hashing

Discord uses consistent hashing to assign guilds (servers) to backend processes for message handling.

Tradeoffs and Limitations

Load balance vs. Complexity: Virtual nodes improve balance but add memory overhead for the routing table.
Consistency vs. Availability: During rebalancing, some keys may be temporarily unavailable or stale.
Simplicity vs. Flexibility: Simple mod hashing is easier to implement but breaks badly during scaling.

Production Gotchas

Key lookup: hash the key to a ring position, walk clockwise to find the first node, route request to that node, on node failure walk to next node clockwise — Data flow through Consistent Hashing

Not using virtual nodes — bare consistent hashing can lead to very uneven distribution
Using a poor hash function — MD5 or SHA are fine; simple modular hashing is not
Forgetting about replication — consistent hashing alone does not provide fault tolerance

The Interview Angle

Explain how consistent hashing works and why it is better than modular hashing.
What problem do virtual nodes solve?
How is consistent hashing used in distributed caching?
How would you handle replication with consistent hashing?
What happens when a node fails in a consistent hashing ring?

Next Up

Component diagram for Consistent Hashing showing each building block and its responsibility — Key components of Consistent Hashing

How to Explain This in an Interview

Here is how I would explain Consistent Hashing in a system design interview:

Consistent hashing solves a specific problem: how do you distribute data across N nodes so that changing N does not remap everything? Imagine a ring from 0 to 2^32. Hash each node and each key onto this ring. Each key maps to the first node found clockwise from its hash position. When you add a node, only the keys between the new node and its predecessor move — everything else stays put. The practical improvement interviewers expect you to mention is virtual nodes: each physical node gets 100-200 positions on the ring, which smooths out the distribution. Without virtual nodes, one physical node going down dumps all its keys onto a single neighbor, potentially overloading it.

The Real-World Incident That Made This Famous

The concept of consistent hashing was formalized in a 1997 paper by Karger et al. at MIT, but it became famous through Amazon's 2007 Dynamo paper. Amazon needed to distribute shopping cart data across hundreds of servers. The naive approach — hash the key, mod by number of servers — had a catastrophic flaw: adding or removing one server caused almost every key to remap to a different server, invalidating the entire cache and overloading the database.

Interview preparation checklist for Consistent Hashing with key points to mention and mistakes to avoid — Interview tips for Consistent Hashing

During the 2004 holiday season, Amazon experienced exactly this problem. A server failure in their storage fleet triggered a massive cache miss storm because their key distribution relied on mod-N hashing. When one node went down and N changed from 100 to 99, approximately 99% of keys were remapped. The resulting database load spike cascaded across services, degrading the shopping experience during their highest-revenue period.

The Dynamo team solved this with consistent hashing on a virtual ring. Each physical server gets 150+ virtual nodes on the ring. When a server fails, only the keys that were assigned to that specific server get redistributed to the next nodes on the ring — approximately 1/N of the total keys, not N-1/N. This reduced the cache invalidation impact from 99% to about 1%. The Dynamo paper became one of the most cited systems papers ever, and consistent hashing became a standard building block for distributed databases (Cassandra), caches (Memcached), CDNs (Akamai), and load balancers.

How Senior Engineers Think About This

Think of consistent hashing as a circular clock with 2^32 positions. Each server is placed at specific positions on the clock (determined by hashing the server's ID). Each data key is also hashed to a position on the clock, and it is assigned to the first server you encounter when moving clockwise from that position.

Decision guide for when to choose Consistent Hashing and when alternative approaches are better — When to use Consistent Hashing

The key insight: when a server is removed, only the keys between it and the previous server on the ring need to move — and they simply move to the next server clockwise. When a server is added, it takes over keys from its clockwise neighbor. The rest of the ring is untouched.

Virtual nodes are the critical optimization. Without virtual nodes, you get uneven distribution (one server might own 30% of the ring while another owns 5%). By assigning each physical server 100-200 virtual positions on the ring, the load distribution becomes nearly uniform. The math works out well: with 150 virtual nodes per physical server, the standard deviation of load is about 5-10%, which is acceptable for most systems.

Senior engineers think about consistent hashing as solving the minimal disruption problem. When your system's topology changes (servers added, removed, or failed), you want to move the absolute minimum amount of data. This is critical for CDNs (where re-fetching content from origin is expensive), distributed caches (where cache misses hit the database), and distributed databases (where data migration takes time and consumes bandwidth).

Common Interview Mistakes

Tradeoff analysis for Consistent Hashing listing advantages, disadvantages, and real-world considerations — Advantages and disadvantages of Consistent Hashing

Mistake 1: Not explaining why mod-N fails. Start by explaining the problem: with mod-N hashing, adding one server remaps almost every key. Only then introduce consistent hashing as the solution. This shows you understand the motivation.

Mistake 2: Forgetting virtual nodes. Basic consistent hashing with one point per server creates extremely uneven distribution. Always mention virtual nodes (100-200 per physical server) and explain why they help.

Mistake 3: Not discussing replication. In practice, keys are replicated to N successive nodes on the ring (typically N=3). You need to explain how replication works with consistent hashing and what happens when the next N nodes include virtual nodes from the same physical server (skip it).

Mistake 4: Confusing consistent hashing with rendezvous hashing. Rendezvous hashing (highest random weight) is an alternative that also provides minimal disruption. Know the difference: consistent hashing uses a ring, rendezvous hashing scores every server for every key and picks the highest score.

Production deployment examples of Consistent Hashing at companies like Netflix, Google, and Amazon — Real-world examples of Consistent Hashing

Mistake 5: Not being able to analyze the math. You should be able to explain that with K keys and N servers, adding a server moves approximately K/N keys (the theoretical minimum). This is what makes consistent hashing optimal.

Production Checklist

Use 100-200 virtual nodes per physical server for even load distribution
Use a high-quality hash function (MD5, SHA-1, or xxHash) — avoid CRC32 which has poor distribution
Implement bounded load consistent hashing if you need strict load limits (Google's 2017 paper)
When a node fails, ensure only its keys are redistributed — verify your implementation does not trigger a full rehash
Monitor per-node key count and request rate — detect hot spots where certain hash ranges get disproportionate traffic
Plan for ring rebalancing when adding nodes: pre-warm the new node by copying data from the nodes that will give up key ranges
Implement replication by storing keys on N consecutive distinct physical servers on the ring
Keep a persistent record of ring membership so that nodes can reconstruct the ring after a restart
Test the impact of adding and removing nodes in staging — verify that only the expected percentage of keys are remapped
Consider jump consistent hashing for static clusters where nodes never fail (faster, zero memory overhead, perfect distribution)

Read the original source | Content from System-Design-Overview