Skip to main content
SDMastery
intermediate10 min readUpdated 2026-06-03

Consistent Hashing

Consistent hashing is the backbone of distributed caching (Memcached), distributed databases (DynamoDB, Cassandra), load balancing, and CDNs.

Consistent Hashing system design overview showing key components and metrics
High-level overview of Consistent Hashing
Consistent Hashing

The Core Idea

Consistent hashing is a distributed hashing technique that minimizes key redistribution when nodes are added or removed from a system. In traditional hashing (key % N), adding one server reshuffles almost all keys. With consistent hashing, only K/N keys are remapped (where K is total keys and N is total nodes).

Step-by-Step Walkthrough

Consistent Hashing system architecture with service components and data flow
System architecture for Consistent Hashing

Imagine a clock face numbered 0 to 360. Hash each server's name to get its position: Server A → 90°, Server B → 210°, Server C → 330°. Hash each key to get its position: key1 → 120°. Walk clockwise from 120° — the first server you hit is Server B (210°), so key1 is stored on Server B.

When Server B is removed, only the keys between Server A (90°) and Server B (210°) need to move — they shift to Server C. Keys on Server A and Server C are unaffected. This is the magic of consistent hashing: minimal disruption.

Virtual nodes solve hot spots. Instead of 3 points on the ring, you place 300 (100 per server). This distributes load evenly even when physical servers have different capacities.

Why This Approach Wins

Step-by-step diagram showing how Consistent Hashing works in practice
How Consistent Hashing works step by step
  • Hash ring: Both keys and servers are mapped onto a circular hash space (0 to 2^32). A key is assigned to the first server encountered clockwise on the ring.
  • Minimal disruption: When a server is added or removed, only the keys between it and its predecessor need to be reassigned — approximately K/N keys.
  • Virtual nodes: To solve uneven distribution, each physical server is mapped to multiple positions on the ring (100-200 virtual nodes). This ensures balanced load even with few physical servers.
  • Replication: In databases like Cassandra, data is replicated to the next N nodes clockwise on the ring for fault tolerance.

In Production

Amazon DynamoDB uses consistent hashing to partition data across storage nodes. Adding capacity just redistributes a fraction of the data.

Akamai CDN pioneered consistent hashing to route requests to the nearest cache server holding the requested content.

Comparison table for Consistent Hashing showing key metrics and tradeoffs
Comparing key aspects of Consistent Hashing

Discord uses consistent hashing to assign guilds (servers) to backend processes for message handling.

Tradeoffs and Limitations

  • Load balance vs. Complexity: Virtual nodes improve balance but add memory overhead for the routing table.
  • Consistency vs. Availability: During rebalancing, some keys may be temporarily unavailable or stale.
  • Simplicity vs. Flexibility: Simple mod hashing is easier to implement but breaks badly during scaling.

Production Gotchas

Data flow diagram for Consistent Hashing showing request and response paths
Data flow through Consistent Hashing
  1. Not using virtual nodes — bare consistent hashing can lead to very uneven distribution
  2. Using a poor hash function — MD5 or SHA are fine; simple modular hashing is not
  3. Forgetting about replication — consistent hashing alone does not provide fault tolerance

The Interview Angle

  1. Explain how consistent hashing works and why it is better than modular hashing.
  2. What problem do virtual nodes solve?
  3. How is consistent hashing used in distributed caching?
  4. How would you handle replication with consistent hashing?
  5. What happens when a node fails in a consistent hashing ring?

Next Up

Key components of Consistent Hashing with roles and responsibilities
Key components of Consistent Hashing

The Real-World Incident That Made This Famous

The concept of consistent hashing was formalized in a 1997 paper by Karger et al. at MIT, but it became famous through Amazon's 2007 Dynamo paper. Amazon needed to distribute shopping cart data across hundreds of servers. The naive approach — hash the key, mod by number of servers — had a catastrophic flaw: adding or removing one server caused almost every key to remap to a different server, invalidating the entire cache and overloading the database.

Interview tips for Consistent Hashing system design questions
Interview tips for Consistent Hashing

During the 2004 holiday season, Amazon experienced exactly this problem. A server failure in their storage fleet triggered a massive cache miss storm because their key distribution relied on mod-N hashing. When one node went down and N changed from 100 to 99, approximately 99% of keys were remapped. The resulting database load spike cascaded across services, degrading the shopping experience during their highest-revenue period.

The Dynamo team solved this with consistent hashing on a virtual ring. Each physical server gets 150+ virtual nodes on the ring. When a server fails, only the keys that were assigned to that specific server get redistributed to the next nodes on the ring — approximately 1/N of the total keys, not N-1/N. This reduced the cache invalidation impact from 99% to about 1%. The Dynamo paper became one of the most cited systems papers ever, and consistent hashing became a standard building block for distributed databases (Cassandra), caches (Memcached), CDNs (Akamai), and load balancers.

How Senior Engineers Think About This

Think of consistent hashing as a circular clock with 2^32 positions. Each server is placed at specific positions on the clock (determined by hashing the server's ID). Each data key is also hashed to a position on the clock, and it is assigned to the first server you encounter when moving clockwise from that position.

Decision guide showing when to use Consistent Hashing and when to avoid
When to use Consistent Hashing

The key insight: when a server is removed, only the keys between it and the previous server on the ring need to move — and they simply move to the next server clockwise. When a server is added, it takes over keys from its clockwise neighbor. The rest of the ring is untouched.

Virtual nodes are the critical optimization. Without virtual nodes, you get uneven distribution (one server might own 30% of the ring while another owns 5%). By assigning each physical server 100-200 virtual positions on the ring, the load distribution becomes nearly uniform. The math works out well: with 150 virtual nodes per physical server, the standard deviation of load is about 5-10%, which is acceptable for most systems.

Senior engineers think about consistent hashing as solving the minimal disruption problem. When your system's topology changes (servers added, removed, or failed), you want to move the absolute minimum amount of data. This is critical for CDNs (where re-fetching content from origin is expensive), distributed caches (where cache misses hit the database), and distributed databases (where data migration takes time and consumes bandwidth).

Common Interview Mistakes

Pros and cons analysis of Consistent Hashing for system design decisions
Advantages and disadvantages of Consistent Hashing

Mistake 1: Not explaining why mod-N fails. Start by explaining the problem: with mod-N hashing, adding one server remaps almost every key. Only then introduce consistent hashing as the solution. This shows you understand the motivation.

Mistake 2: Forgetting virtual nodes. Basic consistent hashing with one point per server creates extremely uneven distribution. Always mention virtual nodes (100-200 per physical server) and explain why they help.

Mistake 3: Not discussing replication. In practice, keys are replicated to N successive nodes on the ring (typically N=3). You need to explain how replication works with consistent hashing and what happens when the next N nodes include virtual nodes from the same physical server (skip it).

Mistake 4: Confusing consistent hashing with rendezvous hashing. Rendezvous hashing (highest random weight) is an alternative that also provides minimal disruption. Know the difference: consistent hashing uses a ring, rendezvous hashing scores every server for every key and picks the highest score.

Real-world companies using Consistent Hashing in production systems
Real-world examples of Consistent Hashing

Mistake 5: Not being able to analyze the math. You should be able to explain that with K keys and N servers, adding a server moves approximately K/N keys (the theoretical minimum). This is what makes consistent hashing optimal.

Production Checklist

  • Use 100-200 virtual nodes per physical server for even load distribution
  • Use a high-quality hash function (MD5, SHA-1, or xxHash) — avoid CRC32 which has poor distribution
  • Implement bounded load consistent hashing if you need strict load limits (Google's 2017 paper)
  • When a node fails, ensure only its keys are redistributed — verify your implementation does not trigger a full rehash
  • Monitor per-node key count and request rate — detect hot spots where certain hash ranges get disproportionate traffic
  • Plan for ring rebalancing when adding nodes: pre-warm the new node by copying data from the nodes that will give up key ranges
  • Implement replication by storing keys on N consecutive distinct physical servers on the ring
  • Keep a persistent record of ring membership so that nodes can reconstruct the ring after a restart
  • Test the impact of adding and removing nodes in staging — verify that only the expected percentage of keys are remapped
  • Consider jump consistent hashing for static clusters where nodes never fail (faster, zero memory overhead, perfect distribution)

Read the original source | Content from System-Design-Overview

Consistent Hashing in .NET

Here is a production-ready consistent hashing implementation in C#:

text
public class ConsistentHash<T>
    private readonly SortedDictionary<uint, T> _ring = new();
    private readonly int _virtualNodes;

    public ConsistentHash(IEnumerable<T> nodes, int virtualNodes = 150)
        _virtualNodes = virtualNodes;
        foreach (var node in nodes) AddNode(node);

    public void AddNode(T node)
        for (int i = 0; i < _virtualNodes; i++)
            uint hash = ComputeHash($"{node}-vn{i}");
            _ring[hash] = node;

    public T GetNode(string key)
        uint hash = ComputeHash(key);
        // Find first node clockwise from the key's hash
        foreach (var kvp in _ring)
            if (kvp.Key >= hash) return kvp.Value;
        return _ring.First().Value; // Wrap around

    private static uint ComputeHash(string key)
        using var md5 = System.Security.Cryptography.MD5.Create();
        var bytes = md5.ComputeHash(Encoding.UTF8.GetBytes(key));
        return BitConverter.ToUInt32(bytes, 0);

Where .NET developers use this: The Microsoft Orleans framework (virtual actor model) uses consistent hashing internally to distribute grains (actors) across silos (servers). When you build a game backend or IoT system with Orleans, consistent hashing ensures that each actor instance lives on a specific server without explicit routing.

StackExchange.Redis also uses consistent hashing when connecting to Redis Cluster — the client library hashes keys to determine which Redis node to query, transparently routing your requests.

External Resources

Original Sourcearticle