III · Hashing & lookup

Consistent hashing

Hashing that survives node changes.

The story

Akamai's founders (Karger and his MIT classmates) needed to map web requests to a fluid pool of servers — without rehashing the entire keyspace when one came or went. The answer: hash both keys and servers onto a circle; each key goes to the next clockwise server. Adding or removing a server only moves K/N keys.

How it works

Hash each node to one or more positions on a [0, 2^32) circle. Hash each key the same way. Each key is owned by the first node clockwise from its position. Virtual nodes (each physical node placed at many circle positions) smooth the distribution.

Where it lives

Every distributed cache: Memcached cluster sharding, Redis Cluster, Cassandra, DynamoDB, Riak. Most CDNs. Akamai's original use case.

The key insight

The whole thing was Karger's 1996 thesis. A formerly minor academic problem, and now half of distributed systems run on it.