23 03 2016
I think consistent hashing is pretty fascinating. It lets you define a ring of machines that shard out data by a hash value. Imagine that your hash space is 0 -Int.Max, and you have 2 machines. Well one machine gets all values hashed from 0 -Int.Max/2 and the other from Int.Max/2 -Int.Max. Clever. This is one of the major algorithms of distributed systems like cassandra and dynamoDB.
For a good visualization, check out this blog post.
The fun stuff happens when you want to add replication and fault tolerance to your hashing. Now you need to have replicants and manage when machines join and add. When someone joins, you need to re-partition the space evenly and re-distribute the values that were previously held.
Something similar when you have a node leave, you need to make sure that whatever it was responsible for in its primray space … Read more