|
|
|
|
|
by papaf
5951 days ago
|
|
Two things surprise me about this article - probably because I've misunderstood it and don't see the big picture. One is that there are master and slave databases and searches are done off the master - I've always seen them done off the slaves in other systems. The other is that they state that using MD5 doesn't allow for horizontal scaling. One of the qualities of MD5 is that all bits have an equal probability of being 0/1. Surely the last 1 or 2 bits can be used to indicate which server is holding the data? |
|
You can just use a few bits from an MD5 hash to decide server as long as you know how many servers you're going to have up front. The problem is that if you later wanted to add or remove a server, you would need to come up with a new scheme and move every piece of data around so it's on the right server (which would take days/weeks).
The more scalable/flexible solution is to use a consistent hashing algorithm (check out some of the papers on Chord) so that adding or removing a server doesn't require you to move as much data around.