| > Like real distribution, when you add more servers writes distribute as well? There's a really interesting suggestion in The Mythical Man Month. He suggests that instead of hiring more programmers to work in parallel, maybe we should scale teams by keeping one person writing all the code but have a whole team supporting them. I don't know how well that works for programming, but with databases I think its a great idea. CPUs are obnoxiously fast. Its IO and memory which are slow. I could imagine making a single CPU core's job to be simply ordering all incoming writes relative to each other using strict serialization counters (in registers / L1 cache). If the stream of writes coming in (and going out) happened over DPDK or something, you could probably get performance on the order of 10m-100m counter updates per second. Then a whole cluster of computers sit around that core. On one side you have computers feeding it with writes. (And handling the retry logic if the write was speculatively misordered.) And on the other side you have computers taking the firehose of writes, doing fan-out (kafka style), updating indexes, saving everything durably to disk, and so on. If that would work, you would get full serializable database ordering with crazy fast speeds. You would hit a hard limit based on the speed of the fastest CPU you can buy, but I can't think of much software on the planet which needs to handle writes at a rate faster than 100m per second. And doesn't have some natural sharding keys anyway. Facebook's analytics engine and CERN are the only two which come to mind. |
Calvin is an interesting alternate design that puts "reach global consensus on transaction order" as its first priority, and derives pretty much everything else from that. Don't even need to bottleneck through a single CPU.
http://cs-www.cs.yale.edu/homes/dna/papers/calvin-sigmod12.p...
Like VoltDB, the one huge trade-off is that there's no `BEGIN ... COMMIT` interaction where the client gets to do arbitrary things in the middle. Which would be fine, if programming business logic in the database was saner than with Postgres.