Hacker News new | ask | show | jobs
by qaq 3401 days ago
Maybe I am missing some special point but a decent PG box will do 1,000,000+ TPS vs 3,000+ TPS here. When pgXact lands it will do close to 2,000,000 TPS. So reading all the posts about the amazing new db "X" that can do about N times less than PG on a multi-node cluster I get confused why the numbers are being presented as some sort of achievement.
3 comments

That doesn't sound right. Even the newest NVMe devices can't do 1M writes per second; they're maxing out at around 330k IOPS.

The 1M "TPS" you're referring to is a read-only benchmark (e.g. http://akorotkov.github.io/blog/2016/05/09/scalability-towar...). Those are reads (most likely from the buffer cache), not writes or transactions in any real sense.

330K iops for a single device you are very unlikely to be running a single device. There are Fusion IO models that can do 1M IOPS but they are on the exotic side. If you are optimising for throughput you can configure commit_delay so you will fsync multiple commits.
FaunaDB is faster in an unclustered, unreplicated configuration too, but that's not what FaunaDB is really for.

We should track disk IOPS, though, so you can do an apples-to-apples comparison of how much low-level throughput the database is driving. I believe the instance store disks in the EC2 C3 hardware class can support about ~20K write IOPS each.

Relaxing disk commits is a quick route to data loss. Might as well use mongo at that point.
commit_delay does not relax anything it's just increases latency to group multiple commits
Like others have commented, those numbers seem to be too high for writes.

On the other hand, the Fauna numbers don't seem that impressive to me. On a mid-2011 Macbook Air, I get 2600 transactions per second (read-committed) in PostgreSQL 9.6. Setup is as follows:

  CREATE TABLE IF NOT EXISTS foo(a TEXT, b TEXT, c TEXT, d TEXT);
  CREATE INDEX IF NOT EXISTS idx_foo_a ON foo(a);
  CREATE INDEX IF NOT EXISTS idx_foo_b ON foo(b);
  CREATE INDEX IF NOT EXISTS idx_foo_c ON foo(c);
  CREATE INDEX IF NOT EXISTS idx_foo_d ON foo(d);

  -- prepared statement, the inserted strings are 4 chars wide
  INSERT INTO foo(a, b, c, d)
  VALUES
    ($1, $2, $3, $4),
    ($5, $6, $7, $8),
    ($9, $10, $11, $12),
    ($13, $14, $15, $16);
These numbers are for one thread doing the writing.

Am I missing something?

Well, you're missing that in Faunas case the writes are durably stored on N machines. I.e. their system provides fault tolerance in case a machine fails. You can't really do the same thing with postgres (without trading off full ACID compliance).
True. Thanks for pointing that out. I think they probably should be putting the emphasis in their marketing on the fault tolerance i.s.o. the performance.
you sure can get reasonable close with 2 phase commit, and starting with next release and quorum commit you will be able to do exactly that.
Quorum commit is not the same as distributed strict serialization because replicas can desynchronize.

Additionally you have the bottleneck of a single master. Will it be possible to do a quorum read as well, and will every transaction on the master be doing it? Then you are starting to get closer, although many anomalies are still possible.

I agree that your PG numbers don't sound likely/factual -- the reason for your confusion is probably that somebody gave you untrue performance numbers for postgres or you're not comparing the same things. Is the 1m+ TPS something you measured yourself or "heard from a friend"?

If you ran the benchmark yourself, how did you achieve 1m durable writes/sec on a postgres machine/instance? [It's quite an achievement] On what kind of crazy hardware? How large was each write/row? Did you use the postgres network protocol to perform the writes?

By extrapolation writes are IO bound you don't need crazy expensive things to get to the needed number of IOPS Intel 750 is 230,000 random writes @ $320 per PCI-E SSD. 9 drive config is over 2,000,000 IOPS for less than 3K.
Yeah it's a bit more complex than that... The disk "IOPS" number on the box doesn't translate 1:1 or even linearly to number of committed durable transactions per second. You should try this with postgres and see how it goes.
If you land me 30K for the hardware I would be glad to do it :). For writing WAL it scales reasonably close to linear