Hacker News new | ask | show | jobs
by riku_iki 1056 days ago
> Hydra does take up ~17.5G for that benchmark and "PostgreSQL tuned" about 120G

you can run pg on compressed filesystem

1 comments

I'm sure you can, but AFAIK neither uses compression in that benchmark so it's a fair comparison. Even if filesystem compression would reduce that to 17.5G (doubtable), it won't be free in terms of CPU cycles, and no matter what it's still ~120G to load in memory, bytes to scan/update, etc.
my bet is that hydra uses compression inside already, otherwise it is hard to explain where difference comes from.

> it won't be free in terms of CPU cycles

it can reduce IO traffic significantly, and it can be very positive trade off depending on circumstances.

I had assumed that PostgreSQL is so much larger because it creates heaps of indexes (which is probably also why inserts are so much slower for it), but I don't really have a good way to confirm that quickly.
one can choose to not create "heaps of indexes".
At which point your performance will drop like a brick for these types of queries – I'm pretty sure these indexes weren't added for the craic.
it depends on your query obviously.

In general, I did very deep benchmarking of pg, clickhouse and duckdb, and I sure didn't make stupid mistakes like this: https://news.ycombinator.com/item?id=36990831

My dataset has 50B rows and 2tb of data, and I think columnar dbs are very overhiped and I chose pg because:

- pg performance is acceptable, maybe 2-5x times slower than clickhouse and duckdb on some queries if pg is configured correctly and run on compressed storage

- clickhouse and duckdb start falling apart very fast because they specialized on very narrow type of queries: https://github.com/ClickHouse/ClickHouse/issues/47520 https://github.com/ClickHouse/ClickHouse/issues/47521 https://github.com/duckdb/duckdb/discussions/6696