| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by jurgenkesker 906 days ago
	Wondering why nothing is said about compression? For MySQL you have Rocksdb with high ZSTD compression. Is the Postgres solution to just use compressed filesystem like ZFS? I could not find much info when researching Postgress, except for Toast compression, which didn't seem very strong. I have a huge DB which would take many TB but now runs on a 500GB footprint. Thanks to Rocksdb/MySQL. Pondering a move to Postgress, but seemed like a step down in this regard.

4 comments

insanitybit 906 days ago

This is what keeps me from postgres for large amounts of data. I use postgres to store metadata or small data (ie: "users logged in currently", "customers", etc).

If you want large datasets (petabytes) you really need to look elsewhere to something with better compression support or that tiers its data off to S3.

It's possible that with bcachefs we're like a decade away from "good fs in mainline kernel with fs compression" but right now it's not a great situation.

link

mrj 906 days ago

I'm not familiar with Rocksdb but if it's just a KV store you want, I'd probably just use Redis with compression. It is hard to beat.

Otherwise, Timescaledb offers compressible table chunks using table inheritance. It's really pretty slick, depending on your use-case.

link

karmakaze 906 days ago

RocksDB is a fork of Google's LevelDB, a KV store using Log Structured Merge Tree (LSM tree) which is great for high write workloads. The MySQL storage engine that uses RocksDB is called MyRocks[0].

[0] http://myrocks.io/

link

rad_gruchalski 906 days ago

YugabyteDB is a distributed Postgres and their storage layer is called DocDB, a slightly modified RocksDB.

link

philippemnoel 906 days ago

I think bringing RocksDB into Postgres as a Table access method would really be a powerful feature, as an alternative to heap

link

jtc331 905 days ago

I think the page features patch (which is a precursor to TDE, which Jonathan does mention) likely paves the way for better compression options.

link

riku_iki 906 days ago

so, why wouldn't you use btrfs or zfs?

link

abhibeckert 906 days ago

Because compression is robbing Peter to pay Paul. It saves money on storage, but costs money in compute.

And a lot of data is already compressed (images, etc) so might increase your compute cost without any storage savings.

Filesystems are often the wrong level in the stack to do compression. You want individual control over what data is compressed.

link

riku_iki 906 days ago

> so might increase your compute cost without any storage savings.

I agree with this point, but I think those online compression algorithms add very little overhead, so it is mostly about if Postgres adds enough other benefits compared to your current DB.

The bigger issue potentially is that ext4 is faster than zfs and btrfs itself regardless of compression.

link

abhibeckert 904 days ago

The compression penalty is actually massive. The bytes you're trying to read might require reading other bytes elsewhere on the disk in order to decompress them.

It's not just compute power, I was generalising with that statement. The real world speed hit can be as bad as read speeds an order of magnitude slower than if you disable compression.

You really only want to compress data where it makes sense to compress it.

link

riku_iki 904 days ago

> The bytes you're trying to read might require reading other bytes elsewhere on the disk in order to decompress them.

data is compressed and decompressed in blocks..

link