agreed! PostgreSQL is the wrong tool for large data without indexes. It's also the wrong tool for ultra low latency access. And small, non-persistent data. And and and...
That said, over time PostgreSQL has wildly expanded the range for which it's suitable and if you can and want to bet on the future, it's often a better bet than niche systems.
It's also important to remember that PostgreSQL is decades ahead of other systems in data virtualization and providing backward-compatibility to applications after changes; pushing down computation near the data and avoiding moving billions of rows into middleware, including a world class query optimizer; concurrent data access; data safety and recovery; and data management and reorganization, including transactional DDL. Leaving this behind feels like returning to the stone age.
Yup. Even gross abuses of Redshift run fine with appropriate roll ups and caching. At a past job we did it “wrong enough” that it took a while for a more state of the art solution to catch up. This is not to say the abuse of Redshift should have been done, but AWS has been abused a lot and the engineers there have found a lot of optimizations for interesting workloads.
But to pick the wrong DB tool in the first place and bemoan it as “not scalable” is a bit like complaining that S3 made for a poor CDN without looking at how you’re supposed to use it with Cloudfront.
Is ZeroETL not in early stages, still? I heard it replicates everything. No filtering yet on parts of the binlog (tables/columns). But other than that, i like the idea.
(I would like to know, where their ZeroETL originated from, usually AWS picks up ideas somewhere and makes it work for their offerings to cash in. A universal replication tool.)
That said, over time PostgreSQL has wildly expanded the range for which it's suitable and if you can and want to bet on the future, it's often a better bet than niche systems.
It's also important to remember that PostgreSQL is decades ahead of other systems in data virtualization and providing backward-compatibility to applications after changes; pushing down computation near the data and avoiding moving billions of rows into middleware, including a world class query optimizer; concurrent data access; data safety and recovery; and data management and reorganization, including transactional DDL. Leaving this behind feels like returning to the stone age.