| I am not sure I agree with the general idea that Postgres can't scale, or even the weaker claim that it is hard to scale. Even in 2008 people were running petabyte-scale warehouses using Postgres: https://www.toolbox.com/tech/data-management/blogs/2-petabyt... Since 2008, improvements in parallel query execution (and numerous other improvements) in the core project, plus the availability of forks and extensions that abstract or modify various pieces to improve scalability (see Citus and Timescale), have made it easier than ever to scale Postgres to some truly staggering heights. While I wouldn't want to speak in absolutes, there are very few applications where I think Postgres wouldn't be a viable choice as a data warehouse. Emphasis on warehouse, as I wouldn't want to suggest Postgres as an ideal candidate for a data lake. The difference between them, for me, is whether or not the data is structured/processed, similar to the definition in this article: https://medium.com/@distillerytech/data-warehouse-vs-data-la... Personally, I have experience scaling core PostgreSQL (9.4) to handle ingestion of monitoring data for web servers to the tune of 2-3 terabytes a day. Not the grandest of scales, but enough to have seen a few bumps along the way... and, for what it's worth, I think it is surprisingly easy to scale. I wouldn't want to sign up to scale Postgres to handle exabyte data loads, but single-digit petabytes? Sure. https://techcommunity.microsoft.com/t5/azure-database-for-po... And at petabyte scale, I personally think it qualifies as a data warehouse. |