by MrPowers 885 days ago
Yep, it is re-inventing database systems and you raise a great question.

At first glance, Delta Lake seems inferior to a database. Most databases support multi-table transactions, while Delta Lake only supports transactions on a single table. ACID transaction support is nothing new for a database.

Delta Lake is useful when datasets are large and you want to keep costs low.

There are organizations ingesting hundreds of terabytes, or even petabytes, of data into a Delta table every day. They're able to ingest data, perform upserts, and build realtime pipelines with this architecture.

Delta Lake is also free, so you only have to pay for storing the files in the cloud. This is usually a lot cheaper than a database.

Data warehouses are often packaged with a certain amount of shared RAM/storage. This can be a problem for a team with large workflows from many users. It's annoying to share compute with someone who's running a large experiment.

These are the main reasons enterprises shifted to data lakes and now Lakehouse storage systems. See this paper to learn more: https://www.cidrdb.org/cidr2021/papers/cidr2021_paper17.pdf