by gigatexal 942 days ago
What about iceberg tables and a lake approach on GCS and then picking a querying engine?
2 comments

This is the way the industry is going: table formats such as Delta, Hudi, and Iceberg, stored on cloud object stores.

It works amazingly well, though it is certainly slower than ingesting the data so that it is stored and manipulated in an engine's native format.

These are terms I’m sorta familiar with but not sure about. Data lake = bunch of noise (everything), Iceberg = generated tables or views to read relevant/hot data from the lake?
Basically that’s it, yeah. If you can afford BigQuery, just use that. Otherwise, building off of blob storage and bolting on query engines and catalogs makes for a flexible approach, but I find BigQuery solves most problems rather well by just throwing money at the problem lol
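
For anyone else fuzzy on the lake vs. table format split, here is a rough toy sketch in plain Python. It is not how Iceberg is actually implemented and it uses no real Iceberg or GCS API; `OBJECT_STORE`, `DataFile`, `SNAPSHOT`, and `scan` are all made-up names for illustration. The idea it tries to show: the bucket holds everything, while the table format is a metadata layer listing which files belong to a table, along with per-file stats, so a query engine can skip files it does not need.

```python
from dataclasses import dataclass

# Toy stand-in for a bucket on an object store: path -> rows.
# (Hypothetical data; a real lake would hold Parquet/ORC/Avro files.)
OBJECT_STORE = {
    "events/part-0.parquet": [{"day": 1, "user": "a"}, {"day": 2, "user": "b"}],
    "events/part-1.parquet": [{"day": 3, "user": "c"}, {"day": 4, "user": "a"}],
    "events/part-2.parquet": [{"day": 5, "user": "d"}],
    "tmp/unrelated.log": [{"day": 3, "user": "noise"}],  # in the lake, not in the table
}


@dataclass(frozen=True)
class DataFile:
    """One entry of table metadata: a file path plus min/max stats for `day`."""
    path: str
    min_day: int
    max_day: int


# The "table" is just metadata: which files belong to it and what they contain.
SNAPSHOT = [
    DataFile("events/part-0.parquet", 1, 2),
    DataFile("events/part-1.parquet", 3, 4),
    DataFile("events/part-2.parquet", 5, 5),
]


def scan(snapshot, store, day_from, day_to):
    """Return (rows, files_read) for rows with day_from <= day <= day_to.

    Files whose [min_day, max_day] range cannot overlap the query range
    are skipped using metadata alone, without being read.
    """
    rows, files_read = [], []
    for f in snapshot:
        if f.max_day < day_from or f.min_day > day_to:
            continue  # pruned from the scan by its stats
        files_read.append(f.path)
        rows.extend(r for r in store[f.path] if day_from <= r["day"] <= day_to)
    return rows, files_read


rows, files_read = scan(SNAPSHOT, OBJECT_STORE, 3, 4)
print(files_read)
print(rows)
```

In this toy, the "query engine" is whatever calls `scan`. Swapping engines means swapping the reader, while the files and the metadata stay where they are, which is roughly the flexibility being described above.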