by data_ders
392 days ago
The manifesto [1] is the most interesting thing. I agree that DuckDB has the largest potential to disrupt the current order with Iceberg. However, this mostly reads to me as a thought experiment:

> what if the backend service of an Iceberg catalog was just a SQL database?

The manifesto says that maintaining a data lake catalog this way is easier, which I agree with in theory. s3-files-as-information-schema presents real challenges! But what I most want to know is: what's the end-user benefit? What does someone get with this if they're already using Apache Polaris or Lakekeeper as their Iceberg REST catalog?

[1]: https://ducklake.select/manifesto/
For users, it adds the following features to a data lake:

- multi-statement & multi-table transactions
- SQL views
- delta queries
- encryption
- low latency: no S3 metadata reads, plus inlining (small inserts are stored in the catalog)

and more!
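To make the multi-table transaction point concrete, here is a minimal sketch of the general idea of a SQL database acting as the catalog. It uses Python's built-in `sqlite3`, and the `snapshot` and `data_file` tables, the function names, and the file paths are all made up for illustration. This is not DuckLake's actual schema or API, just the shape of the argument: if the catalog is a SQL database, registering files for several tables becomes one ordinary transaction.

```python
import sqlite3

# Hypothetical catalog schema, for illustration only (NOT DuckLake's real tables).
def open_catalog():
    # isolation_level=None puts sqlite3 in autocommit mode, so the
    # explicit BEGIN / COMMIT / ROLLBACK statements below are in control.
    con = sqlite3.connect(":memory:", isolation_level=None)
    con.execute("CREATE TABLE snapshot (id INTEGER PRIMARY KEY, note TEXT)")
    con.execute(
        "CREATE TABLE data_file (snapshot_id INTEGER, table_name TEXT, path TEXT)"
    )
    return con

def commit_files(con, note, files, fail=False):
    """Register new data files for several tables under one snapshot, atomically.

    `files` is a list of (table_name, path) pairs. `fail=True` simulates a
    writer crashing after staging its rows but before committing.
    """
    con.execute("BEGIN")
    try:
        cur = con.execute("INSERT INTO snapshot (note) VALUES (?)", (note,))
        snapshot_id = cur.lastrowid
        for table_name, path in files:
            con.execute(
                "INSERT INTO data_file VALUES (?, ?, ?)",
                (snapshot_id, table_name, path),
            )
        if fail:
            raise RuntimeError("simulated failure before commit")
        con.execute("COMMIT")
        return snapshot_id
    except Exception:
        # Nothing from this attempt becomes visible to readers.
        con.execute("ROLLBACK")
        raise

def visible_files(con):
    """Return the (table_name, path) pairs that readers can currently see."""
    return con.execute(
        "SELECT table_name, path FROM data_file ORDER BY table_name, path"
    ).fetchall()
```

Calling `commit_files` with files for both an `orders` and a `customers` table either makes all of them visible or none of them. A call with `fail=True` leaves `visible_files` unchanged, which is the property a catalog built on per-table metadata files has a harder time providing across tables.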