
by dm03514 507 days ago

I think Iceberg solves a lot of big data problems around handling huge amounts of data on blob storage, including partitioning, compaction, and ACID semantics.

I really like the way the catalog standard can decouple underlying storage as well.

My biggest concern is how inaccessible the implementations are: Java / Spark has the only mature implementation right now.

Even DuckDB doesn’t support writing yet.

I built a tool to stream data to Iceberg, which uses the Python Iceberg client:

https://www.linkedin.com/pulse/streaming-iceberg-using-sqlfl...
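
For anyone wondering what streaming into Iceberg roughly involves, here is a minimal sketch of the general micro-batching idea, not the code of the tool linked above. Every Iceberg commit produces a new snapshot and new data files, so writing one row at a time leaves a lot of small files for compaction to clean up; buffering rows and committing in batches keeps that in check. The `MicroBatcher` class and its `commit` callback are hypothetical names for illustration. In practice the callback might convert the batch to a PyArrow table and pass it to PyIceberg's `Table.append`, but that wiring is an assumption and is left out so the sketch runs with the standard library alone.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class MicroBatcher:
    """Buffer incoming rows and hand them to `commit` in batches.

    `commit` is a hypothetical callback. With Iceberg it would perform
    one table append (one snapshot) per batch.
    """

    def __init__(self, commit: Callable[[List[Row]], None], max_rows: int = 1000) -> None:
        if max_rows < 1:
            raise ValueError("max_rows must be at least 1")
        self._commit = commit
        self._max_rows = max_rows
        self._buffer: List[Row] = []

    def add(self, row: Row) -> None:
        """Queue one row; flush automatically once the batch is full."""
        self._buffer.append(row)
        if len(self._buffer) >= self._max_rows:
            self.flush()

    def flush(self) -> None:
        """Commit whatever is buffered; do nothing if the buffer is empty."""
        if not self._buffer:
            return
        # Swap the buffer out before committing so the batch is handed off whole.
        batch, self._buffer = self._buffer, []
        self._commit(batch)


if __name__ == "__main__":
    committed: List[List[Row]] = []  # stand-in for the table
    batcher = MicroBatcher(committed.append, max_rows=2)
    for i in range(5):
        batcher.add({"id": i})
    batcher.flush()  # commit the final partial batch
    print(f"{len(committed)} commits for 5 rows")
```

A real streaming writer would also flush on a timer so that a slow stream does not hold rows in memory indefinitely; that is omitted here to keep the sketch short.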