by jamesblonde 907 days ago
You don't need a Spark deployment. The first reference implementations for reading and writing were in Spark.

Now, with PyIceberg, there is read support in Python. Write support should be merged very soon (https://github.com/apache/iceberg-python/pull/41). Once it is, you will be able to read and write Iceberg tables in Python. I look forward to doing data transformations in Polars for data of reasonable scale (up to 100 GB or so) and writing to Iceberg tables with PyIceberg. No Spark.

1 comment

Well, what about other languages? Does every language need bindings or a re-implementation? (I.e., are Iceberg tables written and queried in-process, as opposed to via a network API?)