by BorisTheBrave 2055 days ago
Parquet is primarily intended as a file storage format. For streaming, I think the recommendation is to use Arrow, which is basically an in-memory Parquet. Its IPC streaming format puts the schema first and then streams an undefined number of rows, sent as record batches.

https://arrow.apache.org/docs/python/ipc.html