Hacker News new | ask | show | jobs
by youngbum 745 days ago
Great point.

The advantages of loading “parquet” in “client side” are that 1) you only have to load data once from server and 2) the parquet files are surprisingly well zipped.

1) If you load once from server, no more small network requests while you are scrolling a table. Moreover, you could use the same duckdb table to visualize data or show raw data.

2) Sending whole data as a parquet file is faster through network than receiving data as json in response.

1 comments

I wonder how much of the benefit is from just not using json vs reducing round trips. I guess if you had a large table you could stream (smaller than normal) row groups from parquet? Not sure how much work that would be though.

I'm not sure what the optimal response size is for an http response, but probably there are diminishing efficiency returns above more than a MB or two, and more of a latency hit for reading the whole file. So if you used row groups of a couple of MB and then streamed them in you'd probably get the best of both worlds.