by Fiahil
2126 days ago
The Rust and Python implementations are fine. But I get it, Parquet may not be perfect or optimal or whatever. It works as a simple, typed, columnar format. We had to pick a single file format to recommend for sending 100GB+ tables over FTP servers or Dropbox, scanning terabytes of useless stuff only to grab a key-value pair, and properly reading integer and UTF-8 columns. Turns out, Parquet is practical. Enough for users to start using it instead of CSV. It could be Avro, but it's just not as easy.
|
I actually think Parquet is pretty great in practice, I just have some issues with the sheer volume of abstractions necessary to implement it. I just wish it was anything other than Thrift.
I would probably choose Parquet over anything else, though.