by brian-bk
808 days ago
JSON (newline-delimited or a full file) is significantly larger than CSV. With CSV, each field name appears once, in the header row. In JSON, every single record repeats the field names. It adds up fast, and the gap is bigger than the one between CSV and Parquet.
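
To put a rough number on the repeated-field-name overhead, here is a minimal Python sketch that serializes the same records as CSV and as newline-delimited JSON and compares the byte counts. The helper name `encoded_sizes` and the sample fields (`user_id`, `country`, `amount`) are made up for illustration; the actual ratio depends heavily on how long the field names are relative to the values, and this says nothing about Parquet.

```python
import csv
import io
import json


def encoded_sizes(rows):
    """Return (csv_bytes, ndjson_bytes) for a list of dicts sharing the same keys."""
    fieldnames = list(rows[0].keys())

    # CSV: field names are written once, in the header row.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    csv_bytes = len(buf.getvalue().encode("utf-8"))

    # NDJSON: every line is a full object, so field names repeat on each line.
    # Compact separators are used so whitespace does not inflate the result.
    ndjson = "".join(
        json.dumps(row, separators=(",", ":")) + "\n" for row in rows
    )
    ndjson_bytes = len(ndjson.encode("utf-8"))

    return csv_bytes, ndjson_bytes


# Hypothetical sample data, only for the size comparison.
rows = [{"user_id": i, "country": "US", "amount": i * 1.5} for i in range(1000)]
csv_bytes, ndjson_bytes = encoded_sizes(rows)
print(csv_bytes, ndjson_bytes, round(ndjson_bytes / csv_bytes, 2))
```

With short values and long-ish field names like these, the JSON side is dominated by the repeated keys, which is the effect described above.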
JSON's ability to hold both objects and arrays is useful: for example, the first line can be an object, or an array of objects, describing the fields. That leaves less confusion between schema lines and data lines than there is with CSV.
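
One way this could look, as a sketch rather than any established standard: the first line is a JSON object listing the fields, and every later line is a JSON array of values. The layout, the `"fields"` key, and the function names `write_with_schema` and `read_with_schema` are all assumptions made for this example. Because the schema line is an object and the data lines are arrays, a reader can tell them apart by type alone.

```python
import io
import json


def write_with_schema(fields, rows, out):
    """Write a schema object on the first line, then one JSON array per record."""
    out.write(json.dumps({"fields": fields}) + "\n")
    for row in rows:
        out.write(json.dumps(row) + "\n")


def read_with_schema(inp):
    """Read the schema line, then turn each data array into a dict keyed by field name."""
    lines = iter(inp)
    header = json.loads(next(lines))
    # The schema line must be an object; data lines are arrays.
    if not isinstance(header, dict) or "fields" not in header:
        raise ValueError("first line is not a schema object")
    names = [field["name"] for field in header["fields"]]
    records = []
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        values = json.loads(line)
        if not isinstance(values, list) or len(values) != len(names):
            raise ValueError("data line does not match the schema")
        records.append(dict(zip(names, values)))
    return header["fields"], records


# Hypothetical schema and data, only to show the round trip.
fields = [
    {"name": "user_id", "type": "integer"},
    {"name": "country", "type": "string"},
]
buf = io.StringIO()
write_with_schema(fields, [[1, "US"], [2, "DE"]], buf)
buf.seek(0)
schema, records = read_with_schema(buf)
print(records)
```

A side effect of this particular layout is that field names are written only once, as in CSV, while the file stays valid line-by-line JSON.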