by mcdonje
1201 days ago
If it's tabular, self-describing formats have way too much overhead. I ran a query with a tabular result in the neighborhood of 100 columns by 215k rows, and exported it in multiple formats:
- CSV: 166 MB
- JSON: 795 MB
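To illustrate where that overhead comes from, here is a minimal sketch using only the Python standard library. It encodes the same synthetic table both ways: CSV writes the column names once in a header row, while JSON (as a list of objects) repeats every column name in every row. The function name `encoded_sizes` and the synthetic data are made up for illustration, and the sizes it prints will not match the numbers above, which came from a real query result.

```python
import csv
import io
import json


def encoded_sizes(rows, columns):
    """Return (csv_bytes, json_bytes) for the same tabular data."""
    # CSV: column names are written once, in the header row.
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(columns)
    writer.writerows(rows)
    csv_bytes = len(buf.getvalue().encode("utf-8"))

    # JSON as a list of objects: every row carries every column name again.
    records = [dict(zip(columns, row)) for row in rows]
    json_bytes = len(json.dumps(records).encode("utf-8"))
    return csv_bytes, json_bytes


# Synthetic stand-in for a wide result set: 100 columns by 1,000 rows.
columns = [f"column_{i:03d}" for i in range(100)]
rows = [[r * 100 + c for c in range(100)] for r in range(1000)]

csv_bytes, json_bytes = encoded_sizes(rows, columns)
print(f"CSV:  {csv_bytes} bytes")
print(f"JSON: {json_bytes} bytes")
print(f"JSON is {json_bytes / csv_bytes:.1f}x the size of CSV")
```

The longer the column names are relative to the values, the worse the ratio gets for JSON.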
That said, not all data is tabular. DuckDB already supports Parquet, which supports structs and is a very good format for storing data for reporting workloads. But JSON is a standard interchange format, so a lot of people are going to want to do something with JSON payloads they receive from API calls. I could definitely imagine a workload where you receive JSON from an API call, load it into DuckDB or similar to help with ETL, then store results in Parquet.