by jimsparkman 1525 days ago

Since DuckDB is an OLAP database, it compares most directly with other columnar database technologies like Redshift, Presto/Athena, etc.

Most of these systems strongly encourage or outright enforce JSONL, so that’s the de facto standard, and most tooling and pipelines are going to generate it nowadays.

You can obviously still have a row of arrays, and different systems have slightly different approaches to dealing with those. In Spark, this is referred to as “exploding”; in Presto you would cross join to unnest an array; in Redshift you can glob on the SUPER type.
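To make the "exploding" idea concrete, here is a rough plain-Python sketch of what the operation does to JSONL rows. It is not Spark's or Presto's API; the sample data, the `explode` helper, and the field names `user` and `tags` are all made up for illustration.

```python
import json

# Hypothetical JSONL input: one JSON object per line, each with an array field.
JSONL = """{"user": "a", "tags": ["x", "y"]}
{"user": "b", "tags": ["z"]}
{"user": "c", "tags": []}"""


def explode(rows, field):
    """Yield one output row per element of the array stored under `field`.

    The other columns are repeated on every output row. Rows whose array
    is missing, null, or empty produce no output at all.
    """
    for row in rows:
        for item in row.get(field) or []:
            out = dict(row)    # copy the non-array columns
            out[field] = item  # replace the array with a single element
            yield out


# Parse each non-blank line as its own JSON document, then explode "tags".
rows = [json.loads(line) for line in JSONL.splitlines() if line.strip()]
exploded = list(explode(rows, "tags"))

for row in exploded:
    print(json.dumps(row))
```

In this sketch, user "c" disappears from the output because its array is empty. As far as I know that matches how Spark's `explode` and Presto's `CROSS JOIN UNNEST` behave by default, so it is worth checking whether you need the outer variant that keeps those rows.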

I’m not sure I have a particular favorite, only that the database should support such an operation, since it is a common occurrence.