| HN Mirror

	by 01HNNWZ0MV43FF 683 days ago
	Because json5 hasn't caught on yet and everything else has obvious flaws

2 comments

0cf8612b2e1e 682 days ago

I routinely interface with 1GB+ csvs. The size explosion for json would be huge. Disk IO aside, I assume a json parser is going to be slower to parse than csv.

imtringued 682 days ago

How would JSON cause a size explosion?

Nothing prevents you using ndjson where you define a header and then have an array per line.
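
A minimal sketch of the layout described here, assuming the first line is a JSON array of column names and every later line is a JSON array of values. The helper names `write_rows` and `read_rows` and the sample columns are hypothetical, chosen only for illustration; this is not an established format.

```python
import io
import json

def write_rows(fp, header, rows):
    """Write a header array, then one JSON array per line."""
    fp.write(json.dumps(header) + "\n")
    for row in rows:
        fp.write(json.dumps(row) + "\n")

def read_rows(fp):
    """Yield one dict per data line, using the first line as column names."""
    header = json.loads(fp.readline())
    for line in fp:
        if line.strip():  # skip blank lines
            yield dict(zip(header, json.loads(line)))

# Round trip through an in-memory buffer (hypothetical sample data).
buf = io.StringIO()
write_rows(buf, ["id", "name", "score"], [[1, "ada", 9.5], [2, "bob", 7.0]])
buf.seek(0)
records = list(read_rows(buf))
```

Column names appear once, in the header line, so the data lines carry only values.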

0cf8612b2e1e 682 days ago

Nobody does this currently. You have now created another bespoke format. If I am going to need a custom parser/writer, I might as well lean on a binary format that has far stronger properties than a text based one.

ianburrell 681 days ago

JSONL is pretty common format. It makes sense for logs and anything else written incrementally.

JSON parsers are super common. They are simpler and faster than CSV parsers because the format is more regular. JSONL is simple to implement because you write by record and read by line.

The only differences from CSV are the bracket characters around each line and the quotes around every string. The benefit is clear escaping rules, including for newlines.
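
To illustrate the escaping point, here is a small sketch using Python's standard `json` module. The sample record is made up. An embedded newline is serialized as the two-character escape `\n`, so the record still occupies one physical line and a line-by-line reader can split on newlines safely.

```python
import json

# Hypothetical record with an embedded newline and embedded quotes.
record = ["note", "line one\nline two", 'he said "hi"']

line = json.dumps(record)   # one physical line, newline escaped as \n
decoded = json.loads(line)  # the original values come back unchanged
```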

0cf8612b2e1e 681 days ago

JSONL is standard. Upthread said to write the header row and then make subsequent rows arrays, and I am not aware of anything that does this currently.

My objection to JSONL was about the increase in file size owing to repeating the keys.

ianburrell 681 days ago

JSON can write arrays in addition to hashes. JSON arrays are nearly identical to CSV. The only difference is brackets around lines. There is no extra space wasted for keys.
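
A rough way to check the size argument is to serialize the same rows three ways and compare byte counts. The sketch below uses synthetic data (1000 made-up rows with three columns), so the exact numbers mean little; the ordering is the point, and real data with long or heavily quoted strings may behave differently.

```python
import csv
import io
import json

# Synthetic sample data, for illustration only.
header = ["id", "name", "score"]
rows = [[i, f"user{i}", i * 0.5] for i in range(1000)]

def csv_bytes(header, rows):
    """Size of the data written as CSV with a single header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return len(buf.getvalue().encode("utf-8"))

def jsonl_bytes(records):
    """Size of the records written as compact JSON, one per line."""
    text = "".join(json.dumps(r, separators=(",", ":")) + "\n" for r in records)
    return len(text.encode("utf-8"))

size_csv = csv_bytes(header, rows)
size_arrays = jsonl_bytes([header] + rows)                        # header line, then arrays
size_objects = jsonl_bytes([dict(zip(header, r)) for r in rows])  # keys repeated per line
print(size_csv, size_arrays, size_objects)
```

With this data the array-per-line form is only slightly larger than CSV (brackets and string quotes), while the object-per-line form pays for the repeated keys on every row.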

im3w1l 682 days ago

Why do you use a text-based format at all at that size?

0cf8612b2e1e 682 days ago

You get what you get. Presumably when it started, they were a more modest size.

ryan_j_naughton 683 days ago

Eh, I'm skeptical of this statement.

CSV is explicitly about tabular data. JSON (including JSON5) is much more flexible. Flexibility can be great but also can be annoying. If you want tabular data, then a system that enables nesting isn't great.

yawnxyz 682 days ago

I love jsonlines but csvs are way more compact, since you don't have to repeat the column name for every line of data

sam_perez 682 days ago

I think the fact that a human can mostly just read csvs is an important part of their adoption, too.

ianburrell 681 days ago

You would write JSON arrays without names for tabular data. I don’t know if there is a standard way to do the header, but array of names would work. Or JSON Schema record.

jessekv 682 days ago

Rather than highlighting flexibility as the differentiator, I would say: CSV is for dense data, JSON is for sparse data. They are flexible in different ways. For example, CSV is very flexible when renaming a column title.
