Hacker News
by NoboruWataya 668 days ago
The only real benefit of CSV (other than that it is widely supported) is that it is easy for humans to read and write. The approach in this article solves the quoting problem, but also removes that benefit. If you have the power to move away from CSV, surely JSON would be better if you need to keep the human readable/writable feature. And if you don't need it, there are other more featureful binary formats out there, like Parquet.
2 comments

You can’t put comments in JSON, while that’s fairly easy in CSV. This makes JSON unusable most of the time for human-editable data.
There is no such thing as a comment in CSV.
In many dialects there is. Usually you start the line with #.
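A minimal sketch of that convention, assuming comment lines begin with # and that no quoted field spans multiple lines (a multi-line field containing a line that starts with # would be mangled). The helper name read_csv_skipping_comments is made up for illustration; RFC 4180 itself defines no comment syntax, so this is a dialect choice and not part of any standard.

```python
import csv
import io


def read_csv_skipping_comments(text, comment_prefix="#"):
    """Parse CSV text, dropping every line that starts with comment_prefix.

    Hypothetical helper: the '#' convention is a dialect assumption,
    not something the csv module or RFC 4180 defines.
    """
    data_lines = (
        line
        for line in io.StringIO(text)
        if not line.lstrip().startswith(comment_prefix)
    )
    # csv.reader accepts any iterable that yields one line per item.
    return list(csv.reader(data_lines))


sample = (
    "# exported by hand\n"
    "name,city\n"
    "Ada,London\n"
    "# temporary entry\n"
    "Bob,Paris\n"
)
rows = read_csv_skipping_comments(sample)
print(rows)
```

The filtering happens before the parser ever sees the text, which is why it works with a parser that knows nothing about comments.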

Comments will happen. If your file format doesn’t allow comments, then people will make up an extension to allow it. This is true even for binary formats.

JSON is often good, but it can also carry a lot of overhead, depending on how sparse the data is. For sparse data, it might be better. But for dense data, it has the overhead of repeating attribute names over and over again. Of course you could also have arrays in JSON, not writing attribute names over and over, but then you are basically back to a CSV inside the JSON file ...
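To make the overhead concrete, here is a rough sketch that serializes the same small dense table three ways and compares the character counts. The column names and rows are invented sample data, and the "columns"/"rows" layout for the array variant is just one possible convention.

```python
import csv
import io
import json

# Invented sample data: a small, dense table.
header = ["id", "first_name", "last_name"]
rows = [
    [1, "Ada", "Lovelace"],
    [2, "Alan", "Turing"],
    [3, "Grace", "Hopper"],
]

# One JSON object per row: attribute names are repeated for every record.
as_objects = json.dumps([dict(zip(header, row)) for row in rows])

# Header once, then plain arrays: effectively a CSV inside the JSON file.
as_arrays = json.dumps({"columns": header, "rows": rows})

# Plain CSV for comparison.
buffer = io.StringIO()
csv.writer(buffer).writerows([header] + rows)
as_csv = buffer.getvalue()

sizes = {
    "objects": len(as_objects),
    "arrays": len(as_arrays),
    "csv": len(as_csv),
}
print(sizes)
```

For dense data like this, the object form is the largest and the gap grows with the number of rows, since the repeated keys are paid per record while the array form pays for them once.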
>Of course you could also have arrays in JSON, not writing attribute names over and over, but then you are basically back to a CSV inside the JSON file ...

You're confusing the concept of tabular data with the file format. If the most natural way to represent tabular data is through a 2D array, then so be it. The vast majority of people aren't complaining about the fact that they have to hardcode the meaning of "the last name is written into the fifth column", they are cursing that the fifth column has suddenly shifted into the sixth column, because the first name contained a comma.
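A small illustration of the shifting-column problem, using an invented record whose first name contains a comma. It compares a naive comma join and split against Python's csv module (which quotes the field) and against a JSON array.

```python
import csv
import io
import json

# Invented record: the last name is meant to live in the fifth column.
record = ["7", "Dr.", "Mary, Ann", "Boston", "Jones"]

# Naive writer and reader: join and split on commas, no quoting.
naive_line = ",".join(record)
naive_fields = naive_line.split(",")
print(len(naive_fields), naive_fields[4], naive_fields[5])

# csv module: the writer quotes the field that contains a comma,
# and the reader restores the original five columns.
buffer = io.StringIO()
csv.writer(buffer).writerow(record)
parsed = next(csv.reader(io.StringIO(buffer.getvalue())))

# JSON array: string delimiters are part of the format, so the
# comma inside the name cannot be mistaken for a separator.
from_json = json.loads(json.dumps(record))

print(parsed == record, from_json == record)
```

The tabular shape is the same in all three cases; only the naive variant lets a comma in the data change the number of columns.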

Where am I confusing the two?