by alexandreyc 795 days ago
Basically it's the same limitations as CSV.

At least you could use something less likely to appear in data as the record separator (like 0x1E).

Otherwise it's an interesting idea!


0x1E is the record separator, included in ASCII precisely for this purpose. Too bad it’s not popular; instead we’re stuck with the inferior TSV/CSV.
I can't easily type that out, and once the format can't be read or edited in a simple text editor, I'm starting to lean towards a nice binary format like protobuf.
Strings can contain 0x1E, so it has exactly the same issues as a tab character but with all the downsides of it not being an easy, “simple” character.
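To make that concrete, here is a minimal Python sketch of a separator-based format. It assumes 0x1F (the ASCII unit separator) between fields and 0x1E between records. The `encode` and `decode` helpers are hypothetical names for illustration, not part of any library or of the format under discussion.

```python
# ASCII separators: 0x1F = unit (field) separator, 0x1E = record separator.
US = "\x1f"
RS = "\x1e"

def encode(rows):
    """Join fields with US and records with RS; no quoting or escaping."""
    return RS.join(US.join(fields) for fields in rows)

def decode(text):
    """Split on RS, then on US. Assumes the data never contains either byte."""
    return [record.split(US) for record in text.split(RS)]

# Tabs, commas and newlines inside fields survive the round trip.
rows = [["id", "note"], ["1", "tabs\tand, commas\nare fine"]]
assert decode(encode(rows)) == rows

# A field that itself contains 0x1E silently breaks the round trip.
bad = [["1", "contains \x1e inside"]]
assert decode(encode(bad)) != bad
```

Without an escaping rule, the collision just moves from tabs or commas to a rarer byte. It does not go away.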
As far as I know, thanks to quoting it is possible to put basically any data you want in a CSV.
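A quick sketch of that claim using Python's standard `csv` module with its default dialect, which is only one of the variants in circulation. The sample data is made up for illustration.

```python
import csv
import io

# A field containing a comma, double quotes and a newline.
rows = [["id", "note"], ["1", 'He said "hi", then left\nwithout a word']]

# newline="" keeps the writer's line endings untouched.
buf = io.StringIO(newline="")
csv.writer(buf).writerows(rows)
text = buf.getvalue()

# The writer wraps the field in quotes and doubles the embedded quotes.
assert '""hi""' in text

# Reading it back with the same dialect recovers the original data.
parsed = list(csv.reader(io.StringIO(text, newline="")))
assert parsed == rows
```

The round trip only works because reader and writer agree on the quoting rules.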
The problem is there is no uniform standard for quoting and escaping in CSV, and different software uses different variants.
There is a standard, and it is very simple and easy to use.

Different software uses different variants because we're not allowed to have nice things and devs are too lazy to use something slightly more complicated than .split(',')

Though if you're going to ban some common characters anyway, as TSV does, you might as well use CSV and ban commas, newlines, and quotation marks.
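For illustration, here is what the `.split(',')` shortcut does to a quoted field, compared with Python's standard `csv` module. The sample line is made up.

```python
import csv

line = '1,"Doe, Jane",42'

# Naive splitting ignores quoting and cuts the quoted field in two.
naive = line.split(",")
assert naive == ['1', '"Doe', ' Jane"', '42']

# A real CSV parser treats the quoted comma as data.
parsed = next(csv.reader([line]))
assert parsed == ['1', 'Doe, Jane', '42']
```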