Hacker News
by brunokim 667 days ago
I read a comment here some years ago from someone who had discovered the ASCII field delimiters and was excited to use them. They then found out that those characters are only used in three places: the ASCII spec, their own code, and the data from the first client where they tried to use this solution.

Any file format needs a well-specified escape strategy, because every file format is binary and may contain binary data. CSV is kinda bad not only because, in practice, there's no consensus on escaping, but also because we don't communicate which escaping was chosen!
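
To make the ambiguity concrete, here's a rough sketch using Python's standard `csv` module. The sample line is made up for illustration; the point is only that the same bytes come out differently depending on which escaping convention the reader assumes.

```python
import csv
import io

# Made-up sample line. Raw text: 1,"say \"hi\", ok",2
line = '1,"say \\"hi\\", ok",2\n'

def parse(text, **dialect):
    """Parse a single CSV line with the given csv.reader options."""
    return next(csv.reader(io.StringIO(text, newline=""), **dialect))

# Reader assumes doubled quotes ("") for escaping; backslash is an ordinary character.
as_doubled = parse(line)

# Reader assumes backslash escaping inside quoted fields.
as_backslash = parse(line, escapechar="\\", doublequote=False)

print(as_doubled)
print(as_backslash)
```

Only the second reader recovers the three fields the writer presumably meant, and nothing in the file itself says which reader is the right one.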

I think a standard meta header like the following would do wonders to improve interchangeability, without having to communicate the serialization format out-of-band.

```
#csv delim=";" encoding=utf8 quote=double locale="pt-BR" header=true
```

(RFC 4180 does specify that charset and header may be given as optional parameters of the text/csv MIME type)
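
A minimal sketch of how a reader might consume such a header, in Python. The `#csv` line format is just the proposal above, not an existing standard, and the function names `parse_meta_header` and `read_with_header` are hypothetical. This assumes the text is already decoded, so `encoding` is not applied, and `locale` is parsed but ignored (it would presumably govern things like decimal commas).

```python
import csv
import io
import shlex

def parse_meta_header(line):
    """Parse a '#csv key=value ...' line into a dict (hypothetical format)."""
    # shlex.split handles the quoted values, e.g. delim=";" becomes delim=;
    tokens = shlex.split(line)
    if not tokens or tokens[0] != "#csv":
        raise ValueError("not a #csv meta header")
    return dict(tok.split("=", 1) for tok in tokens[1:])

def read_with_header(text):
    """Return (column_names_or_None, rows) for text that starts with a #csv line."""
    first, _, body = text.partition("\n")
    meta = parse_meta_header(first)
    reader = csv.reader(
        io.StringIO(body, newline=""),
        delimiter=meta.get("delim", ","),
        # Assumption: quote=double means quotes are escaped by doubling them.
        doublequote=meta.get("quote", "double") == "double",
    )
    rows = list(reader)
    if meta.get("header") == "true" and rows:
        return rows[0], rows[1:]
    return None, rows

sample = (
    '#csv delim=";" encoding=utf8 quote=double locale="pt-BR" header=true\n'
    "nome;valor\n"
    '"a;b";"1,5"\n'
)
columns, rows = read_with_header(sample)
print(columns, rows)
```

The reader configures itself from the first line, so the delimiter and quoting never have to be guessed or agreed on separately.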

1 comment

To me it's wild that the problem was solved back in the early 1960s (and really, well before that) but everyone just ignored it because of reasons and now we're stuck with a sub-optimal solution.