by zaphirplane 2425 days ago
The text in a column can be quoted and inside the quotes there can be escaped quotes or commas.

It takes a lot of sed/awk skill to merge two columns or delete a column. If at all possible
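To illustrate the point about quoted columns, here is a minimal Python sketch (not from the original post) that deletes a column with the standard `csv` module instead of splitting on commas. The sample data and the `drop_column` helper are hypothetical, and the input is assumed to use double-quotes escaped by doubling.

```python
import csv
import io

# Hypothetical sample: the "comment" column holds a comma and escaped quotes.
raw = 'id,comment,score\n1,"said ""hi"", then left",5\n'

# Naive splitting on commas cuts the quoted field apart.
naive = raw.splitlines()[1].split(",")
print(len(naive))  # 4 pieces for a row that has 3 columns


def drop_column(text, index):
    """Return CSV text with the column at position `index` removed."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        # The reader has already handled quoting, so this is a plain list edit.
        writer.writerow(row[:index] + row[index + 1:])
    return out.getvalue()


result = drop_column(raw, 1)
print(result)
```

The same list edit done with a regular expression would have to understand the quoting rules itself, which is where the sed/awk effort goes.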

1 comment

Yes, the "CSV road to hell":

- comma separated, nothing escaped (crash when 1 column contains a comma)

- comma separated, quotes around all elements, quotes not escaped

- comma separated, double-quotes around all elements, double-quotes not escaped

- comma separated, quotes around some elements, quotes not escaped

- comma separated, double-quotes around some elements, double-quotes not escaped

- comma separated, quotes around all elements, quotes escaped (using '')

- comma separated, double-quotes around all elements, double-quotes escaped (using "")

- comma separated, quotes around some elements, quotes escaped (using '')

- comma separated, double-quotes around some elements, double-quotes escaped (using "")

- comma separated, quotes around all elements, quotes escaped (using \')

- comma separated, double-quotes around all elements, double-quotes escaped (using \")

- comma separated, quotes around some elements, quotes escaped (using \')

- comma separated, double-quotes around some elements, double-quotes escaped (using \")

And the Microsoft format (where comma == semi-colon):

- semi-colon separated, nothing escaped (crash when 1 column contains a semi-colon)

- semi-colon separated, quotes around all elements, quotes not escaped

- semi-colon separated, double-quotes around all elements, double-quotes not escaped

- semi-colon separated, quotes around some elements, quotes not escaped

- semi-colon separated, double-quotes around some elements, double-quotes not escaped

- semi-colon separated, quotes around all elements, quotes escaped (using '')

- semi-colon separated, double-quotes around all elements, double-quotes escaped (using "")

- semi-colon separated, quotes around some elements, quotes escaped (using '')

- semi-colon separated, double-quotes around some elements, double-quotes escaped (using "")

- semi-colon separated, quotes around all elements, quotes escaped (using \')

- semi-colon separated, double-quotes around all elements, double-quotes escaped (using \")

- semi-colon separated, quotes around some elements, quotes escaped (using \')

- semi-colon separated, double-quotes around some elements, double-quotes escaped (using \")
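A few of the variants listed above can be sketched with Python's standard `csv` module, where each one maps to a different combination of the `delimiter`, `quotechar`, `doublequote` and `escapechar` parameters. This is only an illustration: the `parse_line` helper and the sample lines are made up, and the reader has to be told which variant it is looking at.

```python
import csv
import io


def parse_line(line, **fmt):
    """Parse one line of text using the given csv formatting parameters."""
    return next(csv.reader(io.StringIO(line), **fmt))


# Comma separated, double-quotes escaped by doubling ("").
doubled = parse_line('a,"say ""hi"", ok",c')

# Comma separated, double-quotes escaped with a backslash (\").
backslash = parse_line(
    'a,"say \\"hi\\", ok",c', doublequote=False, escapechar="\\"
)

# Semi-colon separated, single quotes escaped by doubling ('').
semicolon = parse_line("a;'it''s; fine';c", delimiter=";", quotechar="'")

# Same backslash-escaped line, read with the default settings:
# the fields come out different from what the writer intended.
misread = parse_line('a,"say \\"hi\\", ok",c')

print(doubled)
print(backslash)
print(semicolon)
print(misread)
```

The last call is the painful case: picking the wrong variant does not necessarily fail loudly, it can just produce different columns.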

And I'm not even talking about the weird custom CSV variants that, for example, support multi-line fields, or any other "I want to fit a circle in a square" mentality.

I don't know why people don't simply create TSV (tab-separated) files. No character-escaping mess. MUCH easier to parse.
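As a rough sketch of what "easier to parse" can look like in Python, assuming (and this is an assumption, not something the format enforces) that no field ever contains a tab or a newline. The `parse_tsv` helper and the sample text are hypothetical.

```python
def parse_tsv(text):
    """Split TSV text into rows of fields.

    Assumes no field contains a tab or newline, so no quoting
    or escaping rules are needed.
    """
    return [line.split("\t") for line in text.splitlines()]


# Commas and quotes inside a field are just ordinary characters here.
rows = parse_tsv('id\tcomment\n1\tsaid "hi", then left\n')
print(rows)
```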

EDIT: Formatting