Hacker News
by ghusto 805 days ago
Perhaps naive, but we escape with \ everywhere else, so why not here?

If you're typing in CSV manually, escape with \

If you're exporting to CSV, the program already knows which part is data and which part is the next cell, so again the program can escape with \

Because those of us that have to read your data would highly prefer you just emit standard¹ CSV, and not invent "CSV+my oddball customizations". If you're going to muck about outside the standard format, then you might as well just use DSV from the OP.

Most good implementations are flexible enough that they might be configurable to your proposed pseudo CSV. (Or even DSV. Or USV. Etc.) But I'd rather just not need to, and the sanest default for any CSV library is the standard format.
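To make the "configurable" point concrete, here is a rough sketch using Python's standard `csv` module. The sample data and the exact shape of the backslash dialect are my own assumptions about what the proposal would look like, not anything specified upthread.

```python
import csv
import io

# Hypothetical "backslash CSV": no quoting, a backslash escapes the comma.
# (Assumed shape of the proposal; the thread does not define it precisely.)
nonstandard = "id,note\r\n1,red\\, green\r\n"

# A default reader knows nothing about the backslash, so it splits on the
# escaped comma and leaves the backslash in the data.
default_rows = list(csv.reader(io.StringIO(nonstandard, newline="")))

# The data only parses as intended once the dialect is spelled out,
# which is the out-of-band configuration the reader has to be told about.
configured_rows = list(
    csv.reader(
        io.StringIO(nonstandard, newline=""),
        escapechar="\\",
        doublequote=False,
        quoting=csv.QUOTE_NONE,
    )
)

print(default_rows[1])     # three fields, stray backslash kept
print(configured_rows[1])  # two fields, comma restored
```

So it works, but only if whoever receives the file learns about the three extra options somehow.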

(Or even better … just emit newline-terminated JSON. Richer format, less craziness than CSV, parsers still abound.)
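A minimal sketch of what that looks like in Python, assuming one JSON object per line (the record contents are made up for illustration):

```python
import json

# Made-up records containing the characters that cause trouble in CSV.
records = [
    {"id": 1, "note": 'He said "hi", then left'},
    {"id": 2, "note": "line one\nline two"},
]

# One JSON document per line. json.dumps escapes embedded newlines as \n,
# so a record never spans more than one physical line.
ndjson = "".join(json.dumps(record) + "\n" for record in records)

# Reading it back is a line split plus a stock JSON parser.
parsed = [json.loads(line) for line in ndjson.splitlines()]
```

Commas, quotes, and line breaks inside values need no special handling beyond what the JSON encoder already does.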

¹(RFC 4180. "," is the field separator, CRLF is the row separator. You can escape a comma or a CRLF by surrounding the entire field in double quotes, and a double quote itself is escaped by quoting the field and doubling the internal double quote.)
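As an illustration of those rules, here is a small sketch with Python's standard `csv` module, whose default dialect uses a comma delimiter, CRLF row terminator, and doubled quotes. The sample rows are invented.

```python
import csv
import io

# Invented rows: one field has a comma and double quotes, one has a CRLF.
rows = [
    ["id", "note"],
    ["1", 'He said "hi", then left'],
    ["2", "line one\r\nline two"],
]

# Default writer settings: fields are quoted only when they need it,
# and an internal double quote is written as two double quotes.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
encoded = buf.getvalue()

# A default reader recovers the original values with no extra options.
decoded = list(csv.reader(io.StringIO(encoded, newline="")))
```

The second row is written as `1,"He said ""hi"", then left"`, and the round trip needs no configuration on either side.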

"Oddball customisation" is a bit rich, no? A backslash is the way things are escaped in most places. Why re-invent the wheel?

And why would you "highly prefer you just emit standard CSV"? What is the benefit to insisting on adherence to the original standard, especially if the modification fixes something that is broken?

> A backslash is the way things are escaped in most places.

Sure … but not by CSV. Backslash is hardly the only way, nor is the doubled-quote escape mechanism particularly obscure, given its presence in popular formats like CSV, SQL, or YAML.

> why would you "highly prefer you just emit standard CSV"?

Why would you prefer a baroque format?

So that it goes through a standard parser without needing additional configuration. It's just CSV, and more so, it's CSV without hoops or surprises that need to be transmitted out of band somehow.

> What is the benefit to insisting adherence to the original standard, especially if the modification fixes something that is broken?

That's not the case here, though: using backslash doesn't fix something that's broken.