by CharlesW 1284 days ago
As Wikipedia puts it, "CSV is widely used to refer to a large family of formats that differ in many ways". If there's a canonical standard, it appears to be RFC4180: https://www.rfc-editor.org/rfc/rfc4180

It appears so, but it's not. I have not found a single program so far that conforms only to this RFC and nothing else.

From the RFC itself:

   Status of This Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.
> I have not found a single program so far that conforms only to this RFC and nothing else.

Wouldn't that be impossible, given that parsers have to accept all kinds of bizarro CSV flavors? Maybe more importantly, do you know of a single program or single CSV library that doesn't support reading or writing CSV as defined by the RFC?

Yeah, any of them. Just add a newline inside a "cell" and then go jump off a bridge.
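
For reference, the newline-in-a-cell case is easy to check against one concrete implementation. The sketch below uses Python's standard `csv` module, picked here only as an example (the thread names no particular program), and the sample rows are made up. It shows that a quoted field containing a line break survives a round trip, while naive line splitting miscounts the records:

```python
import csv
import io

# Made-up sample data: one field has an embedded newline, another has quotes and a comma.
rows = [
    ["id", "note"],
    ["1", "first line\nsecond line"],
    ["2", 'says "hi", twice'],
]

# Write with the default "excel" dialect: CRLF record separators,
# fields quoted only when needed, embedded quotes doubled.
buf = io.StringIO(newline="")
csv.writer(buf).writerows(rows)
encoded = buf.getvalue()

# Read it back; newline="" leaves line endings untouched for the csv module.
decoded = list(csv.reader(io.StringIO(encoded, newline="")))

# Naive splitting treats the embedded newline as a record boundary.
naive_lines = encoded.splitlines()

print(len(decoded), "records via csv.reader")
print(len(naive_lines), "lines via str.splitlines")
```

Tools that split on line breaks before handling quotes are the ones that break on this input, which is one way the failure described above can arise.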
An "Internet Standard" is just a designation that has been given to an RFC that has been blessed in a certain way. See https://www.rfc-editor.org/ for more details, but the set of designations is:

    * Uncategorised
    * Historic
    * Experimental
    * Informational
    * Best Current Practice
    * Proposed Standard
    * Draft Standard
    * Internet Standard
Once an RFC reaches "Internet Standard" it is given a special designation, e.g. STD-63 is the standards designation for RFC-3629: UTF-8 < https://www.rfc-editor.org/info/std63 >. See https://www.rfc-editor.org/standards

Being an "Internet Standard" is kinda special, but not especially so. For example, IMAP4rev1, specified in RFC-3501 in March 2003 and updated many times since, was revised as IMAP4rev2 in RFC-9051 in August 2021 and is still a "Proposed Standard" without an STD designation, nearly 20 years and dozens of interoperable implementations later.

"Rough consensus and running code" is how things get done.

RFC-4180 is plenty good enough a "standard" for people to decide to interoperate over. They just have to decide to do so.

(Note also that HTML5 is not an "Internet Standard" according to the IETF et al. The last version to get an RFC was HTML 2 in RFC-1866, designated "Historic". And interoperability was an issue for a while with later versions of HTML during the "Best viewed in Internet Explorer/Netscape Navigator" wars. To get interoperability like we eventually did, you don't need an "Internet Standard"; you just need implementers who want to interoperate, and are willing to favour it over lock-in, and even over strict backwards-compatibility.)

(Also, the "and nothing else" clause in your comment confuses me. Why not support other formats/variants as well? "Be liberal in what you accept" is something you probably want to avoid if you're designing a new format/protocol that no-one else is using yet, but if you're working with a decades-old format that was traditionally poorly specified, with millions of documents out in the wild, it's probably the best way to allow existing users to move forward.)
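
A minimal sketch of that "liberal in what you accept, strict in what you emit" approach, again assuming Python's standard `csv` module. `normalize_csv` is a hypothetical helper invented for this illustration, not something from the RFC or the thread. It accepts a caller-supplied delimiter and either LF or CRLF line endings, and writes comma-separated, CRLF-terminated output with doubled quotes, in the style RFC 4180 describes:

```python
import csv
import io


def normalize_csv(text: str, delimiter: str = ",") -> str:
    """Hypothetical helper: read a CSV variant, re-emit it RFC 4180 style."""
    # Liberal on input: any single-character delimiter, LF or CRLF endings.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)

    # Strict on output: commas, CRLF, double-quote quoting with "" escapes.
    out = io.StringIO(newline="")
    writer = csv.writer(
        out,
        delimiter=",",
        quotechar='"',
        doublequote=True,
        lineterminator="\r\n",
        quoting=csv.QUOTE_MINIMAL,
    )
    writer.writerows(reader)
    return out.getvalue()


# Semicolon-separated, LF-terminated input with a comma inside a field.
print(repr(normalize_csv("a;b\n1;x,y\n", delimiter=";")))
```

The delimiter is passed in explicitly rather than guessed, which keeps the sketch deterministic; real-world readers often add heuristics on top of this.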