by teddyh 668 days ago
The article talks about reading and parsing CSV data of unknown variants, but then jumps to a solution that uses a different format altogether. But you can only switch to a different format if you are producing data, not if you are reading it!

And if you are in control of producing data, just produce strict RFC 4180-compliant CSV data and everybody will be able to read it just fine. There is no need to make your readers' lives difficult by using yet another non-standard data format.

See also: <https://news.ycombinator.com/item?id=39679753>
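For what it's worth, here is a minimal sketch of what "produce strict RFC 4180 CSV" could look like with Python's standard `csv` module. The function name `write_rfc4180` and the sample rows are made up for illustration, and this only covers the points I'm sure of (CRLF line endings, quoting fields that contain commas, quotes or line breaks, and doubling embedded quotes); it is not a full conformance check.

```python
import csv
import io


def write_rfc4180(rows):
    """Serialize rows of strings to CSV text with CRLF record endings.

    Hypothetical helper for illustration only.
    """
    buf = io.StringIO(newline="")  # keep "\r\n" exactly as written
    writer = csv.writer(
        buf,
        delimiter=",",
        quotechar='"',
        doublequote=True,            # an embedded " is written as ""
        quoting=csv.QUOTE_MINIMAL,   # quote only fields that need it
        lineterminator="\r\n",       # CRLF between records
    )
    writer.writerows(rows)
    return buf.getvalue()


# Sample data (invented): one field with a comma and quotes, one with a line break.
rows = [
    ["id", "note"],
    ["1", 'say "hi", then leave'],
    ["2", "line one\nline two"],
]
text = write_rfc4180(rows)

# Reading it back with the same module should give the original rows.
round_trip = list(csv.reader(io.StringIO(text, newline="")))
```

The explicit `lineterminator` and `newline=""` are there so that nothing rewrites the line endings on the way out.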

4 comments

I just had a look at RFC 4180. This is the grammar they suggest:

> file = [header CRLF] record *(CRLF record) [CRLF]

I find it kind of wild that you have to have at least one record. Suppose I have a program that lists the events that occurred on a given day. How do I represent the fact that the program ran successfully but that there weren't any events on that day?
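In practice, one workaround is a header-only file. Whether that strictly matches the grammar above is debatable, so treat the following as a sketch of how Python's `csv` module happens to behave, not as a statement about the RFC. The column names and the `read_events` helper are invented for the example.

```python
import csv
import io


def read_events(text):
    """Parse CSV text whose first line is a header; return one dict per record."""
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Hypothetical report output: the program ran, but nothing happened that day.
quiet_day = "time,event\r\n"

# Hypothetical report output with a single event.
busy_day = "time,event\r\n09:00,login\r\n"

quiet_events = read_events(quiet_day)  # header consumed, no records left
busy_events = read_events(busy_day)
```

With this reader, a header-only file comes back as an empty list, which is distinguishable from a missing file (the program never ran).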

Event: Logging Started. Additionally, a Logging Ended (for the day) event allows you to check for intermediate crashes and startup errors, kind of like a pulse / health check.


Not necessarily how I'd do it intuitively, but doesn't seem that crazy

Easy, count running the report as an event.
That isn't a problem of the file format. That's a problem of your process.

> yet another non-standard data format.

Tbf, ASCII delimiter characters have been around since the 1960s. They're not exactly reinventing the wheel
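For anyone who hasn't seen them, a rough sketch of the idea: ASCII reserves 0x1F (unit separator) and 0x1E (record separator) for exactly this job. The `encode` and `decode` helpers below are my own illustration, not any standard format, and they simply reject fields that contain the separators rather than defining an escape scheme.

```python
US = "\x1f"  # ASCII unit separator, used here between fields
RS = "\x1e"  # ASCII record separator, used here between records


def encode(rows):
    """Join rows of strings using the ASCII separators (illustrative only)."""
    for row in rows:
        for field in row:
            if US in field or RS in field:
                # No escaping is defined in this sketch, so refuse such data.
                raise ValueError("field contains a separator character")
    return RS.join(US.join(row) for row in rows)


def decode(text):
    """Split text produced by encode() back into rows.

    Note: an empty string is treated as zero records, so a single record
    holding one empty field cannot be represented in this sketch.
    """
    if not text:
        return []
    return [record.split(US) for record in text.split(RS)]


# Commas, quotes and newlines need no quoting because they are not delimiters.
rows = [["a,b", 'q"uote'], ["multi\nline", ""]]
restored = decode(encode(rows))
```

The catch is the same one the parent is pointing at: this only helps if you control both the writer and the reader.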

It’s the classic ‘self driving is easy when we modify the world to be what we want’ type solution.
"Politics is easy, the problem is we aren't working together!"

SQLite isn't a standard per se, but outputting an SQLite db file, if you're writing the export code, is easy enough, and enough of a standard that I dare say you'd be doing fine to output .db files marked as sqlite.
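A minimal sketch of that kind of export, using Python's built-in `sqlite3` module. The `events` table, its columns, and the function names are assumptions picked for the example, not anything from the article.

```python
import sqlite3


def export_events(path, events):
    """Write (time, event) pairs to an SQLite file at path (hypothetical schema)."""
    con = sqlite3.connect(path)
    try:
        with con:  # commits on success, rolls back on error
            con.execute(
                "CREATE TABLE IF NOT EXISTS events "
                "(time TEXT NOT NULL, event TEXT NOT NULL)"
            )
            con.executemany(
                "INSERT INTO events (time, event) VALUES (?, ?)", events
            )
    finally:
        con.close()


def load_events(path):
    """Read every (time, event) pair back, in insertion order."""
    con = sqlite3.connect(path)
    try:
        return con.execute(
            "SELECT time, event FROM events ORDER BY rowid"
        ).fetchall()
    finally:
        con.close()
```

A day with no events is just an existing table with zero rows, so the "ran successfully but nothing happened" case needs no special handling here.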