by psfried 1511 days ago
> But it is binary, so can’t be viewed or edited with standard tools, which is a pain.

I've heard this sentiment expressed multiple times before, and a minor quibble I have with it is that the fact that it's binary has nothing to do with whether or not it's a pain. It's a pain because the tools aren't ubiquitous, so you can't count on them always being installed everywhere. But I'd argue that sqlite _is_ ubiquitous at this point and, as others have mentioned, it's a _great_ format for storing tabular data.

JSON is also a fine choice, if you want it to be human readable, and I'm not sure why this is claiming it's "highly sub-optimal" (which I read as dev-speak for 'absolute trash'). JSON is extremely flexible, compresses very well, has great support for viewing in lots of editors, and even has a decent schema specification. Oh, and line-delimited JSON is used in lots of places, and allows readers to begin at arbitrary points in the file.
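To illustrate the "begin at arbitrary points" property, here is a rough Python sketch. It assumes one JSON object per newline-terminated line, and `read_records_from` is a hypothetical helper name, not something from a library. The idea is to seek to any byte offset, throw away the partial line you landed in, and parse from the next line boundary.

```python
import io
import json


def read_records_from(stream, offset):
    """Yield JSON records from the first complete line at or after `offset`.

    `stream` is a seekable binary stream of newline-delimited JSON.
    """
    if offset > 0:
        # Step back one byte: if that byte is a newline, `offset` is already
        # a line boundary and readline() consumes only the newline. Otherwise
        # readline() discards the rest of the line we landed in.
        stream.seek(offset - 1)
        stream.readline()
    else:
        stream.seek(0)
    for raw in stream:
        raw = raw.strip()
        if raw:  # skip blank lines
            yield json.loads(raw)


data = b'{"id": 1}\n{"id": 2}\n{"id": 3}\n'
ids_from_middle = [r["id"] for r in read_records_from(io.BytesIO(data), 5)]
```

Jumping into the middle of the first record skips it and resumes at the second. A single large JSON array would not allow this, because you would need to parse from the start to know where you are.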

1 comments

JSON is good for structured data, but I prefer TSV for simple human-readable tabular data. In situations where it's the right choice, a TSV file consists of data and whitespace and nothing else. You can view and edit it with any imaginable tool, and there is no overhead in the form of delimiters and encodings distracting you from the data.
I really like LTSV. (That stands for labeled tab-separated values.)

LTSV is basically equivalent to a JSON object per line. You have columns consisting of a label, a colon, then a value. The columns are then separated by tabs. The value can be quoted. If you need a tab in the value, it goes inside the quotes.
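A minimal Python sketch of the basic layout, to show the "JSON object per line" equivalence. The function names and the sample line are made up for illustration, and this deliberately handles only the simple case: unquoted values with no embedded tabs. Quoted values are not handled here.

```python
def parse_ltsv_line(line):
    """Parse one LTSV line into a dict (simple case: no quoting, no tabs in values)."""
    record = {}
    for field in line.rstrip("\r\n").split("\t"):
        if not field:
            continue  # tolerate empty fields, e.g. from a trailing tab
        # Split on the first colon only, so values may contain colons.
        label, sep, value = field.partition(":")
        if not sep:
            raise ValueError(f"field has no colon: {field!r}")
        record[label] = value
    return record


def format_ltsv_line(record):
    """Serialize a dict back into one LTSV line (same simple-case assumptions)."""
    return "\t".join(f"{label}:{value}" for label, value in record.items())


sample = "host:127.0.0.1\tstatus:200\ttime:2013-02-06T12:00:00+09:00"
parsed = parse_ltsv_line(sample)
```

All values come back as strings, so anything numeric (like `status` here) has to be converted by the caller. That is one place where it differs from JSON.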

As http://ltsv.org/ suggests, I use it for logging, too, so that a log line is easily parseable and a log file is basically a table. Note that there are parsers for many languages, and several tools support it, including fluentd.