by dvt 3530 days ago
Not sure what your point is (or the point of that presentation, for that matter).

Of course there are binary serialization formats that are faster than XML or JSON, and of course they're less error-prone. This has been known for about 40 years now.

JSON/XML are used precisely because people want a human-readable interchange format. For high-performance uses, consider Google's Protocol Buffers or Boost.Serialization. You're acting like you just hackathoned the biggest thing since sliced bread, but that's exactly how payloads have been sent (until high bandwidth made us all lazy) since the inception of the Internet.

3 comments

From experience, I think the whole "human-readable" idea is a bit overrated. All it means is that the format is entirely/mostly in ASCII. But if you have a hex editor, like all good programmers should, binary formats are not any less human-readable (or writable) nor more difficult to work with; and for some, even a text editor with CP437 or some other distinctive SBCS will suffice after a while. It's somewhat like learning a language; and if you are the one developing the format, it's a language that you create.

Then again, I grew up working with computers at a time when writing entire apps in Asm/machine language was pretty normal as well as other things which would be considered horribly impossible by many developers of the newer generation, and can mentally assemble/disassemble x86 to/from ASCII, so my perspective may be skewed... just a tiny little bit. ;-)
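
To make the hex-editor point concrete, here is a minimal sketch of what a small binary record looks like in a dump. The record layout (a little-endian uint16 id, a float32 value, and a 4-byte ASCII tag) and the `hexdump` helper are made up for illustration; they are not from any format discussed in this thread.

```python
import struct

# Hypothetical record layout (an assumption, not from the thread):
# little-endian uint16 id, float32 value, 4-byte ASCII tag, no padding.
RECORD = struct.Struct("<Hf4s")


def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes the way a typical hex editor does: offset, hex, ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as '.', as most hex editors do.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  |{text_part}|")
    return "\n".join(lines)


payload = RECORD.pack(513, 1.0, b"TEMP")
print(hexdump(payload))
```

Once you know the layout, the first two bytes (`01 02`) read as the id 0x0201, the next four as the float, and the tag is plain ASCII in the right-hand column.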

But the phrase is "human-readable" and not "programmer-readable".
A minor gripe with your comment, but as a programmer conceivably must be human, both conditions are satisfied when a programmer is capable of reading it.
I thought my point was clear: don't get involved in parsing JSON. I agree with the OP that parsing JSON is a minefield. I went further by implying that it is also unnecessary when ease of reading isn't needed, and called out some alternatives. I think it's amusing that you mentioned Protocol Buffers. Were you aware that FlatBuffers, which I mentioned, were built in response to performance inefficiencies in the very Protocol Buffers you mentioned?

We didn't just "hackathon the biggest thing since sliced bread", btw; we took a real-world example of exchanging a human-readable format for a human-with-tools-readable one and saw a significant win. High bandwidth also isn't as prevalent as you think, and yes, you're generally paying both performance-wise and occasionally monetarily for the laziness you mentioned. But then, if you've known this for 40 years and don't know how to measure it, there's not much I can do for you in a comment.

I believe that is the point. Choose the right serialization strategy to fit the job. Most projects default to JSON regardless of how suitable it is. At some scale that default should be revisited, since the trade-off between human readability and performance can change.
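
As a rough illustration of that trade-off, the sketch below encodes the same record as compact JSON and as a fixed binary layout using only Python's standard library. The record, its field names, and the layout are assumptions made up for this example; a real project would more likely use a schema-based format such as the Protocol Buffers or FlatBuffers mentioned above, and the numbers will vary with the data.

```python
import json
import struct

# Hypothetical sensor reading; field names and values are assumptions.
reading = {"id": 513, "temperature": 21.5, "ok": True}

# Human-readable encoding: compact JSON, UTF-8 encoded.
json_bytes = json.dumps(reading, separators=(",", ":")).encode("utf-8")

# Binary encoding with a fixed layout both sides must agree on:
# little-endian uint16, float32, bool -> 7 bytes, no field names on the wire.
LAYOUT = struct.Struct("<Hf?")
binary_bytes = LAYOUT.pack(reading["id"], reading["temperature"], reading["ok"])


def decode(data: bytes) -> dict:
    """Rebuild the record; the schema lives in code, not in the payload."""
    ident, temperature, ok = LAYOUT.unpack(data)
    return {"id": ident, "temperature": temperature, "ok": ok}


print(len(json_bytes), len(binary_bytes))
print(decode(binary_bytes))
```

The binary payload is smaller and needs no text parsing, but it is meaningless without the layout, which is exactly the cost you take on when you give up the self-describing format.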