|
That's a good point about the critical path - I was thinking it would be a bit bigger and slower since you'd have to decode it, but I hadn't realized how much of an impact that would probably have. The lack of bit-perfect round trips is also absolutely horrifying, and I would never have thought that would be a thing.
>Compiler writers and tool authors are perfectly comfortable working with binary file formats. There's nothing more inherently future-compatible about JSON than a forward-compatible binary format like flatbuffers. Having to escape and then unescape naughty bytes is a huge downside for text-based formats that are hardly ever read by humans.
Well, the problem isn't binary or non-binary, the problem is that formats like ELF are, apparently, really annoying to deal with, have weird limitations, and are difficult to extend. The reason I thought of JSON in particular is that it doesn't really need to be extended to encode anything (unless you count escaping or base64-encoding the binary data you want to put inside a JSON document as "extending" it). You can encode all of the fields in ELF (or any other format) inside JSON, while the converse doesn't make sense because ELF has fixed fields with fixed meanings. That's the problem with bespoke binary formats like ELF - they're not designed to encode arbitrary schemas of data, they're designed for very specific tasks, and when they get used outside their intended environment, we get problems like the ones described in this thread. Nobody has ever had these problems with a JSON document - maybe with something that consumed one, but the file format itself simply does not have the kind of limitations that ELF does. It has different limitations, but they're not of a fundamental, semantic nature like they are in a more rigid format.
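To make the base64 option concrete, here is a rough Python sketch of what stuffing a binary section into a JSON document might look like. The `name`/`data` layout and the function names are made up for illustration and aren't taken from any real format. Note that only the payload bytes are guaranteed to survive; the surrounding JSON text can still be re-serialized differently (whitespace, key order).

```python
import base64
import json


def encode_section(name: str, payload: bytes) -> str:
    """Wrap raw bytes in a JSON document (hypothetical layout, for illustration)."""
    return json.dumps({
        "name": name,
        # JSON strings cannot hold arbitrary bytes, so base64 them first.
        "data": base64.b64encode(payload).decode("ascii"),
    })


def decode_section(doc: str) -> bytes:
    """Recover the original bytes from the JSON document."""
    return base64.b64decode(json.loads(doc)["data"])


if __name__ == "__main__":
    payload = bytes(range(256)) * 4  # 1024 bytes, including every byte value
    doc = encode_section(".text", payload)
    assert decode_section(doc) == payload
    # base64 alone costs about 4/3 of the raw size, before any JSON wrapping.
    print(len(payload), len(doc))
```

The decode step has to run over the entire section before anything can be used, which is the critical-path cost mentioned above.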
You're right that it would be a problem to have to escape/unescape every section every time you wanted to run something, because that's very slow, but I think that's about the only problem these bespoke binary formats solve. If that's the case, I wonder why something like Matroska wouldn't work for binaries? My understanding is that it's essentially binary XML and allows for a more or less arbitrary dictionary structure. It doesn't have nice tooling like JSON or XML do, but there are no weird restrictions on things like field length that I'm aware of. I guess it doesn't exactly have any "momentum", though, but maybe the NixOS people will get sick enough of ELF to consider such a drastic solution :P |
This isn't specific to binary formats; it's specific to insufficiently extensible formats. Note that I specifically mentioned flatbuffers, which provide extensibility while keeping parsing latency low.
Also, ELF was designed to be extensible by adding new sections. You could totally add functionality by adding a new section holding JSON data.
Don't confuse JSON with extensibility. I've seen plenty of headaches with poorly designed JSON schemas where forward compatibility wasn't thought through. There are also tons of elegantly extensible binary formats. ELF is just old, much older than JSON. A new binary format would probably be more elegantly extensible.
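As a rough illustration of what "extensible binary format" can mean, here is a minimal tag-length-value sketch in Python. This is not how flatbuffers works internally (it uses offset tables rather than sequential records), and the tag numbers and function names are invented for the example. The point is only that a reader can skip records it doesn't recognize without escaping or unescaping any bytes.

```python
import struct

# Each record: 2-byte tag, 4-byte little-endian length, then the raw payload.
HEADER = struct.Struct("<HI")


def pack_record(tag: int, value: bytes) -> bytes:
    """Serialize one record; the payload is stored as-is, with no escaping."""
    return HEADER.pack(tag, len(value)) + value


def read_records(buf: bytes, known_tags: set) -> dict:
    """Return payloads for known tags and skip everything else."""
    out = {}
    offset = 0
    while offset < len(buf):
        tag, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        if tag in known_tags:
            out[tag] = buf[offset:offset + length]
        # Unknown tags are skipped by length, so old readers tolerate new fields.
        offset += length
    return out


if __name__ == "__main__":
    blob = (
        pack_record(1, b"abc")
        + pack_record(99, b"\x00\xff")  # a field added by a newer writer
        + pack_record(2, b"xyz")
    )
    print(read_records(blob, {1, 2}))
```

An older reader that only knows tags 1 and 2 still parses the file, which is the forward-compatibility property that matters here regardless of whether the encoding is text or binary.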