by dbyte 501 days ago
The grandfather of protobuf. Lost in the tales of time.
2 comments

Grandfather of Protobuf is ASN.1
Very much so. Pretty much all of these protocols are simplifications of ASN.1, and in some cases (like protobuf) a handful of things got lost because the wire formats left them out when the designers didn’t need them. The lack of a schema indicator is the single biggest flaw in protobuf.
Why is the lack of a schema indicator the biggest flaw of protobuf?
It makes it impossible to write a general-purpose dissector that takes captured messages or bytes and figures out how to parse them.

All they needed was a varint at the head of any marshaled form to at least provide some scoping clue.
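
A rough sketch of what such a leading varint could look like, in Python. To be clear, this is not something protobuf does: `SCHEMA_REGISTRY`, `wrap`, `unwrap`, and the schema ID 300 are all hypothetical names and values invented for this illustration. Only the base-128 varint encoding itself matches the protobuf wire format.

```python
def encode_varint(value):
    """Encode a non-negative int as a base-128 varint (protobuf style)."""
    if value < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)


def decode_varint(buf, pos=0):
    """Decode a varint at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift >= 64:
            raise ValueError("varint too long")


# Hypothetical registry mapping schema IDs to message type names.
SCHEMA_REGISTRY = {300: "example.Person"}


def wrap(schema_id, payload):
    """Prefix an already-serialized message with its schema ID."""
    return encode_varint(schema_id) + payload


def unwrap(framed):
    """Split a framed message into (schema_id, type_name, payload)."""
    schema_id, pos = decode_varint(framed)
    type_name = SCHEMA_REGISTRY.get(schema_id, "<unknown>")
    return schema_id, type_name, framed[pos:]


framed = wrap(300, b"\x08\x01")
schema_id, type_name, payload = unwrap(framed)
```

With a prefix like this, a dissector that has never seen the message could at least look up which schema to apply, at a cost of one to a few bytes per message.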

If you parse a serialized protobuf byte array without having a .proto file, you have no way to distinguish a byte string field from a nested message field. Thus you have no way to know how deep your parser should go.
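
To make that ambiguity concrete, here is a minimal sketch of a schema-less wire-format reader in Python. The helper names (`read_varint`, `parse_fields`) are made up for this example and are not from any protobuf library, and it only handles the four common wire types (varint, 64-bit, length-delimited, 32-bit).

```python
def read_varint(buf, pos):
    """Decode a base-128 varint at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift >= 64:
            raise ValueError("varint too long")


def parse_fields(buf):
    """Parse buf as a flat list of (field_number, wire_type, value)."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_no, wire_type = key >> 3, key & 0x07
        if field_no == 0:
            raise ValueError("field number 0 is invalid")
        if wire_type == 0:                      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type in (1, 5):               # fixed 64-bit / 32-bit
            size = 8 if wire_type == 1 else 4
            if pos + size > len(buf):
                raise ValueError("truncated fixed-width field")
            value = buf[pos:pos + size]
            pos += size
        elif wire_type == 2:                    # length-delimited
            length, pos = read_varint(buf, pos)
            if pos + length > len(buf):
                raise ValueError("truncated length-delimited field")
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError("unsupported wire type %d" % wire_type)
        fields.append((field_no, wire_type, value))
    return fields


# Field 1, wire type 2 (length-delimited), two payload bytes: 0x08 0x01.
message = bytes([0x0A, 0x02, 0x08, 0x01])
outer = parse_fields(message)
payload = outer[0][2]

# The payload is a perfectly good 2-byte string...
as_bytes = payload
# ...and it also parses cleanly as a nested message {field 1: varint 1}.
as_message = parse_fields(payload)
```

Both readings are valid on the wire, so without the .proto a dissector can only guess, for example by attempting a nested parse and falling back to raw bytes when it fails.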
Semi-related, one of the `imessage-exporter` contributors provided a great write-up on reverse engineering the handwritten and digital touch message protobufs [0]. The reconstructed proto files are [1] [2].

[0]: https://github.com/trymoose/handwriting2svg/blob/0eb56cf4582...

[1]: https://github.com/ReagentX/imessage-exporter/blob/beeb853b2...

[2]: https://github.com/ReagentX/imessage-exporter/blob/beeb853b2...

One usually has two grandfathers, so it still works out.
The telco industry, including GSM and its successors, uses ASN.1 widely.
iMessage uses a very strange amalgamation of typedstream (message content), keyed archives (app messages, sticker data), and protobufs (Digital Touch, handwriting) for different features. I wonder what motivated all of those design decisions.
This stuff is such a PIA to parse. I assume it's just different teams doing different features over the years, and being alternately repulsed/seduced by each format. Probably features are implemented as libraries so there isn't any central oversight: they aren't trying to make iMessage's internal formats follow a consistent plan, just letting all the libs coexist...
Maybe they should be repulsed, considering all of the journalists that are getting persecuted and/or murdered because they are getting pwned through iMessage serialization bugs :)
As someone who used to work on that team, it’s so interesting hearing the external public's thoughts about the team.
I would love to hear your thoughts as an insider.
"Those who don't understand ASN.1 are doomed to reinvent it, poorly."

That said, it could be much worse --- JSON, or XML.