by pherl
3612 days ago
The main reason deterministic serialization isn't canonical is unknown fields. Strings and embedded messages share the same wire type, so when the parser encounters an unknown field of that wire type, it has no way to tell whether it should recursively canonicalize the field's contents. The cross-language inconsistency mainly comes from the performance of string field comparison: Java and Objective-C use UTF-16 strings, which order differently from UTF-8 strings because of surrogate pairs. Feel free to open an issue on the GitHub site asking for canonical serialization and describing your use case. We may strengthen the guarantees of deterministic serialization (e.g. cross-language consistency) or add a separate API for canonical serialization.
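
To make the surrogate-pair point concrete, here is a minimal Python sketch (not protobuf code; the helper names `utf8_key` and `utf16_key` are made up for illustration). It sorts the same two strings once by UTF-8 bytes and once by UTF-16 code units, roughly what a byte-oriented runtime and a UTF-16-based runtime would each do natively, and the two orders disagree.

```python
def utf8_key(s: str) -> bytes:
    # Compare by UTF-8 bytes; this matches Unicode code point order.
    return s.encode("utf-8")


def utf16_key(s: str) -> tuple:
    # Compare by 16-bit UTF-16 code units, as a UTF-16 string type would.
    data = s.encode("utf-16-be")
    return tuple(
        int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)
    )


bmp = "\uff5e"         # U+FF5E: one UTF-16 unit, 0xFF5E
astral = "\U0001f600"  # U+1F600: surrogate pair 0xD83D 0xDE00

keys = [astral, bmp]
by_utf8 = sorted(keys, key=utf8_key)    # bmp first: 0xEF... < 0xF0...
by_utf16 = sorted(keys, key=utf16_key)  # astral first: 0xD83D < 0xFF5E

print(by_utf8 == by_utf16)
```

Any code point above U+FFFF is encoded in UTF-16 with units in the range 0xD800 to 0xDFFF, so it sorts before BMP characters in the range U+E000 to U+FFFF, while in UTF-8 it sorts after them. Getting identical ordering across languages would mean one side re-encoding or comparing by code point, which is the comparison cost mentioned above.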
(You can find the niche use case in a response to your sibling comment, BTW.)