Hacker News new | ask | show | jobs
by lifthrasiir 595 days ago
Almost no serialization format is canonical by default---AFAIK bencode was the sole example that mandates the canonicalization. Instead, a canonical subset of the format is usually defined, which would of course exclude comments, unless comments themselves are considered semantic like XML. Yes, even XML has a canonical subset [1]!

[1] https://www.w3.org/TR/xml-c14n/

1 comments

There is the concept of “well formed xᴇɴᴏɴ” as outputted by the ᴅᴏᴍ and serializer which is deterministic as will suffice for canonicalization.