Hacker News
by askvictor 496 days ago
Another problem with signing JSON: two different JSON texts can mean the same thing and behave identically in your code, e.g. {"a": "foo", "b": "bar"} vs {"b": "bar", "a": "foo"}. Whitespace is another source of differences. Are there any standards for normalising JSON, so that two equivalent but differently written JSON files produce the same signature?
3 comments

This is why, in one of my projects, I first stringified the JSON using the built-in JSON.stringify(your_json) function, then signed that string and sent the string, its signature, and the public key to the server. The server verifies the signature against the string and, if it passes, uses JSON.parse(your_string) to get the original JSON back.
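A minimal sketch of that flow, assuming Node.js and an Ed25519 key pair. The helper names signPayload and verifyAndParse are made up for illustration, and the sketch assumes the server already knows which public key to trust:

```javascript
const crypto = require('node:crypto');

// Hypothetical key pair for illustration; a real client would load its own keys.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');

// Client: serialize once, sign those exact bytes, and send the string itself.
function signPayload(obj, key) {
  const body = JSON.stringify(obj);
  const signature = crypto.sign(null, Buffer.from(body, 'utf8'), key).toString('base64');
  return { body, signature };
}

// Server: verify the received string first, and parse it only if the check passes.
function verifyAndParse(body, signature, key) {
  const ok = crypto.verify(
    null,
    Buffer.from(body, 'utf8'),
    key,
    Buffer.from(signature, 'base64')
  );
  if (!ok) throw new Error('bad signature');
  return JSON.parse(body);
}

const msg = signPayload({ x: 5, y: 6 }, privateKey);
const parsed = verifyAndParse(msg.body, msg.signature, publicKey);
console.log(parsed);
```

The point is that the signature covers the transmitted string, so the server never has to re-serialize anything before checking it.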
The problem is that the following two lines produce different output, even though their content means the same thing:

    console.log(JSON.stringify({ x: 5, y: 6 }));
    console.log(JSON.stringify({ y: 6, x: 5 }));
I think whether member order matters is left up to each implementation:

https://datatracker.ietf.org/doc/html/rfc8259

Says:

> JSON parsing libraries have been observed to differ as to whether or not they make the ordering of object members visible to calling software. Implementations whose behavior does not depend on member ordering will be interoperable in the sense that they will not be affected by these differences.

So a different signature makes sense. But it should not be an issue as long as both sides calculate and validate the signature over the string and not over the parsed JSON.

Usually, this is not a problem for signing.
Depends on your use case. We have this problem currently where I work.
That's canonicalization, and the article does mention it (but unfortunately does not offer much insight other than that it's hard).
The difficulty stems from having to rewrite the canonicalizing encoder/decoder in every language you want to consume the data with, as opposed to simply piggybacking off the default implementations and the language's standard crypto libs.

For example, most JSON parsers default to interpreting JSON numbers as floats or ints, but a canonical format would have to force every parser to treat them as exact decimal values, and then decide how to encode them (is one hundred "100" or "1e2"?), etc.
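A quick illustration of the number problem using only the built-in JSON object. The variable names are arbitrary:

```javascript
// Three spellings of one hundred parse to the same JavaScript number...
const spellings = ['100', '1e2', '100.0'];
const values = spellings.map((s) => JSON.parse(s));

// ...and re-serializing collapses them to a single spelling,
// so the bytes that were originally signed cannot be recovered.
const reencoded = values.map((v) => JSON.stringify(v));
console.log(reencoded); // [ '100', '100', '100' ]

// Integers beyond 2^53 do not survive a round trip through a double.
const big = JSON.parse('9007199254740993');
console.log(JSON.stringify(big)); // 9007199254740992
```

A parser in another language may keep the last value exact, which is one way two correct implementations can end up disagreeing about the canonical bytes.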

Canonicalization is a pain in the ass. Write yourself a boatload of unit tests.
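To make the idea concrete, here is a deliberately simplified sketch of a canonical serializer that sorts object keys and strips whitespace. It assumes the input came from JSON.parse (no undefined, functions, or cycles), it does nothing about the number-encoding question, and it is not an implementation of RFC 8785 (the JSON Canonicalization Scheme); the function name canonicalize is just for illustration:

```javascript
// Recursively serialize with object keys in sorted order and no whitespace.
// Arrays keep their order, since array order is significant in JSON.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return '[' + value.map(canonicalize).join(',') + ']';
  }
  if (value !== null && typeof value === 'object') {
    const members = Object.keys(value)
      .sort() // default sort compares UTF-16 code units
      .map((k) => JSON.stringify(k) + ':' + canonicalize(value[k]));
    return '{' + members.join(',') + '}';
  }
  return JSON.stringify(value); // strings, numbers, booleans, null
}

console.log(canonicalize({ x: 5, y: 6 }) === canonicalize({ y: 6, x: 5 })); // true
console.log(canonicalize({ b: [2, { d: 1, c: null }], a: 'foo' }));
// {"a":"foo","b":[2,{"c":null,"d":1}]}
```

Even this toy version needs the same function, with identical sorting and escaping behaviour, on every platform that signs or verifies, which is where the unit tests come in.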
Honestly, it should not really matter. The regex bait-and-switch solution seems like the most practical one: there is some trickery in checking that the magic key does not already appear in the string, but that seems far easier.
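A rough sketch of how I read the bait-and-switch idea, assuming Node.js and an HMAC with a shared secret. The field name __sig, the all-zero placeholder, and the sign/verify helpers are all hypothetical choices for illustration, not taken from the article:

```javascript
const crypto = require('node:crypto');

const SIG_KEY = '__sig';              // hypothetical magic key
const PLACEHOLDER = '0'.repeat(64);   // same length as a hex SHA-256 HMAC
const FIELD_RE = /"__sig":"([0-9a-f]{64})"/g;

const hmac = (secret, text) =>
  crypto.createHmac('sha256', secret).update(text, 'utf8').digest('hex');

function sign(obj, secret) {
  const template = JSON.stringify({ ...obj, [SIG_KEY]: PLACEHOLDER });
  // The trickery: the magic field must occur exactly once (ours), not in the payload.
  if ([...template.matchAll(FIELD_RE)].length !== 1) {
    throw new Error('magic key already appears in payload');
  }
  // Bait and switch: MAC the string with the placeholder, then swap the MAC in.
  return template.replace(PLACEHOLDER, hmac(secret, template));
}

function verify(signed, secret) {
  const matches = [...signed.matchAll(FIELD_RE)];
  if (matches.length !== 1) return null;
  const tag = matches[0][1];
  // Put the placeholder back to rebuild exactly the string that was MACed.
  const template = signed.replace(matches[0][0], `"${SIG_KEY}":"${PLACEHOLDER}"`);
  const expected = hmac(secret, template);
  const ok = crypto.timingSafeEqual(Buffer.from(tag, 'hex'), Buffer.from(expected, 'hex'));
  if (!ok) return null;
  const { [SIG_KEY]: _ignored, ...payload } = JSON.parse(signed);
  return payload;
}

const signed = sign({ a: 'foo', b: 'bar' }, 'shared-secret');
console.log(verify(signed, 'shared-secret')); // { a: 'foo', b: 'bar' }
```

Verification works on the received string, so key order and whitespace never need to be normalized; the result is still a single valid JSON document.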