	by askvictor 496 days ago
	Another problem with signing JSON: you can have two different JSON objects that mean the same thing and will do exactly the same thing in your code, e.g. {"a": "foo", "b": "bar"} vs {"b": "bar", "a": "foo"}. Whitespace causes the same problem. Are there any standards for normalising JSON, so that two equivalent but differently written JSON files will have the same signature?

3 comments

busymom0 496 days ago

This is why, in one of my projects, I first stringified the JSON using the built-in JSON.stringify(your_json) function, then signed that string and sent the string, its signature, and the public key to the server. The server verifies the signature against the string and, if it passes, uses JSON.parse(your_string) to get the original JSON.
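
A minimal sketch of that flow in Node.js, assuming Ed25519 keys and the built-in crypto module. The key pair is generated on the spot purely for the demo, and `verifyAndParse` is a hypothetical helper name, not something from the project described above:

```javascript
const crypto = require("crypto");

// Hypothetical demo key pair; a real client would load an existing key.
const { publicKey, privateKey } = crypto.generateKeyPairSync("ed25519");

// Client: serialize once, then sign the exact bytes of that string.
const payload = JSON.stringify({ a: "foo", b: "bar" });
const signature = crypto.sign(null, Buffer.from(payload, "utf8"), privateKey);

// Server: verify the received string as-is, and parse it only afterwards.
function verifyAndParse(receivedString, receivedSignature, key) {
  const ok = crypto.verify(
    null,
    Buffer.from(receivedString, "utf8"),
    key,
    receivedSignature
  );
  if (!ok) throw new Error("signature check failed");
  return JSON.parse(receivedString);
}

const message = verifyAndParse(payload, signature, publicKey);
console.log(message.a); // foo
```

The point is that the signature covers the serialized string, so the server must never re-serialize before verifying. Even one extra space in the string makes the check fail.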

askvictor 494 days ago

The problem is the following two lines produce different outputs, despite having content that means the same thing:

    console.log(JSON.stringify({ x: 5, y: 6 }));
    console.log(JSON.stringify({ y: 6, x: 5 }));

busymom0 494 days ago

I think the relevance of order is allowed to be up to each software's implementation:

https://datatracker.ietf.org/doc/html/rfc8259

Says:

> JSON parsing libraries have been observed to differ as to whether or not they make the ordering of object members visible to calling software. Implementations whose behavior does not depend on member ordering will be interoperable in the sense that they will not be affected by these differences.

So, a different signature makes sense. But it should not be an issue as long as both sides calculate/validate the signature on the string and not on the parsed JSON.

heinrich5991 494 days ago

Usually, this is not a problem for signing.

askvictor 488 days ago

Depends on your use case. We have this problem currently where I work.

lxgr 496 days ago

That's canonicalization, and the article does mention it (but unfortunately does not offer much insight other than that it's hard).
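
For a feel of what canonicalization involves, here is a rough sketch that sorts object keys recursively and emits no insignificant whitespace. `canonicalize` is a hypothetical helper, it assumes the input came out of JSON.parse (so only JSON types), and it is not a complete implementation of any standard such as RFC 8785 (JSON Canonicalization Scheme):

```javascript
// Serialize a parsed JSON value with sorted keys and no extra whitespace.
// Assumption: `value` contains only JSON types (no undefined, functions, etc.).
function canonicalize(value) {
  if (Array.isArray(value)) {
    // Array order is significant in JSON, so it is kept as-is.
    return "[" + value.map(canonicalize).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    // Default sort() compares strings by UTF-16 code units.
    const keys = Object.keys(value).sort();
    const members = keys.map(
      (k) => JSON.stringify(k) + ":" + canonicalize(value[k])
    );
    return "{" + members.join(",") + "}";
  }
  // Strings, numbers, booleans, and null: defer to JSON.stringify.
  return JSON.stringify(value);
}

console.log(canonicalize({ x: 5, y: 6 })); // {"x":5,"y":6}
console.log(canonicalize({ y: 6, x: 5 })); // {"x":5,"y":6}
```

With this, the two objects from the question serialize to the same string, so they would get the same signature. Numbers and string escaping are left to JSON.stringify here, which is the part that gets hard to reproduce in other languages.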

hansonkd 494 days ago

The difficulty stems from having to rewrite the canonicalizing encoder/decoder in every language you want to consume the data with, as opposed to simply piggybacking on default implementations and the language's standard crypto libs.

For example, most JSON parsers default to interpreting numbers from JSON as floats or ints, but in a canonical format you would have to force all parsers to interpret them as exact decimal values, then determine how to encode them (is one hundred "100" or "1e2"?), etc.
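
A quick illustration of the number problem using JavaScript's built-in parser, which reads every JSON number as a double. `roundTrip` is just a hypothetical helper for the demo:

```javascript
// Parse JSON text and write it straight back out.
const roundTrip = (text) => JSON.stringify(JSON.parse(text));

// Two spellings of one hundred parse to the same value...
console.log(JSON.parse("100") === JSON.parse("1e2")); // true

// ...and the serializer picks its own spelling on the way out.
console.log(roundTrip("1e2")); // 100
console.log(roundTrip("1.0")); // 1

// Integers beyond 2^53 do not survive a trip through a double.
console.log(roundTrip("9007199254740993")); // 9007199254740992
```

So a signature computed over the original text will not match one computed over the re-serialized text, and another language's parser may well make different choices again.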

hinkley 494 days ago

Canonicalization is a pain in the ass. Write yourself a boatload of unit tests.

afiori 494 days ago

Honestly it should not really matter. The regex bait-and-switch solution seems like the most practical one; there is some trickery in checking that the magic key does not already appear in the string, but that seems far easier.
