Hacker News
by yonatank 2752 days ago
Just a heads up to anyone using jq - I've previously spent a couple of hours debugging a problem because jq uses float64 to store integers (which might lead to rounding-errors/overflows). For example:

  echo 1152921504606846976 | jq
  1152921504606847000

This is an artifact of JavaScript, which even as of ES6 uses IEEE 754 double-precision floats for all numeric values. jq likely uses the same implementation internally for compatibility reasons and to avoid surprises of a different kind.

See https://www.ecma-international.org/ecma-262/6.0/#sec-ecmascr...
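
To see the limit concretely, here is a minimal Python sketch (Python floats are IEEE 754 doubles on mainstream platforms). The helper name `survives_float64` is made up for illustration. One detail worth noting: the value in the parent comment is exactly 2**60, so a double can hold it exactly, and 1152921504606847000 parses back to that same double. Real loss starts just above 2**53.

```python
# Python floats are IEEE 754 doubles on mainstream platforms, so they
# behave like the numbers in a float64-based JSON tool.

def survives_float64(n: int) -> bool:
    """Return True if n is unchanged after a round trip through a double."""
    return int(float(n)) == n

print(survives_float64(2**53))        # True: still exactly representable
print(survives_float64(2**53 + 1))    # False: rounds to 2**53
print(survives_float64(2**60))        # True: powers of two are exact

# The decimal string jq printed maps back to the same double as the input.
print(float("1152921504606847000") == 2**60)  # True
```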

BigInt in top browsers now, not under a flag. Just sayin'!

https://brendaneich.com/wp-content/uploads/2017/12/dotJS-201... et seq.

I think an 'error out if overflow/truncation'-mode available as a command line flag could be useful if you just don't have any JS involved in the JSON pipeline.
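
A rough sketch of what such a strict mode could look like, written against Python's stdlib `json` module rather than jq (jq itself has no such flag as far as this thread shows). The names `strict_int` and `loads_strict` are hypothetical; the range check uses the interoperable integer range mentioned in RFC 8259, which is an assumption about what "safe" should mean here.

```python
import json

# Integer range that RFC 8259 (section 6) describes as interoperable.
MAX_SAFE = 2**53 - 1
MIN_SAFE = -(2**53) + 1

def strict_int(text: str) -> int:
    """parse_int hook: reject integers a float64-based tool could mangle."""
    n = int(text)
    if not MIN_SAFE <= n <= MAX_SAFE:
        raise ValueError(f"integer {text} is outside the float64-safe range")
    return n

def loads_strict(doc: str):
    """Hypothetical strict loader: fail loudly instead of rounding silently."""
    return json.loads(doc, parse_int=strict_int)

print(loads_strict('{"id": 12345}'))
try:
    loads_strict('{"id": 1152921504606846976}')
except ValueError as err:
    print("rejected:", err)
```

This only covers integers; a fuller version would also need a `parse_float` hook for literals with fractions or exponents.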
Twitter had to switch their tweet id representation in the API to handle this - it was numeric and switched to strings.
Note: the links currently redirect to https://imgur.com/32R3qLv (image of a testicle and derogatory comment on HN)
Copy-pasting the link bypasses this. It's using the HN referrer (I think?) to redirect to imgur.
Yikes, that's nasty.
It is. But it's a problem of JSON itself, not just jq.
TIL: JSON has no specified number implementation: http://www.ecma-international.org/publications/files/ECMA-ST...

>JSON is agnostic about the semantics of numbers ... JSON instead offers only the representation of numbers that humans use: a sequence of digits.

So... anything is valid, per the spec.

JSON != JavaScript

> echo 1152921504606846976 | python -c 'import sys, json; print(json.load(sys.stdin))'

1152921504606846976
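
The same point as a small sketch: one JSON document, two ways of decoding it, two different results. Passing `parse_int=float` is only a stand-in for a float64-only parser, not how jq or a browser is actually implemented.

```python
import json

doc = "9007199254740993"  # 2**53 + 1, one past the float64-exact range

as_bigint = json.loads(doc)                   # stdlib default: arbitrary-precision int
as_double = json.loads(doc, parse_int=float)  # stand-in for a float64-only parser

print(as_bigint)       # 9007199254740993
print(int(as_double))  # 9007199254740992
```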

Python's json package != JSON

JSON: https://tools.ietf.org/html/rfc8259#page-8

The link says that it's up to the implementation, which means it's valid for Python's JSON implementation to support larger numbers.

It's less "interoperable" but not strictly invalid, by my read.

What the GP means is that JSON doesn't require an implementation to decode JSON integers as arbitrary-precision integers, to be "conformant JSON."

Therefore, you can't assume that, if you pass some JSON through an arbitrary pipeline of JSON-manipulating tools written in various languages, your integer values will be passed through losslessly.

Therefore, you just shouldn't use JSON integer values when you know that the values can potentially be large. This is why e.g. Ethereum's JSON-RPC API uses hex-escaped strings (e.g. "0x0") for representing its "quantity" type.
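
A minimal sketch of that hex-string approach in Python. The helper names `encode_quantity` and `decode_quantity` are made up, and the rules here (non-negative only, `0x` prefix) are a simplified assumption based on the "0x0" example above, not a complete implementation of the Ethereum encoding.

```python
import json

def encode_quantity(n: int) -> str:
    """Encode a non-negative integer as a 0x-prefixed hex string."""
    if n < 0:
        raise ValueError("quantities must be non-negative")
    return hex(n)  # hex(0) == "0x0"

def decode_quantity(s: str) -> int:
    """Decode a 0x-prefixed hex string back to an int."""
    if not s.startswith("0x"):
        raise ValueError(f"missing 0x prefix: {s!r}")
    return int(s, 16)

big = 2**64 + 1
wire = json.dumps({"value": encode_quantity(big)})
print(wire)
print(decode_quantity(json.loads(wire)["value"]) == big)  # True
```

Because the value travels as a JSON string, a float64-based tool in the middle of the pipeline has no number to round.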

It looks like JSON doesn't specifically define how numbers should be stored. It just recommends expecting precision up to the double-precision limits.

Still interesting to know it's not just a jq quirk.

Just because Python has bigints and its stdlib json module supports bigints in JSON doesn't mean that is an interoperable thing to do.
The problem is made worse on the receiving end (the browser). I've run into this issue when serialization libraries in Java send a 64-bit long value as a sequence of digits: anything over ~53 bits gets silently truncated, you find out about it, and then you switch to quoted strings.
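
The "switch to quoted strings" fix could look roughly like this on the sending side. It is sketched in Python rather than Java to keep the thread's examples in one language; `stringify_big_ints` is a hypothetical helper, and the 2**53 - 1 cutoff is an assumption about what the receiving parser can hold exactly.

```python
import json

MAX_SAFE = 2**53 - 1  # largest magnitude a float64-based parser keeps exactly

def stringify_big_ints(value):
    """Recursively replace out-of-range ints with their decimal string form."""
    if isinstance(value, bool):  # bool is a subclass of int; leave it alone
        return value
    if isinstance(value, int) and abs(value) > MAX_SAFE:
        return str(value)
    if isinstance(value, dict):
        return {k: stringify_big_ints(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [stringify_big_ints(v) for v in value]
    return value

payload = {"id": 2**63 - 1, "count": 3}
print(json.dumps(stringify_big_ints(payload)))
# {"id": "9223372036854775807", "count": 3}
```

The receiver then has to know which fields are stringified numbers, which is the trade-off of this approach.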
That's Python not adhering to the JSON standard as defined in the RFC.