by msm_ 811 days ago
That's true for every floating point number in every programming language you have ever used, though.

    $ python3
    Python 3.10.13 (main, Aug 24 2023, 12:59:26) [GCC 12.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 100000.000000000017
    100000.00000000001

This is why Decimal exists:

  Python 3.8.10 (default, Nov 22 2023, 10:22:35) 
  [GCC 9.4.0] on linux
  Type "help", "copyright", "credits" or "license" for more information.
  >>> from decimal import Decimal
  >>> Decimal('100000.000000000017')
  Decimal('100000.000000000017')
For example:

  >>> import json
  >>> json.loads('{"a": 100000.000000000017}')
  {'a': 100000.00000000001}
  >>> json.loads('{"a": 100000.000000000017}', parse_float=Decimal)
  {'a': Decimal('100000.000000000017')}
And not every programming language offers a Decimal type, and in most of those that do, there’s usually a performance penalty associated with it, not to mention issues of interoperability and developers not knowing it exists. Financial calculations usually use integers with an implicit decimal offset (e.g., US currency amounts expressed in cents rather than dollars), while in other contexts the inherent inaccuracy of IEEE floating-point types is often a non-issue. The biggest potential problem lies in treating values that look and act kind of like numbers as numbers, e.g., Dewey Decimal classification numbers or the topic in a Library of Congress classification.¹

1. This has been on my mind lately: I discovered that LibraryThing’s sort by LoC classification seems to be broken, so I exported my library (discovering that they export as ISO 8859-1 with no option for UTF-8) and wrote a custom sorter for LoC classification codes, which I’m using to finally arrange the books on my shelves after my move last year.
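
To make the "integers with an implicit decimal offset" idea concrete, here is a minimal sketch. The helper names `to_cents` and `format_cents` are made up for illustration, and it assumes amounts arrive as decimal strings with at most two fractional digits (anything finer is truncated by `int`).

```python
from decimal import Decimal

def to_cents(amount: str) -> int:
    """Parse a decimal string such as '19.99' into an integer number of cents."""
    # Go through Decimal so the text is never rounded to a binary float.
    # Hypothetical helper: sub-cent digits, if any, are truncated here.
    return int(Decimal(amount) * 100)

def format_cents(cents: int) -> str:
    """Render an integer number of cents as a dollar string."""
    sign = "-" if cents < 0 else ""
    dollars, remainder = divmod(abs(cents), 100)
    return f"{sign}{dollars}.{remainder:02d}"

prices = ["0.10", "0.20"]
total_cents = sum(to_cents(p) for p in prices)

print(0.10 + 0.20)                # binary floats: 0.30000000000000004
print(format_cents(total_cents))  # integer cents: 0.30
```

All arithmetic happens on plain ints, so sums are exact; the decimal point only reappears when the amount is formatted for display.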

Decimal is not arbitrary precision, though. It has many of the same issues, you'll just see them in different places.

  >>> Decimal('100000.00000000000000000000017') + Decimal('1')
  Decimal('100001.0000000000000000000002')
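
As far as I can tell, the rounding above comes from the arithmetic context rather than from how the value is stored: Python's default context keeps 28 significant digits, and the exact sum needs 29. A small sketch, using a 29-digit value like the one above and setting the precision explicitly so it does not depend on the ambient context:

```python
from decimal import Decimal, localcontext

# A 29-significant-digit value like the one above; the constructor keeps it exactly.
a = Decimal('100000.' + '0' * 21 + '17')

with localcontext() as ctx:
    ctx.prec = 28                # Python's default precision
    rounded = a + Decimal('1')   # result is rounded to 28 significant digits

with localcontext() as ctx:
    ctx.prec = 50                # more room than the sum needs
    exact = a + Decimal('1')     # every digit survives

print(rounded)
print(exact)
print(rounded == exact)          # False
```

So the precision is configurable, but it is still a fixed limit that applies to the result of each operation, which is where the "same issues in different places" show up.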
But serializing/deserializing Decimal using the json module is futile.
Why is it futile? It can be serialized/deserialized perfectly through its string representation.
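
A minimal sketch of that string round trip, assuming both sides agree that the field `a` carries a decimal encoded as a JSON string (the payload shape here is just an illustration):

```python
import json
from decimal import Decimal

payload = {"a": Decimal('100000.000000000017')}

# json.dumps cannot encode Decimal natively, so fall back to str() for it.
text = json.dumps(payload, default=str)
print(text)  # {"a": "100000.000000000017"}

# On the way back the value is a JSON string; convert it explicitly.
restored = {key: Decimal(value) for key, value in json.loads(text).items()}
print(restored == payload)  # True
```

The catch is that the value travels as a JSON string, not a JSON number, so the receiver has to know which fields to convert.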
> That's true for every floating point number in every programming language you have ever used, though.

Alright, if "you" have only ever used Python. In C, for example, we have hexadecimal floating-point literals that represent all floats and doubles exactly (including infinities and NaNs that make the JSON parser fail miserably).

If you use the same syntax as OP, C’s parser will also round that literal. The existence of a hex literal for floats is orthogonal.
> we have hexadecimal floating point literals that represent all floats and doubles exactly

How do you do that?

I found a couple of resources, though I’m not sure they are about exactly what you mean:

https://stackoverflow.com/questions/65480947/is-ieee-754-rep...

https://gcc.gnu.org/onlinedocs/gcc/Hex-Floats.html

Furthermore, what exactly do you mean by “all floats and doubles exactly”?

Yes, I was talking about what is described in your resources. You can do this:

    // define a floating-point literal in hex and print it in decimal
    float x = 0x1p-8;          // x = 1.0/256
    printf("x = %g\n", x);     // prints 0.00390625
    
    // define a floating point literal in decimal and print it in various ways
    float y = 0.3;             // non-representable, rounded to closest float
    printf("y = %g\n", y);     // 0.3 (the %g format does some heuristics)
    printf("y = %.10f\n", y);  // 0.3000000119
    printf("y = %.20f\n", y);  // 0.30000001192092895508
    printf("y = %a\n", y);     // 0x1.333334p-2
So, for example, if you make a variable that has the value the parent commenter used

100000.000000000017

And then you print it.

Does it preserve the exact value?

Your question is ambiguous for two different reasons. First, this value is not representable as a floating-point number, so there's no way that you can even store it in a float. Second, once you have a float variable, you can print it in many different ways. So, the answer to your question is, irremediably, "it depends what you mean by exact value".

If you print your variable with the %a format, then YES, the exact value is preserved and there is no loss of information. The problem is that the literal that you wrote cannot be represented exactly. But this is hardly a fault of the floats. Ints have exactly the same problem:

    int x = 2.5;   // x gets the value 2
    int y = 7/3;   // same thing
So in other words, is it fair to say that this situation is not much different from what you get with Python?
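
For comparison, a rough Python sketch of the same behaviour: the decimal literal is rounded when it is parsed, and `float.hex()` / `float.fromhex()` play roughly the role of C's `%a` and hex literals.

```python
from decimal import Decimal

x = 100000.000000000017         # rounded to the nearest double when parsed
print(x == 100000.00000000001)  # True: both literals name the same double

# float.hex() writes the stored value exactly; float.fromhex() reads it back.
h = x.hex()
print(h)
print(float.fromhex(h) == x)    # True: lossless round trip

print(float.fromhex('0x1p-8'))  # 0.00390625

# Decimal(float) shows the exact value actually stored, which is not
# the decimal that was typed in the source.
print(Decimal(x))
print(Decimal(x) == Decimal('100000.000000000017'))  # False
```

Once the value is in a float, printing it in hex loses nothing; the loss happened earlier, when the decimal literal was converted.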