	by mschuetz 2589 days ago
	Because number is practically the same as double, and you can't just go and change double values to integer values. Also, doubles are fast. BigInts are not.


lucb1e 2589 days ago

Python has only Number though, and that can get as large as fits in your memory. Not sure why they didn't ship this with decimals and enabled it by default in javascript.

I once wrote a replacement for the + operator in javascript because computers can't do proper addition (0.1+0.2!=0.3). It was basically remembering the sign and handling some other notation like 1e100, splitting on the dot, and adding them up the normal way (as integers). To merge the result of the two additions, take care of the carry (if any) and concatenate the two numbers again with a dot.
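
For illustration, a minimal Python sketch of the split-on-the-dot addition described above. It is not the original code: it assumes non-negative plain decimal strings (no sign, no 1e100 notation), and `add_decimal_strings` is a hypothetical name.

```python
def add_decimal_strings(a: str, b: str) -> str:
    """Add two non-negative decimal strings exactly, e.g. "0.1" + "0.2"."""
    # Split each operand into integer and fractional digit strings.
    a_int, _, a_frac = a.partition(".")
    b_int, _, b_frac = b.partition(".")

    # Pad the fractional parts to the same width so the digits line up.
    width = max(len(a_frac), len(b_frac))
    a_frac = a_frac.ljust(width, "0")
    b_frac = b_frac.ljust(width, "0")

    # Add the fractional parts as integers; anything beyond `width`
    # digits is the carry into the integer part.
    frac_sum = int(a_frac or "0") + int(b_frac or "0")
    carry, frac_sum = divmod(frac_sum, 10 ** width)

    int_sum = int(a_int or "0") + int(b_int or "0") + carry
    if width == 0:
        return str(int_sum)
    # Re-join with the dot, restoring leading zeros of the fraction.
    return f"{int_sum}.{frac_sum:0{width}d}"
```

With this sketch, add_decimal_strings("0.1", "0.2") gives "0.3", because no binary rounding is involved at any step.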

So from my primitive understanding, if you're already storing a variable-length number (a space win in most cases, compared to the current 8-byte double), you might as well store two variable-length numbers. Something like ABxx xxxx for the first byte, where if A=0 there is no decimal part (no second variable-length number) and if B=0 there is no next byte (the variable length part of it), then use Bxxx xxxx for each following byte. Then you have arbitrary size and decimals with perfect addition (no floating point approximations anymore).
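
A rough Python sketch of how that byte layout might look for the integer part only. The low-bits-first ordering, the lack of a sign, and the names `encode_varint` and `decode_varint` are assumptions made here, not details from the comment.

```python
def encode_varint(value: int, has_fraction: bool = False) -> bytes:
    """Encode a non-negative int as ABxx xxxx followed by Bxxx xxxx bytes."""
    # First byte: A (bit 7) = a fractional number follows,
    #             B (bit 6) = another byte follows, then 6 payload bits.
    first = value & 0x3F
    value >>= 6
    out = [(0x80 if has_fraction else 0) | (0x40 if value else 0) | first]
    # Following bytes: B (bit 7) = another byte follows, then 7 payload bits.
    while value:
        chunk = value & 0x7F
        value >>= 7
        out.append((0x80 if value else 0) | chunk)
    return bytes(out)


def decode_varint(data: bytes):
    """Return (value, has_fraction, bytes_consumed)."""
    first = data[0]
    has_fraction = bool(first & 0x80)
    more = bool(first & 0x40)
    value = first & 0x3F
    shift = 6
    pos = 1
    while more:
        byte = data[pos]
        more = bool(byte & 0x80)
        value |= (byte & 0x7F) << shift
        shift += 7
        pos += 1
    return value, has_fraction, pos
```

Values below 64 fit in a single byte under this layout, and every further byte adds 7 bits.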

Would it be that much slower, that it's an awful idea to do by default? Like, would it be noticeable on an average website with the usual godawful amount of javascript? Python gets away with it, so that would be weird. And you could still introduce special syntax (like, I don't know, suffixing an n maybe) to use old and faster primitive types for those who really need that.

umanwizard 2589 days ago

Decimals are conceptually similar to binary numbers, just with a different base. So no, they can’t represent all rationals accurately, and yes, they involve approximation.

For example, you can’t represent 1/3 in decimal, for the exact same reason you can’t represent 1/5 in binary. (1/3 is the infinitely repeating decimal 0.3333..., whereas 1/5 is the infinitely repeating binary 0.00110011001100...)

In general if you need to perfectly represent rational arithmetic you shouldn’t be using decimal or binary; you should have a type with an integer numerator and integer denominator. I don’t see the value of making arithmetic dramatically slower without actually solving the approximation issue.
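
A small Python illustration of that distinction, using the standard `decimal` and `fractions` modules. It assumes the default decimal context of 28 significant digits.

```python
from decimal import Decimal
from fractions import Fraction

# Decimal fixes the 0.1 + 0.2 case because 1/10 terminates in base 10...
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")

# ...but 1/3 does not terminate in base 10, so each quotient is rounded
# and the two sides end up differing in the last digit.
third = Decimal(1) / Decimal(3)
two_thirds = Decimal(2) / Decimal(3)
assert third + third != two_thirds

# A numerator/denominator pair keeps every rational exact.
assert Fraction(1, 3) + Fraction(1, 3) == Fraction(2, 3)
assert Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10)
```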

If you are ever calling == on floating point numbers, you are doing something seriously wrong. Floating point numbers are supposed to be used for scientific and numerical computations where there is a notion of measurement error, and “exactly equal” is nonsense, so yes, speed is the entire point.
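
In Python, the usual alternative to == on floats is a tolerance check such as `math.isclose`. The tolerance below is an arbitrary choice for this sketch and would depend on the measurement error in a real computation.

```python
import math

a = 0.1 + 0.2

# Exact equality fails because the two sides carry different rounding errors.
assert a != 0.3

# Compare within a relative tolerance instead.
assert math.isclose(a, 0.3, rel_tol=1e-9)
```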

That’s why making a language without integers is such a serious mistake.

lucb1e 2589 days ago

> [first two paragraphs]

Do you really expect I mentioned this example, mentioned I wrote some code that solves this issue, and still never looked up or came across an explanation of why most programming languages behave this way?

> If you are ever calling == on floating point numbers, you are doing something seriously wrong.

Not sure if the 'you' is actually directed to me or if it could be replaced with 'one', but since I mention that it would be nice to do so, I guess I should feel addressed. Thanks for saying I'm doing things seriously wrong, that really helps.

Your comment completely steers any further comments down this thread towards explaining to me why floating point addition is fast but imprecise, rather than what I mentioned that I am actually wondering about: is it that much slower to do arbitrarily large integers by default (separate from the decimal issue), and secondarily solve the decimal issue at the same time (given the example I mention of the method that solves it, at least for addition, in roughly O(2))?

umanwizard 2589 days ago

> Do you really expect I mentioned this example, mentioned I wrote some code that solves this issue, and still never looked up or came across an explanation of why most programming languages behave this way?

Well, what you described does not actually solve the issue despite you claiming that it has, so I thought you might be confused. Which is not an insult -- many people are confused about this issue.

And you appear to have misunderstood my comment, which is not about explaining to you why binary arithmetic is imprecise, which you obviously already know. It is about explaining that decimal arithmetic is also imprecise, for the exact same reason, which is something that far fewer people understand.

Your "fix" makes it so that 1/10 + 2/10 == 3/10, but it still doesn't make it so that 1/3 + 1/3 == 2/3. So how is it actually "precise"?

To answer your question about speed: yes, doing things with arbitrarily-sized integers is much slower than doing them with floats (or normal integers for that matter). In the best case, you add at least one branch to every arithmetic operation. And a binary-coded decimal scheme like you described would be even slower still.

It doesn't really matter whether it would make the average website slower, since the average website should not be using floats (OR binary-coded decimals like your scheme) in the first place except for calculating layouts or other numeric calculations where asking whether 0.1 + 0.2 == 0.3 would never come up. For discrete computations they should be using integers -- that's what integers are for.

lucb1e 2589 days ago

> Your "fix" makes it so that 1/10 + 2/10 == 3/10, but it still doesn't make it so that 1/3 + 1/3 == 2/3. So how is it actually "precise"?

Fair point! I don't think anyone ever put it quite this way. I mean, I knew that 1/3 cannot be represented in decimal and that decimal, like binary, is imprecise for the exact same reason, but I don't think anyone asked me about the definition of precise and why I think my version of addition fits that definition of precise better :). I think the answer is that, in code, we type in decimal: 0.1+0.2 and not 0b0.1+0b0.10 (if that would even be valid syntax). We work in base 10 most of the time, so we know that operations on 1/3 can not have infinite precision. But that's just something I came up with on the spot, I'm not sure that this is the true reason why it feels more correct.

> It doesn't really matter whether it would make the average website slower, since the average website should not be using floats

Fair enough about floats, but why about arbitrarily large integers? The feature being introduced could have been introduced as 'works out of the box' instead of 'opt in using the n suffix'.

Actually, I just realized it would probably break code that does bit shifts. Maybe that's why bigint is not the default?

umanwizard 2589 days ago

> I think the answer is that, in code, we type in decimal: 0.1+0.2

I think you are exactly right. People think of decimals as being the "actual", "primary", "fundamental" numbers, and binary as being an imperfect representation of those. Whereas in reality, both binary and decimal are imperfect representations of rational numbers, and we only think of decimal as being more fundamental because of our writing system.

> Fair enough about floats, but why about arbitrarily large integers

How exactly would you represent them? The best way I can think of is:

    struct BigInt {
        int64_t first_64;
        char *data; // pointer to extra, dynamically allocated data
        int data_len;
    };

This would allow you to avoid doing a dynamic allocation for the most common case of being under 64 bits. And the addition algorithm would probably special case that too, and look something like this:

    BigInt add(BigInt x, BigInt y) {
        if (x.data_len == 0 && y.data_len == 0) {
            int64_t new_val;
            // __builtin_add_overflow (GCC/Clang) returns true on signed overflow
            if (__builtin_add_overflow(x.first_64, y.first_64, &new_val)) {
                return add_slow_path(x, y);
            }
            // field order: first_64, data, data_len
            return {new_val, nullptr, 0};
        } else {
            return add_slow_path(x, y);
        }
    }

As you can see, there is a ton of complexity here, even just for the simplest possible case, replacing what was before literally just one instruction, e.g. addq %rbx, %rcx. Also, each number is now a struct with 20 bytes of fields (24 once padded for alignment on a typical 64-bit ABI) instead of 8 bytes. So each 64-byte cache line can now fit only 2 or 3 values instead of 8. Because of all this, it would be dramatically slower.

This is just for the easiest case of no overflow! If you overflow and have to then go allocate memory dynamically and loop over it, it would of course be even worse.

recursive 2589 days ago

> Python has only Number though, and that can get as large as fits in your memory.

This is not true.

    >>> type(1)
    <class 'int'>
    >>> type(1.5)
    <class 'float'>

lucb1e 2589 days ago

Oh, my bad. I thought those were abstracted away.

Still though, any int can get as large as you like by default, no weird -n suffix (that I never saw in any other language -- just like most of Javascript's other recently added syntax, by the way, it's the new Perl).

I do wonder where I got this notion of Number. Is there some other language that has this?

recursive 2589 days ago

I think stuff like Wolfram Language and Mathematica probably have some "universal" numeric type.

However, I don't know a single mainstream application programming language that has a single numeric type that can handle: arbitrarily large integers, floating point values, and correct decimal arithmetic (0.1 + 0.2 == 0.3). I have at least a passing familiarity with probably about a dozen general purpose programming languages, and none of them can do it. If anyone knows of one, I'd be interested to learn about it.

dfox 2589 days ago

The common name for what you call "universal numeric type" is "number tower". Most lisp dialects have something like that. What that means is that you have classes for small integers (fixnum), arbitrary precision integers (bignum), fractions, floats, and even complex numbers along with the appropriate abstract base classes (e.g. integer, rational, real...) and arithmetic operations transparently use the most appropriate type for the result, i.e. the result of "1 / 10" comes out as "1/10" (of type fraction) and not as float "0.1000...something".

Python 3 has mostly the same approach to number types.

lucb1e 2589 days ago

How odd to notice that my brain really messed that number type up. I could swear Python has a type called (capitalized) Number and that this handles arbitrarily large numbers as well as decimals. Seems like that 'memory' is completely fictional.

vaccarium 2589 days ago

Python does actually have an abstract type for numbers, and it is called Number: https://docs.python.org/3/library/numbers.html.
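
A short sketch of what that abstract hierarchy looks like in practice, using only the standard `numbers` and `fractions` modules. One difference from the Lisp behaviour described above is included: dividing two ints with / yields a float unless a Fraction is involved.

```python
import numbers
from fractions import Fraction

assert isinstance(1, numbers.Integral)
assert isinstance(2 ** 200, numbers.Integral)   # big ints are still plain int
assert isinstance(Fraction(1, 10), numbers.Rational)
assert isinstance(1.5, numbers.Real)
assert isinstance(1 + 2j, numbers.Complex)

# Everything above is also a numbers.Number.
values = (1, 2 ** 200, Fraction(1, 10), 1.5, 1 + 2j)
assert all(isinstance(v, numbers.Number) for v in values)

# True division of two ints produces a float, not a fraction...
assert type(1 / 10) is float
# ...while starting from a Fraction keeps the result exact.
assert Fraction(1) / 10 == Fraction(1, 10)
```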

mschuetz 2589 days ago

> I once wrote a replacement for the + operator in javascript because computers can't do proper addition (0.1+0.2!=0.3).

That's normal floating point math behaviour and fine in most use cases. You can't fix this for add, sub, mul and div without serious performance implications.

> Would it be that much slower, that it's an awful idea to do by default?

Yes. Easily an order of magnitude slower, perhaps two. The fixed size of uint32, float, double, etc. is an important property that allows computations to be fast. And since the size is fixed, you have to accept trade-offs in how or which numbers can be represented. Also, as someone else already mentioned, you can't even store 1 / 3 as a single numeric value. You'd have to store it as a rational number. Things are going to get super complicated once you combine rational numbers in computations and complex formulas.

I'm doing lots of number crunching tasks with javascript, with performance of up to almost 50% of equivalent C++ code. If js had used a non-natively supported number format by default, it would have been useless for me.

lucb1e 2589 days ago

I did some testing with pypy, which also works with arbitrarily large integers but iirc does JIT instead of interpretation (like cpython would do), so that should be similar to JS in V8 except that it has arbitrarily large integers.

    a = Math.pow(2, 1023)
    t = new Date().getTime()
    for (var i = 0; i < 1e7; i++) {
        a += i;
    }
    console.log(new Date().getTime() - t);

vs

    import time

    a = pow(2, 1024)
    t = time.time()
    for i in range(int(1e7)):
        a += i
    print(time.time() - t)

Both ran a bunch of times: pypy does it in 279ms and nodejs in 250. I chose 1024 for Python because that is where JS starts to return infinity, so the JS code does operations on a number just below that. The time seems to be spent in the loop, as an empty loop or a loop doing a+=0 is 20x faster.

Lowering the exponent to 100, JS spends 269ms and pypy 142. Not sure why that is, but having arbitrarily large integers doesn't seem to make this arithmetic any slower.

I don't know how to quickly toy around with fraction-based floats, but at least for arbitrarily large integers, I'm not sure why we're going to have to put up with new syntax.

umanwizard 2588 days ago

25 nanoseconds is much longer than a normal double-precision addition, loop counter increment, and conditional jump back to the top of a loop should take, so there's something other than the time taken by additions that's going on in your benchmark. I'm not a Node.JS expert, but I suspect it's not getting JITted properly, or getting poorly optimized if so.

I tried in C:

    #include <math.h>
    #include <stdio.h>
    int main()
    {
      double a = pow(2, 100);
      for (double i = 0; i < 1e7; ++i) {
        a += i;
      }
      printf("%f\n", a);
    }

and timed it. The time taken was 17ms.

mschuetz 2588 days ago

Rearranged your js sample a bit, now it runs in ~26ms (first time) and ~11ms (subsequent times) instead of ~220ms in the chrome developer console.

    {
        let a = Math.pow(2, 1023);
        let t = performance.now();
        let max = 1e7;
        for (let i = 0; i < max; i++) {
            a += i;
        }
        console.log(performance.now() - t);
    }

The main problem was that you should declare a with let.

That benchmark is a bit strange/flawed anyway. You're initializing a as pow(2, 1023), then adding numbers in the loop. But since a is already such a large double value, the result won't change. The numbers you add are too small to make a dent in the value of a, likely because a isn't an integer. It's a double with a limited precision for large integer values.
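
The absorption effect is easy to check in Python, whose float is the same IEEE 754 double. This sketch assumes Python 3.9 or later for `math.ulp`, and uses a shorter loop than the benchmark to keep it quick.

```python
import math

big = 2.0 ** 1023

# Neighbouring doubles near 2**1023 are 2**971 apart, so adding a
# comparatively tiny number rounds straight back to the same value.
assert math.ulp(big) == 2.0 ** 971
assert big + 1e7 == big

# The same happens at 2**100, where the spacing is 2**48 (about 2.8e14).
a = 2.0 ** 100
for i in range(100_000):
    a += i
assert a == 2.0 ** 100
```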

thaumasiotes 2589 days ago

> I once wrote a replacement for the + operator in javascript because computers can't do proper addition (0.1+0.2!=0.3).

...this is a terrible, bad-faith way of making a complaint that isn't even valid. Computers do do proper addition. And they handle the exact example you give perfectly, if that's what you want. JavaScript might have difficulty with that problem, but that's because, unlike the computers it's implemented for, it has no integers.

What do you think "proper addition" would involve if I asked you to tell me 1/7 + 1/3, on paper?

tialaramex 2589 days ago

The correct answer to someone who insists upon "proper addition" is 10/21 and for it to be annoyingly slow so that they learn to be sure if they really care about "proper addition" or are just being awkward.

If you mean how should the machine do that, it can find the least common multiple of 3 and 7 (which is 21) and then convert both fractions to be in that denominator, then simplify if possible. This is, as I said, annoyingly slow, but if you want "proper" answers that's what you got.
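
A minimal Python sketch of that procedure, assuming positive denominators. `add_fractions` is a hypothetical helper written for this example; the standard `fractions.Fraction` type does the same job.

```python
from math import gcd


def add_fractions(n1, d1, n2, d2):
    """Add n1/d1 + n2/d2 and return (numerator, denominator) in lowest terms."""
    # Least common multiple of the two denominators.
    common = d1 * d2 // gcd(d1, d2)
    # Rescale both numerators to the common denominator and add.
    numerator = n1 * (common // d1) + n2 * (common // d2)
    # Simplify by dividing out the greatest common divisor.
    g = gcd(numerator, common)
    return numerator // g, common // g
```

For 1/7 + 1/3 the common denominator is 21, the numerators become 3 and 7, and the result is (10, 21).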

I wouldn't bother because I'm aware that _Almost All Real Numbers are Normal_ and so they're usually completely impossible to express in this fashion anyway and we should stop our foolish pretence that you can add non-integers together and expect to get "correct" answers just because it can be done for some easy cases.

lucb1e 2589 days ago

I'm sorry but where did I insist upon proper addition? I'll annotate the parts of my post that might be mistaken for it:

> a replacement for the + operator in javascript because computers can't do proper addition (0.1+0.2!=0.3).

Just saying they can't do it (I edited this: first I said that JS doesn't do it properly, but I thought that was rather too narrow. I guess 'computers' is too broad again. Pick a name, you know what I mean)

> [an explanation of what worked for me some years ago] Then you have arbitrary size and decimals with perfect addition (no floating point approximations anymore).

Again, just mentioning that this would solve it for addition, not saying this is the perfect way for life, the universe, and everything.

> Would it be that much slower, that it's an awful idea to do by default?

See, I'm not insisting on anything, I'm wondering and asking.

> Python gets away with it,

It doesn't do correct decimal addition either, so I'm not even focusing on resolving floating point inaccuracies, I'm more interested in "if it would be so slow to do arbitrary precision integers---oh and by the way, wouldn't it also solve this addition thing?"

Now, you also didn't exactly say that I was insisting, you said "someone who insists". So maybe this only applies to your parent comment. But every time I bring it up, people stumble over each other to tell me why it is this way. I already know why it is this way. There are a lot of other words in the comment that one could reply to, and it's rather frustrating that it's completely overshadowed - every time - by people ignoring everything except those twelve magic characters: 0.1+0.2!=0.3.

thaumasiotes 2589 days ago

>> a replacement for the + operator in javascript because computers can't do proper addition (0.1+0.2!=0.3).

> Just saying they can't do it (I edited this: first I said that JS doesn't do it properly, but I thought that was rather too narrow. I guess 'computers' is too broad again. Pick a name, you know what I mean)

You can say it, but that won't make it true. They can do it, and they do do it. Your comment is so much nonsense. The algorithm computers use to add 0.1 and 0.2 is the same algorithm that you use, which is, unsurprisingly, why they produce correct results.

> There are a lot of other words in the comment that one could reply to, and it's rather frustrating that it's completely overshadowed - every time - by people ignoring everything except those twelve magic characters: 0.1+0.2!=0.3.

I'll point out again that I'm focusing on your completely unjustified claim that "computers can't do proper addition", which you didn't bother to include in "those twelve magic characters" that everyone is complaining about.

mschuetz 2588 days ago

>> You can say it, but that won't make it true. They can do it, and they do do it. Your comment is so much nonsense. The algorithm computers use to add 0.1 and 0.2 is the same algorithm that you use, which is, unsurprisingly, why they produce correct results.

Are you aware of floating point math? That's what most languages, and almost all languages aimed at performance, use. It's defined in the IEEE 754 standard and supported on a hardware level in many devices.

Here is a listing of the result of 0.1 + 0.2 in various languages: https://0.30000000000000004.com/

As the URL already indicates, most languages, including C, Rust, C++, Java, Javascript, Clojure, FORTRAN, Python, etc. evaluate this to 0.30000000000000004. I'm fine with this, I need fast rather than precise math. But it's not "correct".
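
To see where the trailing 4 comes from, here is a short Python check. `Decimal(float)` converts the stored binary value exactly, so it shows which way each literal was rounded.

```python
from decimal import Decimal

total = 0.1 + 0.2
assert total != 0.3
assert repr(total) == "0.30000000000000004"

# The doubles nearest to 0.1 and 0.2 are both slightly too large,
# while the double nearest to 0.3 is slightly too small.
assert Decimal(0.1) > Decimal("0.1")
assert Decimal(0.2) > Decimal("0.2")
assert Decimal(0.3) < Decimal("0.3")

# The rounded sum lands on the next double above the one used for 0.3.
assert Decimal(total) > Decimal(0.3)
```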