Hacker News new | ask | show | jobs
by lucb1e 2588 days ago
> Your "fix" makes it so that 1/10 + 2/10 == 3/10, but it still doesn't make it so that 1/3 + 1/3 == 2/3. So how is it actually "precise"?

Fair point! I don't think anyone ever put it quite this way. I mean, I knew that 1/3 cannot be represented in decimal and that decimal, like binary, is imprecise for the exact same reason, but I don't think anyone asked me about the definition of precise and why I think my version of addition fits that definition of precise better :). I think the answer is that, in code, we type in decimal: 0.1+0.2 and not 0b0.1+0b0.10 (if that would even be valid syntax). We work in base 10 most of the time, so we know that operations on 1/3 can not have infinite precision. But that's just something I came up with on the spot, I'm not sure that this is the true reason why it feels more correct.

> It doesn't really matter whether it would make the average website slower, since the average website should not be using floats

Fair enough about floats, but why about arbitrarily large integers? The feature being introduced could have been introduced as 'works out of the box' instead of 'opt in using the n suffix'.

Actually, I just realized it would probably break code that does bit shifts. Maybe that's why bigint is not the default?

1 comments

> I think the answer is that, in code, we type in decimal: 0.1+0.2

I think you are exactly right. People think of decimals as being the "actual", "primary", "fundamental" numbers, and binary as being an imperfect representation of those. Whereas in reality, both binary and decimal are imperfect representations of rational numbers, and we only think of decimal as being more fundamental because of our writing system.

> Fair enough about floats, but why about arbitrarily large integers

How exactly would you represent them? The best way I can think of is:

    struct BigInt {
        int64_t first_64;
        char *data; // pointer to extra, dynamically allocated data
        int data_len;
    };
This would allow you to avoid doing a dynamic allocation for the most common case of being under 64 bits. And the addition algorithm would probably special case that too, and look something like this:

    BigInt add(BigInt x, BigInt y) {
        if (x.data_len == 0 && y.data_len == 0) {
            int64_t new_val = x.first_64 + y.first_64;
            if (overflow_signaled()) {
                return add_slow_path(x, y);
            }
            return {new_val, 0, nullptr};
        } else {
            return add_slow_path(x, y);
        }
    }
As you can see, there is a ton of complexity here, even just for the simplest possible case. Replacing what was before literally just one instruction, e.g. addq %rbx, %rcx . Also, each number is represented by a 20-byte struct now instead of 8. So now each 64-byte cache line can fit only 3 values instead of 8. Because of all this, it would be dramatically slower.

This is just for the easiest case of no overflow! If you overflow and have to then go allocate memory dynamically and loop over it, it would of course be even worse.