by turtledragonfly 1093 days ago
> If you're calculating millimeters, you want to be using .001 and .002, not 1.001 and 1.002.

I think if millimeters are what's important, one should represent them as '1' and '2', no? That's what I meant by keeping things near 1 (apologies for my clumsy language). I mean whatever unit you care about should be approximately "1 unit" in its float representation.

But yes, thank you for helping enunciate these things (:


`0.001` and `0.002` are essentially equally accurate as `1` and `2`. `1.001` and `1.002` are worse.
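A quick way to check this is to compare the gap between neighboring floats. This is a minimal sketch in double precision, assuming Python 3.9+ for `math.ulp`; the helper name `relative_spacing` is made up for illustration.

```python
import math

def relative_spacing(x):
    """Gap to the next representable double, relative to x itself."""
    return math.ulp(x) / abs(x)

# 0.001, 0.002, 1 and 2 all have relative spacing within a factor of two of each other.
for x in (0.001, 0.002, 1.0, 2.0):
    print(f"{x:>6}: relative spacing {relative_spacing(x):.3e}")

# Storing a millimeter as 1.001 moves it into the [1, 2) range, where the absolute
# gap between neighboring doubles is larger than it is around 0.001.
coarsening = math.ulp(1.001) / math.ulp(0.001)
print(f"absolute spacing at 1.001 is {coarsening:.0f}x the spacing at 0.001")
```

The absolute spacing is what matters for the `1.001` case, because the quantity of interest there is still the 0.001 part.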

In general, multiplicative scaling is useless with floating-point (*); but shifting the coordinate offset (additive, i.e. translation) can be highly useful. You want to move the range of numbers you are dealing with so that it becomes centered around zero (not near 1!). E.g. in Kerbal Space Program, the physics simulation of individual parts within the rocket needs to use a coordinate system centered on the rocket itself; it would be way too inaccurate to use the global coordinate system centered on the sun.
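A rough sketch of the translation point, assuming single precision purely for illustration (no claim about what the game actually uses). The distance and part offset are hypothetical numbers, and `f32` is a helper that emulates float32 by rounding Python doubles through `struct`.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

SUN_TO_ROCKET_M = 1.5e11  # hypothetical distance from the sun, in meters
PART_OFFSET_M = 0.25      # hypothetical gap between two rocket parts, in meters

# Global coordinates centered on the sun: both parts round to the same float32.
part_a_global = f32(SUN_TO_ROCKET_M)
part_b_global = f32(SUN_TO_ROCKET_M + PART_OFFSET_M)
global_gap = f32(part_b_global - part_a_global)

# Local coordinates centered on the rocket: the offset survives intact.
part_a_local = f32(0.0)
part_b_local = f32(PART_OFFSET_M)
local_gap = f32(part_b_local - part_a_local)

print("gap in sun-centered coordinates:   ", global_gap)
print("gap in rocket-centered coordinates:", local_gap)
```

At 1.5e11 the spacing between adjacent float32 values is on the order of kilometers, so a quarter-meter offset vanishes; near zero it is represented exactly.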

(*) The exception is if you need to keep decimal fractions exact, e.g. if dealing with money. In this case, (if a better suited decimal floating-point is unavailable) you want to scale multiplicatively to ensure a cent is 1, not 0.01.
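A small illustration of that exception, using Python doubles and the standard `decimal` module as a stand-in for "a better suited decimal floating-point"; the variable names are only for this sketch.

```python
from decimal import Decimal

# Dollars as binary floats: 0.10 and 0.20 have no exact binary representation.
dollars_exact = (0.10 + 0.20 == 0.30)

# The same amounts with one cent scaled to 1.0: small integers are exact in binary floats.
cents_exact = (10.0 + 20.0 == 30.0)

# A decimal floating-point type keeps the fractions exact without any rescaling.
decimal_exact = (Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))

print(dollars_exact, cents_exact, decimal_exact)  # False True True
```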

> `0.001` and `0.002` are essentially equally accurate as `1` and `2`. `1.001` and `1.002` are worse.

Well let me just be 100% clear: I never meant to suggest the `1.001` encoding, at any point in this exchange (:

> You want to move the range of numbers you are dealing with so that it becomes centered around zero (not near 1!)

Yes, I think I like that terminology better - "centered around" rather than "near".

The reason I didn't say 0 originally is because keeping numbers "near zero" in an absolute sense is not the goal. If your numbers are all close to 1e-8, you would do well to scale them so that "1 float unit" is the size of the thing you care about, before doing your math. I think that is what you are saying in your cents example, too. So, the goal is about what "1 unit" means, not being specifically near a certain value. That's where the "1" in my original phrasing comes from; sorry for the confusion.

I don't really agree with how you're framing things. If "1" is the size you care about, then in single precision you can use numbers up to the millions safely, and in double precision you can use numbers up to a quadrillion safely. (Or drop that 10x if you want lots of rounding slack.) You're not trying to stay near 1 or centered around anything. You're trying to limit the ratio between your smallest numbers and your biggest numbers. And it works the same way whether the unit you care about is 1 or the unit you care about is 1e-8. If you kept your smallest numbers around 1e-8 there wouldn't be any downside in terms of calculation or accuracy.

I suppose implicit in my assumptions is that if "1" is the number I care about, that's the sort of values I'm going to be working with w/regard to my target data.

So, if I am doing some +1/-1 sort of math on a bunch of numbers, and those numbers are "far away" (eg: near 1e+8 or near 1e-8), then it is better to transform those numbers near "1 space", do the math, then transform it back, rather than trying to do it directly in that far-away space.

But yes, I suppose in your phrasing, that does come down to the ratio of the numbers involved — 1 vs 1e±8. You want that ratio to be as near 1 as possible, I think is what you mean by "limit the ratio"?

Well "1" won't consistently be "the typical amount you add/subtract" and "the typical number you care about" at the same time.

Like, a bank might want accuracy of 1e-4 dollars, have transactions of 1e2-1e5 dollars, and have balances of 1e5-1e8 dollars.

That's three ranges we care about, and at most one of them can be around 1.0. But which one we pick, or picking none at all, won't affect the accuracy. The main thing affecting accuracy is the ratio between the biggest and smallest numbers, which in this case is 1e12.
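To put a number on that ratio, here is a sketch that converts it into significand bits and checks it against single and double precision. The constants come from the bank example above; `f32` is a made-up helper that emulates float32 via `struct`.

```python
import math
import struct

def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

SMALLEST = 1e-4  # accuracy wanted, in dollars
BIGGEST = 1e8    # largest balance, in dollars

# A ratio of about 1e12 needs roughly 40 bits of significand.
bits_needed = math.log2(BIGGEST / SMALLEST)
print(f"ratio needs about {bits_needed:.1f} bits")

# float32 carries 24 bits: adding the smallest amount to the biggest balance is lost.
balance32 = f32(BIGGEST)
lost_in_single = f32(balance32 + f32(SMALLEST)) == balance32

# float64 carries 53 bits: the same addition changes the balance.
kept_in_double = (BIGGEST + SMALLEST) != BIGGEST

print("lost in float32:", lost_in_single)
print("kept in float64:", kept_in_double)
```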

If you set pennies to be 1.0, or basis points to be 1.0, or a trillion dollars to be 1.0, you'd get the same accuracy. Let's say some calculation is off by .0000003 pennies from perfect math. All those versions will be off by .0000003 pennies. (Except that there might be some jitter in rounding based on how the powers align, but let's ignore that for right now.)
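A hedged sketch of that claim: store one hypothetical amount in float32 under several choices of what 1.0 means, and look at the relative rounding error. The amount, the unit table and the helper names are assumptions for illustration only.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def relative_error(value, unit):
    """Relative error from storing `value` dollars in float32 when `unit` dollars is 1.0."""
    scaled = value / unit
    return abs(f32(scaled) - scaled) / scaled

AMOUNT = 123.45  # hypothetical dollar amount
UNITS = {"penny": 0.01, "dollar": 1.0, "trillion dollars": 1e12}

# The error differs a little between units (the rounding jitter), but it is
# always bounded by 2**-24, about 6e-8, whatever the unit is.
errors = {name: relative_error(AMOUNT, unit) for name, unit in UNITS.items()}
for name, err in errors.items():
    print(f"1.0 = one {name}: relative error {err:.2e}")

# With a power-of-two unit there is no jitter at all: the stored bits are the
# same significand with a shifted exponent.
rescaled = f32(AMOUNT * 2.0**-40) * 2.0**40
print("power-of-two rescaling is exact:", rescaled == f32(AMOUNT))
```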

There's something I'm not quite getting, here.

Let's take your bank example, with 32-bit floats. Since you say it doesn't matter, let's set "1" to be "1 trillion dollars" (1e12). A customer currently has a balance of 1 dollar, so it's represented as 1e-12. Now they make 100 deposits, each of a single dollar. If we do these deposits one-at-a-time, we get a different result than if we do a single deposit of $100, thanks to accumulated rounding errors. Ok, fine.

Now we choose a different "1" value. You say "which one we pick, or picking none at all, won't affect the accuracy," but I think in this case it _does_? In this second case, we set "1" to be 1 dollar, and we go through the same deposits as above. In this case, both algorithms (incremental and +$100 at once) produce identical results — 101, as expected.
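For anyone who wants to try the two scenarios, here is a minimal sketch. It emulates 32-bit floats by rounding Python doubles through `struct` after every operation; `f32` and the two deposit functions are hypothetical names. It also tries a power-of-two unit (2**-40, close to 1e-12) as a third case, which is an addition for comparison and not part of the scenario as stated.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def deposit_one_at_a_time(dollar, count=100):
    """Start with a $1 balance and add $1 `count` times, rounding to float32 each step."""
    one = f32(dollar)
    balance = one
    for _ in range(count):
        balance = f32(balance + one)
    return balance

def deposit_at_once(dollar, count=100):
    """Start with a $1 balance and add a single deposit of `count` dollars."""
    return f32(f32(dollar) + f32(count * dollar))

# `dollar` is the float value that represents one dollar under each choice of unit.
for label, dollar in (("1.0 = $1 trillion", 1e-12),
                      ("1.0 = $1", 1.0),
                      ("1.0 = $2**40", 2.0**-40)):
    step = deposit_one_at_a_time(dollar)
    once = deposit_at_once(dollar)
    print(f"{label:>18}: incremental={step!r} at_once={once!r} equal={step == once}")
```

In this emulation the two methods disagree when one dollar is 1e-12 and agree when it is 1.0 or 2**-40. That suggests the difference comes from 1e-12 not being exactly representable in binary, not from the magnitude of the numbers as such.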

I agree that there can be multiple ranges that we care about, which can be tricky, but I don't agree that it doesn't matter what "1" we pick.

But I am probably misinterpreting you in some way (: