Hacker News
by sj4nz 1244 days ago
It could be that 8-bit posits are enough. Has that been done at scale? I do not know.

Posits aren't the answer to any question worth asking.
Why not?
Original posits are variable width, making them nearly useless for high performance parallel computations. Later versions don't add anything of use for low precision neural networks, and the lack of hardware support anywhere makes them too slow for anything other than toying around.

See also http://people.eecs.berkeley.edu/~wkahan/UnumSORN.pdf and https://www.youtube.com/watch?v=LZAeZBVAzVw

So for reference to everyone, the way a fixed-width posit works is by unary-encoding the high bits of the exponent, then encoding the low bits if space is available, then encoding mantissa into whatever bits are left.

Near 1.0 they look like a normal float. Maybe they have an extra bit or two of precision. Then as you get closer to 0 or infinity, every time the exponent field would run out you instead reset it, at the cost of one bit of precision.
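To make the encoding described above concrete, here is a minimal sketch of a decoder in Python. The function name decode_posit and its defaults (n=8, es=2, which I believe matches the 2022 posit standard) are my own choices for illustration, not anything from a real library.

```python
from fractions import Fraction

def decode_posit(bits: int, n: int = 8, es: int = 2):
    """Decode an n-bit posit pattern to an exact Fraction (None for NaR).

    Hypothetical helper for illustration; n and es defaults are assumptions.
    """
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (n - 1):
        return None  # NaR: the single value replacing NaN and +-infinity
    negative = bits >> (n - 1)
    if negative:
        bits = (-bits) & mask  # negative posits are two's complement
    body = format(bits, f"0{n}b")[1:]  # everything after the sign bit
    first = body[0]
    run = len(body) - len(body.lstrip(first))  # length of the unary run
    k = run - 1 if first == "1" else -run      # regime value
    rest = body[run + 1:]                      # skip the terminating bit
    exp_bits, frac_bits = rest[:es], rest[es:]
    # Exponent bits that fell off the end are treated as zeros.
    exp = int(exp_bits, 2) << (es - len(exp_bits)) if exp_bits else 0
    frac = Fraction(int(frac_bits, 2), 1 << len(frac_bits)) if frac_bits else Fraction(0)
    value = (1 + frac) * Fraction(2) ** (k * (1 << es) + exp)
    return -value if negative else value
```

With these defaults, decode_posit(0x40) is 1, decode_posit(0x44) is 3/2 (three fraction bits available), and decode_posit(0x7F) is 2**24 (the regime has eaten every other bit).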

The main benefit is that they have a lot more dynamic range than an equivalent float. The downside is the further you go into that range, the less accurate your numbers are.

They can also trade off accuracy near 1 for accuracy far from 1. Different exponent widths represent different points on this scale.
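A rough way to see that trade-off is to count how many fraction bits are left at a given power of two. This is a sketch under my own assumptions: posit_fraction_bits is a made-up helper, and it does not check that the scale is actually inside the format's range.

```python
def posit_fraction_bits(scale: int, n: int = 16, es: int = 2) -> int:
    """Fraction bits left for values in [2**scale, 2**(scale+1)).

    Hypothetical helper; assumes scale is within the posit's range.
    """
    k = scale >> es  # regime value: floor(scale / 2**es)
    regime_len = k + 2 if k >= 0 else -k + 1  # unary run plus terminator
    return max(0, n - 1 - regime_len - es)    # minus the sign bit too

# IEEE binary16 has a flat 10 fraction bits for scales -14..15.
for scale in (0, 4, 8, 16):
    print(scale, posit_fraction_bits(scale, es=0), posit_fraction_bits(scale, es=2))
```

For 16 bits this gives 13 fraction bits near 1 with es=0 versus 11 with es=2, but at 2**8 it is 5 versus 9, which is the "different points on this scale" idea.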

At smaller bit sizes, they have a good mix of preserving precision while being quite hard to overflow.

Overall they're not very different from standard floating point. Most of the bold claims from Gustafson are outside the scope of the posit itself.

-

Naively I would expect them to be kind of useful, if we ignore the issue of hardware support. Do neural networks need to represent extreme values with just as much precision as non-extreme values? And is the risk of overflow mitigated sufficiently at the same time? If so, yeah, the whole idea is useless.

As far as I'm aware, the original unum proposals that Kahan was arguing against have been discarded by Gustafson and all that remains now for current advocacy is the posit type, which is essentially a floating-point type that's fixed-width with a variable-width exponent.

I don't know what the hardware costs of posits look like, since I'm not a hardware engineer, so I can't comment on that. For larger sizes, posits seem to be inferior to IEEE 754 floating-point. For smaller sizes (say 16-bit and smaller), posits may work better, as the limited size means that IEEE 754's scale-invariant nature [1] isn't as relevant, and packing more distinct numbers into the same bitwidth is more valuable [2].

[1] Put simply, in an IEEE 754 number, it doesn't matter if you measure your distance in nanometers, meters, or light-years--you'll get the same relative error either way. This is emphatically not the case in posits, where your relative error depends on the scale of the numbers.
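A quick way to check the IEEE half of that claim is Python's math.ulp (available since 3.9). The helper name relative_spacing and the sample values are just mine for illustration; this covers normal doubles only, not subnormals.

```python
import math

def relative_spacing(x: float) -> float:
    """Gap to the next double, relative to x (hypothetical helper)."""
    return math.ulp(x) / x

# One meter expressed in nanometers, meters, and (roughly) light-years.
for x in (1e9, 1.0, 1.057e-16):
    print(x, relative_spacing(x))
```

For any normal double the result stays between 2**-53 and 2**-52, regardless of the unit you picked.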

[2] Posits combine ±infinity and NaN into a single value, and also do away with -0.0. From a numerical perspective, this is actually pretty cringe--there's a useful distinction there (and Kahan's talk gives some examples here)--but by the time you're at small bitwidths, you're likely limiting yourself to situations where the utility of these special values is questionable.

As to [2] I am very skeptical of the value of all the NaNs, -0 and Infs floating point has. NaN breaks x==x which is a pretty fundamental relationship for numbers to have. +-Inf sound useful in theory, but in practice they rarely give you a more useful result than NaN or the maximum/minimum value of your type (returning Inf on overflow has infinitely more error than returning the largest positive value, and if that isn't meaningful then an Inf probably wasn't either). Once you've gotten rid of -Inf, it becomes clear that -0.0 is a mistake. It breaks the identity 0+x==x and 0-x==-x. Furthermore, IEEE specifies sqrt(-0.0)==-0.0 and log(-0.0)==-Inf which are both nonsensical if you consider -0.0 as a limit from the negatives. Floats also have the unfortunate property that inv(x) can be infinite for finite x.
The value of -0 as distinguished from +0 has a few uses. The most obvious one is preserving sign in the case of overflow. A less obvious use case is handling branch cuts. There are uses in a few more cases: I've heard it's occasionally useful in things like coordinate systems, since something like "0°5'3" W" can be stored as (-0.0, 5, 3) after explosion and still display correctly. It's definitely niche, but it does have its uses.

Returning a distinct value that retains the fact that it overflowed is quite useful--if you get that result out of the computation, you know you overflowed the computation rather than silently getting a meaningless result. Note in particular that infinities end up being sticky values: once a value goes infinite, it tends to stay infinite, which isn't true for largest finite values. Distinguishing between various kinds of "invalid" values turns out to be moderately useful in practice--I've used infinities a couple of times in my own code.
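A small sketch of the stickiness point, using plain Python floats (IEEE doubles). The follow-up arithmetic is arbitrary; I picked it only to show the difference between overflowing to infinity and clamping to the largest finite value.

```python
import math
import sys

big = sys.float_info.max

# Overflow to infinity: later arithmetic keeps the evidence.
overflowed = big * 2                    # inf
later_inf = overflowed / 1e300 - 1e10   # still inf

# Saturating at the largest finite value: the evidence disappears.
clamped = big                           # pretend overflow clamps instead
later_clamped = clamped / 1e300 - 1e10  # an ordinary-looking finite number

print(math.isinf(later_inf), math.isfinite(later_clamped))
```

The clamped result ends up around -1e10, which looks perfectly plausible and tells you nothing about the overflow upstream.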

NaNs are useful in representing a different kind of error than overflowed computation. Now there is a lot of room to criticize IEEE 754 here: "x != x" was quite frankly a mistake (basically the primary reason for it was the creators wanted to make testing for NaN easier than calling isnan(x)...). sNaNs are of course an abomination that just makes things worse. Multiple NaN payloads were originally intended (in part) to let developers debug the sources of NaNs, but this requires support that never really materialized. However, NaN payloads did find new use in making NaN-boxing a useful technique, and dedicating an entire exponent to special values simplifies several numerical analysis lemmas.
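For anyone who hasn't run into it, this is what the "x != x" behavior looks like in Python. I'm producing the NaN with inf - inf because 0.0 / 0.0 raises an exception in Python rather than returning NaN.

```python
import math

nan = math.inf - math.inf  # an invalid operation yields NaN

print(nan == nan)       # comparison with NaN is never equal
print(nan != nan)       # the self-inequality test for NaN
print(math.isnan(nan))  # the explicit test
```

The second and third lines are the two ways of testing for NaN that the comment above is contrasting.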

>NaN breaks x==x which is a pretty fundamental relationship for numbers to have

NaN is not a number, so it should NOT satisfy "fundamental relationships for numbers to have".

>+-Inf sound useful in theory, but in practice they rarely give you a more useful result

There are algorithms that are more performant using infs, and without having a way to denote overflow, you'd have to pre-check every operation to do serious numerical work, which basically cuts your performance in half.

>Once you've gotten rid of -Inf, it becomes clear that -0.0 is a mistake

>It breaks the identity 0+x==x and 0-x==-x.

No, you have some fundamental misunderstanding. IEEE explicitly guarantees these hold, even for -0.
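For reference, here is what Python (IEEE doubles, default rounding) actually does with these cases. The sign helper is my own; == treats the two zeros as equal, so I use math.copysign to look at the sign bit separately.

```python
import math

def sign(x: float) -> float:
    """Return +1.0 or -1.0 from the sign bit, even for zeros (hypothetical helper)."""
    return math.copysign(1.0, x)

neg_zero = -0.0

print((0.0 + neg_zero) == neg_zero)   # the identity holds under ==
print((0.0 - neg_zero) == -neg_zero)  # likewise
print(sign(0.0 + neg_zero))           # but the sum is +0.0, not -0.0
print(sign(math.sqrt(neg_zero)))      # sqrt(-0.0) keeps the minus sign
```

So the identities hold as far as == is concerned, while the sign bit of 0.0 + (-0.0) does differ from that of -0.0.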

> Furthermore, IEEE specifies sqrt(-0.0)==-0.0 and log(-0.0)==-Inf which are both nonsensical if you consider -0.0 as a limit from the negatives.

You're making up strawmen. -0 is not a "limit from the negatives" any more than +0 is a limit from the positives, which would break other made-up requirements. That is why making up stuff that has zero bearing on what IEEE 754 specifies is arguing strawmen.

>Floats also have the unfortunate property that inv(x) can be infinite for finite x.

Integers have the same kind of problem: -(X) is not always the negative of X (negating the most negative two's-complement value overflows). So this is not a problem except in made-up goofiness.
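Both halves of this exchange are easy to reproduce. Python ints don't overflow, so the sketch below uses ctypes.c_int32 to mimic 32-bit two's-complement wraparound; neg_int32 is a made-up helper name.

```python
import ctypes

INT32_MIN = -2 ** 31

def neg_int32(x: int) -> int:
    """Negate x with 32-bit two's-complement wraparound (hypothetical helper)."""
    return ctypes.c_int32(-x).value

print(neg_int32(5))          # the usual case
print(neg_int32(INT32_MIN))  # wraps back to INT32_MIN

# The float case from the quoted comment: the smallest subnormal
# double is finite and nonzero, but its reciprocal overflows.
tiny = 5e-324
print(1.0 / tiny)
```

In both cases an operation that "should" have an answer in the type has none, because the representable range is not symmetric under that operation.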

Every objection you post reflects a lack of understanding of numerical analysis and the needs of actual scientific software.

So you're skeptical. Do you write numerical software professionally? I do, and have, and will do it in the future. There are very, very good reasons for all of those pieces you don't see the need for.

There's a reason unums have not caught on with the field of numerical software or numerical analysis - they simply don't allow writing robust, performant software, they solve no real issues, and add significant problems.

> lack of hardware support

That seems fixable. Don't people make chips that do what you want a lot of? A chip with an array of 8-bit posit PUs could process a hell of a lot in parallel, subject only to getting the arguments and results to useful places.