by milankl 2538 days ago

I want to add a few comments, as most of the discussion here concerned the hardware implementation and only a few comments pointed to possible applications. I work on weather and climate simulations, but my opinions should apply in general to CFD or PDE-type problems.

Yes, having redundant bit patterns is not great when designing a number format. However, even for Float16 (half precision), making use of the 3% of bit patterns that are NaNs is wise but not going to be a game changer. Some others discussed the pros and cons of negative zero and negative infinity. In my view you want to have a bit pattern that tells you that the answer you get is not real, but whether it's +/- Inf or some NaN is pretty much irrelevant. Using these bit patterns for something else sounds like a very reasonable approach to me. Furthermore, I've never come across a good reason for -0 in our applications.
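For reference, the 3% figure can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the standard IEEE half-precision layout (1 sign bit, 5 exponent bits, 10 fraction bits); the variable names are made up for illustration.

```python
import math
import struct

# Float16 layout assumed here: 1 sign bit, 5 exponent bits, 10 fraction bits.
EXP_BITS, FRAC_BITS = 5, 10
total_patterns = 2 ** (1 + EXP_BITS + FRAC_BITS)

# A NaN has all exponent bits set and a nonzero fraction, with either sign.
nan_patterns = 2 * (2 ** FRAC_BITS - 1)
nan_share = nan_patterns / total_patterns

# Cross-check by decoding every 16-bit pattern as a half-precision float.
counted = sum(
    math.isnan(struct.unpack("<e", struct.pack("<H", bits))[0])
    for bits in range(total_patterns)
)

print(f"{nan_patterns} of {total_patterns} patterns are NaN ({nan_share:.1%})")
```

So roughly one bit pattern in 32 is a NaN, which is why reusing them helps a little but does not change the picture.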

When it comes to weather and climate models in HPC, I see the following potential for posits: similar to how BFloat16 is supported on TPUs, I could see Posit16 being supported by some specialised hardware like GPUs, FPGAs etc. I'm saying that because for us it's not important to have a whole operating system running in posits (although I probably wouldn't mind) but to have them for some performance-critical algorithms. Unfortunately, weather and climate models are far more complex than some dot products, and we usually have to deal with a whole zoo of algorithms, so these models easily run to several million lines of code.

Now let's say we know our model spends 20% of the time in algorithm A, which only requires a certain (low) precision to be stable and to yield reasonable results. Then it would indeed be a big game changer if we could run this algorithm in, say, 16bit. In exchanging precision for speed we would probably want to push things to the edge, i.e. if we can just about do it in 16bit, then we should.

Now there are several 16bit formats: Float16, BFloat16, Posit16, Posit16_2 (with 2 exp bits), and technically also Int16. Let's forget about the technical details of these formats and focus on where they actually differ considerably: what is the dynamic range, and where on the real axis do I get how much precision to represent numbers? Yes, from a computer science perspective the technical details also matter, but from our perspective most of them are pretty irrelevant, and what actually matters are these two things: dynamic range and where the precision is. Because these two really determine whether your algorithm is gonna crash or whether you can use it operationally on your desktop computer or on a big fat $$$ supercomputer.

For PDE-type problems (that includes CFD and also weather and climate models), over the last year of my research I came to the following preliminary conclusions regarding dynamic range and precision with respect to the above-mentioned formats:

Int16: Let's forget about it.

Float16: The precision is okay, but rarely needed towards the edges of the dynamic range. Floatmin might work; however, floatmax at 65504.0 is easily a killer. Might work with a no-overflow rounding mode and smart rewriting of algorithms to avoid large numbers.

BFloat16: For our applications having only 7 significant bits is not enough. I didn't come across a single sophisticated algorithm that works with BFloat16.

Posit16 (with 1 exp bit): Great, puts a lot of precision where it's needed but also allows for a reasonable dynamic range.

Posit16 (with 2 exp bits): Probably even better. Sacrificing a bit of precision in the middle is fine, and the wide dynamic range gives it the potential to also work with algorithms that are hard to squeeze into a smaller dynamic range.
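To put rough numbers on dynamic range and precision, here is a small Python sketch. It assumes the usual definitions: IEEE-style layouts for Float16 (5 exponent, 10 fraction bits) and BFloat16 (8 exponent, 7 fraction bits) with subnormals counted, and for posits maxpos = useed^(n-2) with useed = 2^(2^es). The helper names `ieee_like` and `posit` are made up for this illustration, and the posit fraction width is only the one around 1.0, where the regime takes 2 bits.

```python
import math

def ieee_like(exp_bits, frac_bits):
    """Largest finite value, smallest positive (subnormal) value, fraction bits."""
    bias = 2 ** (exp_bits - 1) - 1
    largest = (2 - 2.0 ** -frac_bits) * 2.0 ** bias
    smallest = 2.0 ** (1 - bias - frac_bits)
    return largest, smallest, frac_bits

def posit(nbits, es):
    """maxpos, minpos, and fraction bits near 1.0 for a posit with es exponent bits."""
    log2_maxpos = (2 ** es) * (nbits - 2)      # maxpos = useed^(nbits - 2)
    frac_bits_near_one = nbits - 1 - 2 - es    # sign + 2 regime bits + exponent
    return 2.0 ** log2_maxpos, 2.0 ** -log2_maxpos, frac_bits_near_one

FORMATS = {
    "Float16": ieee_like(5, 10),
    "BFloat16": ieee_like(8, 7),
    "Posit16 (1 exp bit)": posit(16, 1),
    "Posit16 (2 exp bits)": posit(16, 2),
}

for name, (largest, smallest, frac_bits) in FORMATS.items():
    decades = math.log10(largest / smallest)
    print(f"{name:21s} max={largest:.3e}  min={smallest:.3e}  "
          f"range={decades:5.1f} decades  fraction bits near 1.0: {frac_bits}")
```

Note that Float16 and BFloat16 keep the same fraction width across their normal range, whereas the posit fraction width shrinks towards the extremes, so the posit column only describes the region around 1.0.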

In short, posits actually fit the numbers our algorithms produce much better. And this can indeed be the game changer: if a GPU supports posit arithmetic and we can run algorithm A on it in 16bit, wonderful, contract sold! But if we can't do it with BFloat16 or Float16, then there is no future for 16bit in our field.

I explain more about this in this paper: dx.doi.org/10.1145/3316279.3316281

And there are two talks which tell a similar story: https://www.youtube.com/watch?v=XazIx0cMVyg https://www.youtube.com/watch?v=wp7AYMWlPLw

Or simply drop me an email if you have questions (I'm unlikely to respond here). You can find my address on my website: milank.de