by adgjlsfhk1 499 days ago
It's really hard for me to think that this is a good solution. Normal users should be using Float64 (where there is no similar solution), and Float32 should only be used when Float64 computation is too expensive (e.g. GPUs). In such cases, it's hard to believe that doing the math in Float64 and converting will make anyone happy.
3 comments

> Float32 should only be used when Float64 computation is too expensive

Or when you're bottlenecked on memory and want to store each number in four bytes instead of eight.
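To put rough numbers on that, here is a minimal Python sketch using the standard-library `array` module. The element count `n` is an arbitrary choice for illustration, and the 4-byte and 8-byte sizes assume a common platform where C `float` and `double` are IEEE single and double.

```python
from array import array

n = 1_000_000  # arbitrary element count, chosen only for illustration

# "f" is a C float and "d" is a C double; on common platforms these
# occupy 4 and 8 bytes per element respectively.
singles = array("f", [0.0]) * n
doubles = array("d", [0.0]) * n

single_bytes = singles.itemsize * len(singles)
double_bytes = doubles.itemsize * len(doubles)

print(f"single: {single_bytes} bytes, double: {double_bytes} bytes")
```

Halving the bytes per element also halves the memory traffic for a given array, which is what matters when the bottleneck is bandwidth rather than arithmetic.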

Single-precision is too expensive for GPUs, unfortunately.
It may be too expensive for ML (or rather, not Pareto-optimal for ML), but people use GPUs for a lot of things.
I think you mean double precision?
No.
Can you elaborate? Outside of AI workloads, almost all computations on GPUs are single precision. Double precision is pretty rare, and lower precision is mostly useless outside AI (and obviously irrelevant for trig precision).
Unfortunately almost all computations on GPUs are AI workloads.
Uh, what?

The main usage of GPUs is graphics processing, and it's not even close. That is what Graphics Processing Units are built for. AI is probably the main use of datacenter GPUs today, but even that isn't "almost all" in comparison to the HPC work out there.

Off topic but tangentially related, here’s a fun fact: DEC Alpha transparently converts IEEE single-precision floats (S-float) to double-precision floats (T-float, i.e. the register format) when performing register loads and operations.
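A widening like this can be transparent because every IEEE single-precision value is exactly representable in double precision. Below is a minimal Python sketch of that property, not of anything Alpha-specific. The helper name `to_single` is hypothetical, and Python's own `float` stands in for the double.

```python
import struct

def to_single(x: float) -> float:
    """Round a double to the nearest IEEE single, then widen it back to a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# 0.1 is not exactly representable in single precision, so the first
# narrowing changes the value.
widened = to_single(0.1)

# The widened value is itself a valid single, so narrowing it again
# changes nothing: the single -> double step lost no information.
round_trip = to_single(widened)

print(widened != 0.1, round_trip == widened)
```

The lossy step is the narrowing from double to single, never the widening.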