| HN Mirror



	by oranlooney 2900 days ago
	That's great and all, but nobody needs a 32-bit anything in 2018. This undergraduate paper provides a magic number and associated error bound for 64-bit doubles: https://cs.uwaterloo.ca/~m32rober/rsqrt.pdf
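
For readers who have not seen the trick, here is a minimal Python sketch of the 64-bit variant. The magic constant is the one commonly attributed to the linked paper; treat it as an assumption and check the paper before relying on it. The name `fast_rsqrt64` is made up for this example, and the input is assumed to be a positive, finite double.

```python
import struct

# Assumed magic constant for 64-bit doubles (verify against the linked paper).
MAGIC_64 = 0x5FE6EB50C7B537A9

def fast_rsqrt64(x: float) -> float:
    """Approximate 1/sqrt(x) for a positive, finite double."""
    # Reinterpret the double's bit pattern as an unsigned 64-bit integer.
    (i,) = struct.unpack("<Q", struct.pack("<d", x))
    # Shift-and-subtract on the raw bits gives a rough first guess.
    i = MAGIC_64 - (i >> 1)
    (y,) = struct.unpack("<d", struct.pack("<Q", i))
    # One Newton-Raphson step refines the guess.
    return y * (1.5 - 0.5 * x * y * y)

print(fast_rsqrt64(4.0))  # close to 0.5, but not exact
```

In Python this is only an illustration of the bit manipulation; the speed argument applies to compiled code, and even there a hardware reciprocal square root instruction may be the better choice.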

13 comments

dagenix 2900 days ago

That's not really accurate. Even in cases where 32-bit and 64-bit operations are equally fast on the CPU, 32-bit values still take up half the memory. For many workloads, the limiting factor is cache space. So, if you can use 32-bit values, you can get much better performance for those workloads.
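
A rough sketch of the cache argument, using only the standard library. The element count and the 6 MiB cache size are hypothetical numbers picked for illustration, not the specification of any real CPU.

```python
from array import array

N = 1_000_000                  # hypothetical number of values in the working set
CACHE_BYTES = 6 * 1024 * 1024  # assumed cache size (6 MiB), for illustration only

def footprint(typecode: str, count: int) -> int:
    """Bytes needed to store `count` values of the given array typecode."""
    return array(typecode).itemsize * count

bytes32 = footprint("f", N)  # C float: 4 bytes on common platforms
bytes64 = footprint("d", N)  # C double: 8 bytes on common platforms

print(bytes32, "bytes, fits in cache:", bytes32 <= CACHE_BYTES)
print(bytes64, "bytes, fits in cache:", bytes64 <= CACHE_BYTES)
```

With these assumed numbers the 32-bit working set fits in the cache and the 64-bit one does not, which is the situation the comment describes.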

stochastic_monk 2900 days ago

And if you’re doing heavy floating point work, you can fit twice as many operations in with a 32-bit float vector as with an equally sized double vector, and the vectorized operations happen roughly as fast for both forms, yielding an approximate doubling of speed.

dnautics 2900 days ago

for rank-2 tensor work you can do 4x as many operations, for rank-3 tensor work, it's 8x, assuming memory bandwidth is the bottleneck.

stochastic_monk 2900 days ago

Does that mean it’s 64x as fast for 16-bit floating point vs 64-bit for a rank 3 tensor?

dnautics 2900 days ago

assuming 1) memory bandwidth is the bottleneck and 2) you can keep the tensor values in cache or registers.

I think that GPUs are still vector processing engines, so they should scale with 4x... But assuming google architected the TPU correctly, it should be 16x as fast (I think the architecture is actually that of a rank-2 tensor).

egocentric 2900 days ago

This "nobody needs a 32-bit anything in 2018" seems like a weird opposite of "640K should be enough for anyone".

https://www.wired.com/1997/01/did-gates-really-say-640k-is-e...

wyldfire 2900 days ago

Lots of ML and AI applications are using ever-smaller precisions. Half and even quarter-precision floats are able to maximize efficiency of the various CPU/GPU ALUs.

cbsmith 2900 days ago

I was going to mention that... Just because we have ridiculous transistor budgets doesn't mean there aren't problems where you need/want to push the envelope for performance instead of precision. If anything, it grows the applicable problem space.

vardump 2900 days ago

> That's great and all, but nobody needs a 32-bit anything in 2018.

Then why do x86-64 integer instructions default to 32-bit register size when the REX prefix byte is not present?

You can double x86 FP throughput using 32-bit floats versus 64 bit ones.

For GPUs, the 32-bit float performance advantage can be more than 4-10x (sometimes a lot more).

dleslie 2900 days ago

TIL, no one in the gaming industry uses 32 bit floats any longer. /s

21 2899 days ago

Funny, in 2018 a lot of people are asking for 16-bit floats.

https://en.wikipedia.org/wiki/Half-precision_floating-point_...

bananaboy 2900 days ago

This is not true. In games 32-bit floats are extremely common.

nightcracker 2900 days ago

Nobody needs absolutes in 2018.

dnautics 2900 days ago

Even scientific calculation would be fine with 32-bit floats, but average floating point error due to representation creeps with O(N) (iirc) over N multiplications, so you have to use 64-bit for many scientific applications to get satisfactory results after a million or a trillion multiplications.
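
A small experiment along these lines, as a sketch rather than a benchmark. It multiplies the factors (1 + 1/k) for k = 1..N, whose exact product is N + 1, once with every intermediate rounded to 32 bits and once in plain 64-bit doubles. The helper names are made up, the 32-bit arithmetic is simulated by rounding through `struct`, and N is kept at 100,000 (far below the million or trillion mentioned above) so it runs quickly.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def product_error(n: int, single: bool) -> float:
    """Relative error of prod_{k=1..n} (1 + 1/k); the exact value is n + 1."""
    p = 1.0
    for k in range(1, n + 1):
        f = 1.0 + 1.0 / k
        if single:
            f = to_f32(f)      # factor stored as a 32-bit float
            p = to_f32(p * f)  # product rounded back to 32 bits
        else:
            p = p * f
    return abs(p - (n + 1)) / (n + 1)

N = 100_000
err32 = product_error(N, single=True)
err64 = product_error(N, single=False)
print(f"float32 relative error: {err32:.3e}")
print(f"float64 relative error: {err64:.3e}")
```

How fast the error grows depends on the computation; this only shows that the same chain of multiplications leaves far less headroom in 32 bits than in 64.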

llukas 2900 days ago

Not really - https://en.wikipedia.org/wiki/Numerical_stability

If your algorithm is not stable then even 64-bit won't help you.

Compare Euler vs Verlet - https://en.wikipedia.org/wiki/Verlet_integration
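
To make the Euler vs Verlet comparison concrete, here is a minimal sketch on a unit harmonic oscillator (acceleration = -x), entirely in 64-bit doubles. The step size and step count are arbitrary choices for illustration. The exact energy stays at 0.5 forever.

```python
def energy(x: float, v: float) -> float:
    """Total energy of a unit-mass, unit-stiffness oscillator."""
    return 0.5 * (x * x + v * v)

def explicit_euler(steps: int, dt: float) -> float:
    x, v = 1.0, 0.0
    for _ in range(steps):
        # Both updates use the old state; energy grows by (1 + dt^2) per step.
        x, v = x + dt * v, v - dt * x
    return energy(x, v)

def velocity_verlet(steps: int, dt: float) -> float:
    x, v = 1.0, 0.0
    a = -x
    for _ in range(steps):
        x = x + dt * v + 0.5 * dt * dt * a
        a_new = -x
        v = v + 0.5 * dt * (a + a_new)
        a = a_new
    return energy(x, v)

euler_energy = explicit_euler(10_000, 0.01)
verlet_energy = velocity_verlet(10_000, 0.01)
print("Euler energy: ", euler_energy)   # drifts well above 0.5
print("Verlet energy:", verlet_energy)  # stays close to 0.5
```

The Euler drift here comes from the method, not from rounding, so switching to wider floats would not remove it.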

dnautics 2900 days ago

You're making a different argument.

llukas 2900 days ago

Which problem that has a stable algorithm would require 64-bit then?

toolslive 2900 days ago

What they typically do in 3d gaming is update the matrix that holds the transformation by a left multiplication, every time the camera changes. So

T_n = U_{n-1} * U_{n-2} * ... * U_0 * T_0

After a while, your matrix accumulates errors, but it's easy to just start over and take a fresh one.
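
A toy version of that update loop, as a sketch under stated assumptions: 2x2 rotation matrices stand in for the camera transform, 32-bit storage is simulated by rounding through `struct`, and the per-frame angle and frame count are hypothetical. A perfect rotation has determinant 1, so the distance of the determinant from 1 is used as a crude measure of accumulated error.

```python
import math
import struct

def f32(x: float) -> float:
    """Round a double to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def rotation(theta: float):
    """2x2 rotation matrix with entries stored as 32-bit floats."""
    c, s = f32(math.cos(theta)), f32(math.sin(theta))
    return ((c, -s), (s, c))

def matmul(a, b):
    """2x2 product with each result entry rounded to 32 bits."""
    return tuple(
        tuple(f32(a[i][0] * b[0][j] + a[i][1] * b[1][j]) for j in range(2))
        for i in range(2)
    )

def det_error(m) -> float:
    """How far the determinant is from 1."""
    return abs(m[0][0] * m[1][1] - m[0][1] * m[1][0] - 1.0)

STEP = 0.001     # hypothetical per-frame camera rotation, in radians
FRAMES = 50_000  # hypothetical number of frames

accumulated = rotation(0.0)
for _ in range(FRAMES):
    accumulated = matmul(rotation(STEP), accumulated)  # T_n = U_{n-1} * T_{n-1}

fresh = rotation(STEP * FRAMES)  # rebuilt directly from the total angle

print("accumulated determinant error:", det_error(accumulated))
print("fresh determinant error:      ", det_error(fresh))
```

Rebuilding the matrix from the camera parameters, as in `fresh`, is one way to "take a fresh one"; re-orthonormalizing the accumulated matrix is another common option.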

whyever 2899 days ago

> Even scientific calculation would be fine with 32 bit floats

It really depends on the algorithms in question and the error tolerances.

perfmode 2900 days ago

deep learning uses low precision floats

sometimes as few as 8 bits are needed

jadedhacker 2899 days ago

I think gen 1 or gen 2 of the TPU explicitly supported short ints.

oppositelock 2900 days ago

Realtime 3D still uses floats, but only when we can afford something so big; s10e5 is better where available.

minton 2900 days ago

I think this article is from 2010.

duckerude 2899 days ago

I wanted to put over a billion floats in a numpy array just a few months ago. Making them 16-bit saved a lot of memory.

It doesn't matter how much resource limits increase, people are going to keep hitting them. And when they hit them, using a smaller data type will always help.
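
A back-of-the-envelope sketch of that trade-off using the standard library's `struct` format codes (`e`, `f`, `d` for 16-, 32-, and 64-bit IEEE floats). The count of one billion is taken from the comment; the sample values are arbitrary.

```python
import struct

COUNT = 1_000_000_000  # "over a billion" values

for name, code in (("float16", "e"), ("float32", "f"), ("float64", "d")):
    size = struct.calcsize("<" + code)
    print(f"{name}: {size} bytes each, {COUNT * size / 1e9:.0f} GB total")

def to_f16(x: float) -> float:
    """Round a double to the nearest 16-bit float and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# The memory saving is paid for in precision: about three decimal digits.
print(to_f16(0.1))     # close to 0.1, but visibly off
print(to_f16(2049.0))  # not exactly representable in 16 bits
```

Whether that precision is enough depends on the data, which is the same point made elsewhere in the thread about workloads and error tolerances.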

jacquesm 2899 days ago

That's not relevant; there are plenty of single precision float applications today (and many fixed point applications as well). It all depends on your workload.

Bromskloss 2900 days ago

> nobody needs a 32-bit anything in 2018

Tell us more about this strange "2018" place!