by lumost 835 days ago
It did make me curious, however: if we dropped the requirement that operations return correct values in favor of probably correct values, would we see any material computing gains in hardware? Large neural models are intrinsically error-correcting and stochastic.

I’m unfortunately not familiar enough with hardware to weigh in.

1 comments

The trouble is that if you use actual randomness, you lose repeatability, which is an incredibly useful property of computers. Have fun debugging that!

What you want is low precision with stochastic rounding. Graphcore's IPUs have that and it's a really great feature. It lets you use really low precision number formats but effectively "dithers" the error. Same thing as dithering images or noise shaping audio.
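To make the dithering point concrete, here is a minimal sketch in plain Python. It is not how Graphcore's hardware implements it; the function names (`round_nearest`, `round_stochastic`, `accumulate`, `demo`) and the toy grid with a step of 1.0 are assumptions made up for illustration. It repeatedly adds a small increment to an accumulator that can only hold multiples of `step`, once with round-to-nearest and once with stochastic rounding.

```python
import math
import random


def round_nearest(x, step):
    """Round x to the nearest multiple of step (ties handled by Python's round)."""
    return round(x / step) * step


def round_stochastic(x, step, rng):
    """Round x down or up to a multiple of step.

    The probability of rounding up equals the fractional distance to the
    lower grid point, so the result is correct in expectation.
    """
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower  # in [0, 1)
    return (lower + (1 if rng.random() < frac else 0)) * step


def accumulate(increment, n, step, rounder):
    """Add `increment` n times, rounding the running total after every add."""
    total = 0.0
    for _ in range(n):
        total = rounder(total + increment, step)
    return total


def demo(seed=0):
    """Return (nearest_total, stochastic_total) for 1000 additions of 0.1."""
    rng = random.Random(seed)  # seeded generator, so the run is repeatable
    nearest = accumulate(0.1, 1000, 1.0, round_nearest)
    stochastic = accumulate(
        0.1, 1000, 1.0, lambda x, s: round_stochastic(x, s, rng)
    )
    return nearest, stochastic


if __name__ == "__main__":
    print(demo())
```

With round-to-nearest every 0.1 is rounded away and the total stays at 0, while the stochastic version lands somewhere near the true sum of 100, with noise instead of a systematic bias. Seeding the generator also keeps the run repeatable, which addresses the debugging concern as long as the random source is a seeded pseudo-random one.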

Yeah, debugging would be a pain, but in the context of inference/training it's unnecessary. There is some set of ops that requires high precision: if I L2-normalize a tensor, I really need it to be normalized. But matmul/addition? Maybe there is wiggle room.

The big challenge would be whether any gains could compete with NVIDIA's economies of scale.