Hacker News new | ask | show | jobs
by convolvatron 614 days ago
yes. and the next question is 'ok, how do we add'
2 comments

Yes. I haven't yet read this paper to see what exactly it says is new, but I've definitely seen log-based representations under development before now. (More log-based than the regular floating-point exponent, that is. I don't actually know the argument behind the exponent-and-mantissa form that's been pretty much universal even before IEEE754, other than that it mimics decimal scientific notation.)
I guess that if the bulk of the computation goes into the multiplications, you can work in the log-space and simply sum, and when the time comes to actually do a sum on the original space you can go back and sum.
Not sure how well that would work if you're often adding bias after every layer