Hacker News
by cubefox 111 days ago
Unfortunately, the paper seems to have been mostly overlooked. It has only a few citations. I think one practical issue is that existing training hardware is optimized for floating-point operations.