by fooker 112 days ago
Nice!
1 comments

Unfortunately the paper seems to have been mostly overlooked. It has only a few citations. I think one practical issue is that existing training hardware is optimized for floating-point operations.