Hacker News
by lukecameron 1208 days ago
> 100× energy-efficiency advantage for running some of the largest current Transformer models, and that if both the models and the optical hardware are scaled to the quadrillion-parameter regime, optical computers could have a >8,000×

Maybe I interpreted that incorrectly, but I thought it was saying a 100x advantage for current large Transformer models and a >8,000x advantage for future quadrillion-parameter models? I didn't include those because I suppose models of that size are quite a few years away. Admittedly this is only based on the abstract...