by lukecameron
1208 days ago
> 100× energy-efficiency advantage for running some of the largest current Transformer models, and that if both the models and the optical hardware are scaled to the quadrillion-parameter regime, optical computers could have a >8,000×

Maybe I interpreted that incorrectly, but I thought it was saying a 100x advantage for current large Transformer models, and an 8000x advantage for future quadrillion-parameter models? I didn't include those because I suppose that size of model is quite a few years away. Admittedly this is only based on the abstract...