by nextos
366 days ago
I think he did understand both the significance of his work and the importance of hardware. His group pioneered porting models to GPUs. But personal circumstances matter a lot. He was stuck at IDSIA in Lugano, i.e. a relatively small and not-so-well-funded academic institute. He could have done much better in industry, with access to lots of funding, a bigger headcount, and serious infrastructure. Ultimately, models matter much less than infrastructure. Transformers are not that important; other architectures such as deep SSMs or xLSTM are able to achieve comparable results.