Hacker News
by swyx 8 days ago
> If you can see that these models empirically get better with scale, why would you swap the main architecture? Those events will be pretty rare
cf. the hardware lottery:
https://arxiv.org/abs/2009.06489