Hacker News
by p1esk 1930 days ago
They did all that before: https://arxiv.org/abs/2101.06840, but they could only fit a model with 13B weights on a single V100.