Hacker News
by
amrb
662 days ago
It's a red flag that the 1.2B-parameter model has to fit in GPU memory; happy to be proven wrong when the code drops.
1 comment
arilotter
662 days ago
That's not something that DisTrO solves, but there's plenty of research in that area! See https://arxiv.org/abs/2301.11913, https://arxiv.org/abs/2206.01288, https://arxiv.org/abs/2304.11277, etc. :)