Hacker News
by amrb 662 days ago
It's a red flag that the 1.2B-parameter model has to fit in GPU memory. Happy to be proven wrong when the code drops.
1 comment

That's not something that DisTrO solves, but there's plenty of research in that area! See https://arxiv.org/abs/2301.11913, https://arxiv.org/abs/2206.01288, https://arxiv.org/abs/2304.11277, etc. :)