Hacker News
by nwoli 751 days ago
Scaling linear algebra is probably all we'll need in the end. The only things missing are the data and compute to get there.
1 comment

Memory capacity is a much bigger problem. Mixtral 8x22B is a 200GB+ model.
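
For a rough sense of where a number like that comes from, here is a back-of-the-envelope sketch. It assumes roughly 141B total parameters for Mixtral 8x22B (my assumption, not stated above) and counts only the weights, ignoring KV cache and activations. The helper name `weight_memory_gb` is made up for illustration.

```python
# Rough estimate of memory needed just to hold model weights.
# ASSUMED_TOTAL_PARAMS is an assumption for illustration (about 141B
# total parameters for Mixtral 8x22B); KV cache and activations are ignored.

ASSUMED_TOTAL_PARAMS = 141e9


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # Bytes per parameter for a few common storage precisions.
    precisions = [("fp32", 4.0), ("fp16/bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]
    for name, bytes_per_param in precisions:
        gb = weight_memory_gb(ASSUMED_TOTAL_PARAMS, bytes_per_param)
        print(f"{name:>9}: {gb:.1f} GB")
```

Under that assumption, 16-bit weights alone land well past 200GB, and even 8-bit storage is more than a single typical accelerator holds, which is why capacity rather than raw compute ends up being the constraint.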