by aitchnyu 42 days ago
Tangential. I'm a newb, can you name the concept of partitioning weights so we don't need to load the whole thing?

Do you mean model sharding?
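
If so, here is a rough sketch of the idea, not how any particular framework does it. It assumes the weights are a plain dict of name to list of floats, and the names `save_sharded`, `load_one`, `index.json`, and the `shard-XXXXX.pkl` file pattern are all made up for illustration. The point is just that an index maps each weight to a shard file, so a reader can open one shard instead of the whole model.

```python
import json
import os
import pickle
import tempfile


def save_sharded(weights, out_dir, max_per_shard):
    """Split `weights` (name -> list of floats) across shard files and write an index.

    Hypothetical layout: shard-00000.pkl, shard-00001.pkl, ... plus index.json.
    """
    index = {}
    names = sorted(weights)
    for shard_no, start in enumerate(range(0, len(names), max_per_shard)):
        shard_file = f"shard-{shard_no:05d}.pkl"
        chunk = {n: weights[n] for n in names[start:start + max_per_shard]}
        with open(os.path.join(out_dir, shard_file), "wb") as f:
            pickle.dump(chunk, f)
        # Record which shard file holds each weight.
        for n in chunk:
            index[n] = shard_file
    with open(os.path.join(out_dir, "index.json"), "w") as f:
        json.dump(index, f)
    return index


def load_one(out_dir, name):
    """Load a single weight by opening only the shard that contains it."""
    with open(os.path.join(out_dir, "index.json")) as f:
        index = json.load(f)
    with open(os.path.join(out_dir, index[name]), "rb") as f:
        return pickle.load(f)[name]


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    toy = {"layer0.w": [0.1, 0.2], "layer1.w": [0.3], "layer2.w": [0.4]}
    save_sharded(toy, out_dir, max_per_shard=2)
    print(load_one(out_dir, "layer2.w"))
```

Real checkpoints tend to shard by size in bytes rather than by count, and splitting a model across several devices at run time is a related but separate thing, so treat this only as the on-disk version of the concept.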