by filup 6 days ago
That sounds like the way. Everyone trains on their own small problem until the weights are maximally compressed, and then they merge.