by rao-v 108 days ago
Agree re: progressive growing

In terms of substructure: in the old days of Core War, randomly scattering bits of code that did things could pay off. I’m imagining something similar for LLMs: set 10% of the weights to specific known structures, then watch which ones the model retains and uses and which get treated like random init.
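
Rough sketch of what I mean, in plain Python. Everything here is made up for illustration: `plant_structures`, `mean_drift` and `retention_report` are hypothetical helpers, the "known structure" is simplified to a single constant value, and "retained" is approximated as "moved less during training than the randomly initialized weights". Whether drift is actually the right signal is an open question.

```python
import random

def plant_structures(weights, fraction=0.10, value=1.0, seed=0):
    """Overwrite a random `fraction` of a flat weight list with a known value.

    Assumption: a real "structure" would be a pattern, not one constant.
    Returns the sorted indices that were planted.
    """
    rng = random.Random(seed)
    count = int(len(weights) * fraction)
    planted = sorted(rng.sample(range(len(weights)), count))
    for i in planted:
        weights[i] = value
    return planted

def mean_drift(before, after, indices):
    """Average absolute change of the weights at `indices`."""
    return sum(abs(after[i] - before[i]) for i in indices) / len(indices)

def retention_report(before, after, planted):
    """Compare how far planted weights moved versus all the others."""
    planted_set = set(planted)
    others = [i for i in range(len(before)) if i not in planted_set]
    return {
        "planted_drift": mean_drift(before, after, planted),
        "random_drift": mean_drift(before, after, others),
    }
```

You would snapshot the weights after planting, train, then call `retention_report(snapshot, trained, planted)`. Planted weights that drift much less than the rest might be ones the model kept; ones that drift about the same are probably being treated like random init.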