| HN Mirror



	by MinusGix 944 days ago
	It certainly could, and I wouldn't be surprised if the authors want to try it out on those. One issue is that past improvements often don't enhance more powerful models nearly as much. I'd expect this to possibly not work as well: the bigger models may end up with more polysemantic neurons because they're given more "incentive" (training time, neuron count, the size of the dataset they're encouraged to be able to reconstruct) to extract as much as possible. This intermingling might make the method perform worse. (See the transformer circuits website for that.) Though I expect there are ways to recover a good chunk of the lost throughput/accuracy, maybe by doing extra steps to directly steer the training towards breaking apart polysemantic neurons.