


	by six_four_eight 592 days ago
	I wonder how this compares to the 'catastrophic forgetting' that can be a problem with full fine-tuning. Or at least that's what I've just been reading as a case _for_ using LoRA, as it's supposedly not susceptible to that. I guess this paper shows LoRA causes forgetting in a different way. Are there good general principles yet for which fine-tuning method to use in which situations? It still seems quite difficult to know ahead of time what's going to happen.
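For readers comparing the two methods mentioned above, here is a minimal sketch of what a LoRA update does to a single weight matrix, in plain Python with toy sizes. It assumes the usual formulation: the pretrained matrix W stays frozen and the trainable part is a low-rank product B A scaled by alpha / r. The helper names (`matmul`, `lora_effective_weight`, `count_params`) and the dimensions are hypothetical and chosen only for illustration; this is not the code from the paper under discussion.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * (B @ A). W itself is not modified (it stays frozen)."""
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> d_out x d_in, rank at most r
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def count_params(d_out, d_in, r):
    """Trainable parameter counts: full fine-tuning vs. a rank-r LoRA adapter."""
    return {"full": d_out * d_in, "lora": r * (d_in + d_out)}


# Toy sizes, assumed for illustration only.
random.seed(0)
d_out, d_in, r, alpha = 4, 6, 2, 4

w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen weights
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # r x d_in, small random init
b = [[0.0] * r for _ in range(d_out)]                                  # d_out x r, zero init

# With B initialised to zero, the adapted layer starts out identical to the base layer.
w_eff = lora_effective_weight(w, a, b, alpha, r)
print(w_eff == w)
print(count_params(4096, 4096, 8))
```

Full fine-tuning can move every entry of W independently, while the adapter can only add an update of rank at most r. That restriction is the usual intuition for why LoRA is expected to drift less from the base model, though as the comment notes it does not rule out forgetting.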

1 comment

K0balt 590 days ago

Catastrophic forgetting or “psychosis” seems to happen when I overtrain. It’s easy to make it happen to models that have already been extensively tuned, but the base models hold up much better. I’m pretty sure there is a point in the n-dimensional space where x discrete vectors with n dimensions stop encoding usefully distinct patterns.
