HN Mirror



	by eoerl 1346 days ago
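	It is not typically possible to blend models like that, since the training process is insensitive to the (lateral) ordering of units within a layer: two independently trained models can learn the same features in different positions, so their weights do not line up.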
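
A minimal sketch of the ordering point, using a toy two-layer ReLU network in plain Python. All names here (`mlp`, `W1`, `perm`, and so on) are made up for illustration and the sizes are arbitrary. Permuting the hidden units leaves the function unchanged, but naively averaging the original weights with the permuted copy generally gives a different function:

```python
import math
import random

random.seed(0)

def mlp(x, W1, b1, W2, b2):
    """Toy two-layer ReLU net. W1 has one row per hidden unit, W2 one row per output."""
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, b2)]

def average(a, b):
    """Elementwise mean of two equally shaped (possibly nested) lists."""
    if isinstance(a, list):
        return [average(ai, bi) for ai, bi in zip(a, b)]
    return (a + b) / 2

d_in, d_hidden, d_out = 4, 8, 3  # arbitrary illustrative sizes
W1 = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_hidden)]
b1 = [random.gauss(0, 1) for _ in range(d_hidden)]
W2 = [[random.gauss(0, 1) for _ in range(d_hidden)] for _ in range(d_out)]
b2 = [random.gauss(0, 1) for _ in range(d_out)]

# Reorder the hidden units: rows of W1, entries of b1, columns of W2.
perm = list(range(1, d_hidden)) + [0]
W1p = [W1[i] for i in perm]
b1p = [b1[i] for i in perm]
W2p = [[row[i] for i in perm] for row in W2]

x = [random.gauss(0, 1) for _ in range(d_in)]
y = mlp(x, W1, b1, W2, b2)
y_perm = mlp(x, W1p, b1p, W2p, b2)
y_avg = mlp(x, average(W1, W1p), average(b1, b1p), average(W2, W2p), b2)

# Same function up to floating-point rounding from the reordered sums.
same_function = all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12) for a, b in zip(y, y_perm))
# Averaging two equivalent but differently ordered weight sets changes the output.
avg_differs = any(abs(a - b) > 1e-6 for a, b in zip(y, y_avg))
```

Here the two weight sets describe exactly the same function, so any change after averaging comes purely from the mismatch in unit ordering.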

2 comments

liuliu 1346 days ago

I thought so too, until I found that there is quite a bit of literature nowadays about "merging" weights, for example this one: https://arxiv.org/pdf/1811.10515.pdf, and also the OpenCLIP paper.
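
For reference, the simplest form of weight "merging" is an elementwise average of the parameters of models that share an architecture. The sketch below is only an illustration under that assumption and is not necessarily the method used in the linked paper. The helper `average_weights` and the parameter names are hypothetical, and parameters are stored as flat lists of floats for simplicity:

```python
def average_weights(state_dicts, coeffs=None):
    """Weighted elementwise average of parameter dicts with identical keys and shapes."""
    if not state_dicts:
        raise ValueError("need at least one model")
    n = len(state_dicts)
    if coeffs is None:
        coeffs = [1.0 / n] * n  # uniform average by default
    keys = state_dicts[0].keys()
    for sd in state_dicts:
        if sd.keys() != keys:
            raise ValueError("models must have the same parameter names")
    merged = {}
    for k in keys:
        size = len(state_dicts[0][k])
        merged[k] = [
            sum(c * sd[k][i] for c, sd in zip(coeffs, state_dicts))
            for i in range(size)
        ]
    return merged

# Two toy "models" with the same parameter layout (hypothetical names).
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
merged = average_weights([model_a, model_b])
```

Whether such an average yields a usable model depends on how closely the two sets of weights correspond to each other.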


ShamelessC 1346 days ago

Is that still the case when all models have a common ancestor (i.e. finetuned) and haven’t yet overfit on new data?
