| HN Mirror

	by thevinter 7 days ago
	And what happens once the "solid baselines" become unavailable for one reason or another?

2 comments

zozbot234 7 days ago

You keep building on the last available version? Fine-tuning is a whole lot cheaper, easier and more useful than pretraining a model from scratch. It's a complete no-brainer.

rapidfl 7 days ago

> You keep building on the last available version?

Yes, but a sovereign can allocate some resources and a few people to stay in the loop at a first-principles level. No need to wait for a rug pull.

Of course, it cannot compete with the frontier labs. But good to have researchers and professors "in-house". LLMs are here for the long term.

GTP 6 days ago

> But good to have researchers and professors "in-house".

I'm not in this field, but I think we already have them. Probably the main difference is that we have most or all of them in academia and next to none inside private companies. But we do have them, and they could start working for private companies if the market moves in that direction in the EU as well.

michaelscott 7 days ago

Unfortunately, in this game working from first principles requires massive resources, not "some". Building in-house on top of existing open weights is a good way to bootstrap this process, especially since there's nothing inherently magical or particularly expertise-heavy about the weights themselves.

ozim 7 days ago

Seems like you don’t understand.

You take the current version and build on top of it. You have the weights.

You might not get some n+1 version at some point, but the n version you have will most likely still be much better than whatever you come up with by burning the goodwill money of people who believe in „sovereignty”.

You are not getting ahead in this game by being „true to your local values”; capital expenditure is insane in this game.

oneshtein 6 days ago

It seems like you don't understand. For fine-tuning, it's cheaper to start from an existing model. For massive changes, it's better to retrain from scratch. Otherwise, the model will UNLEARN a lot first, and then you will train about twice as long to reach the same result.

https://en.wikipedia.org/wiki/Catastrophic_interference

ozim 4 days ago

OK, this „you don’t understand” was uncalled for; I am sorry.

I am speaking from a very practical point of view. English, and whatever else frontier models are trained on, is currently the lingua franca of software/tech/science. You don’t want to make massive changes because, exactly as you wrote, the model will unlearn a lot.

Current models translate easily between languages.

So from my point of view, even if we as a smaller country have an N-2 model that we slightly fine-tune, or just give a harness with national RAG, it will be better than wasting money on training a model from scratch only on „texts in the country's own language”, because that is a losing proposition. Its usefulness is going to be really limited compared to a model trained on the body of knowledge in English.

It is a lot like a company CEO thinking they have to „train an AI model” for their company on their company materials. Well, no: you just build RAG and eventually fine-tune some models, give access to data, give access to MCP.

Because even if you are an F1000 company you don’t have the resources to train your own generic model, and there is no nation that has the resources to train a „national model” on par with frontier labs.
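
To make the "harness with national RAG" idea concrete, here is a minimal sketch of the retrieval step, assuming a toy in-memory corpus and plain bag-of-words cosine similarity in place of a real embedding model. The names `retrieve`, `build_prompt` and `DOCS`, and the sample documents, are hypothetical and only for illustration; the resulting prompt would be passed to whatever existing model you already have.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into alphanumeric tokens (assumption: ASCII-only toy corpus).
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    # Rank documents by similarity to the query and keep the top k with a non-zero score.
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    # Put the retrieved passages in front of the question; the model itself is unchanged.
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical local documents standing in for a national or company corpus.
DOCS = [
    "The national tax filing deadline is 30 April.",
    "Public libraries are closed on national holidays.",
    "Vehicle registration must be renewed every two years.",
]

if __name__ == "__main__":
    print(build_prompt("When is the tax filing deadline?", DOCS, k=1))
```

The point of the sketch is that local knowledge lives in the document store, not in the weights, so nothing has to be pretrained or unlearned to add it.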
