Hacker News
by ozim 4 days ago
Seems like you don’t understand.

You take the current version and build on top of it. You have the weights.

You might not get some n+1 version at some point, but the n version you have will still most likely be much better than whatever you come up with by burning the goodwill money of people who believe in „sovereignty”.

You are not getting ahead in this game by being „true to your local values”; capital expenditure in this game is insane.

1 comments

It seems like you don't understand. For small changes, it's cheaper to fine-tune an existing model. For massive changes, it's better to retrain from scratch. Otherwise, the model will UNLEARN a lot first, and then you will train about twice as long to reach the same result.

https://en.wikipedia.org/wiki/Catastrophic_interference
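
A toy sketch of the unlearning effect described above, not a claim about how any real LLM behaves. Everything here is an assumption made for illustration: a single shared weight `w`, two deliberately conflicting tasks (`y = 2x` and `y = -3x`), and the chosen learning rate. Real networks usually forget only partially, but the mechanism is the same: training on new data overwrites weights the old task relied on.

```python
# Toy illustration of catastrophic interference.
# All names, tasks, and numbers are hypothetical and chosen for clarity.

def mse(w, data):
    """Mean squared error of the model y = w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def train(w, data, lr=0.05, epochs=200):
    """Plain per-sample SGD on squared error; returns the updated weight."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

xs = [0.5, 1.0, 1.5, 2.0]
task_a = [(x, 2.0 * x) for x in xs]   # "old" knowledge
task_b = [(x, -3.0 * x) for x in xs]  # "new" conflicting knowledge

w = train(0.0, task_a)          # learn task A first
loss_a_before = mse(w, task_a)  # close to zero

w = train(w, task_b)            # then train only on task B
loss_a_after = mse(w, task_a)   # task A performance is gone

print(f"loss on A before B: {loss_a_before:.6f}")
print(f"loss on A after B:  {loss_a_after:.6f}")
```

With one weight and fully conflicting targets the forgetting is total, which is the extreme case; the bigger the change you ask for, the closer you get to it.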

OK, that „you don’t understand” was uncalled for; I am sorry.

I am speaking from a very practical point of view. English, and whatever else frontier models are trained on, is currently the lingua franca of software, tech, and science. You don’t want to make massive changes because, exactly as you wrote, the model will unlearn a lot.

Current models translate easily between languages.

So from my point of view, even if we as a smaller country only have an N-2 model that we slightly fine-tune, or just give a harness with a national RAG, it will be better than wasting money on training a model from scratch only on „texts in the country’s own language”, because that is a losing proposition. Its usefulness is going to be really limited compared to a model trained on the body of knowledge in English.

It is a lot like a company CEO thinking they have to „train an AI model” for their company on their company materials. Well, no: you just build RAG and eventually fine-tune some models, give them access to data, give them access to MCP.

Because just as even an F1000 company doesn’t have the resources to train its own generic model, no nation has the resources to train a „national model” on par with frontier labs.
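
For anyone unsure what „just build RAG” means in practice, here is a minimal sketch under loud assumptions: the documents are invented, the word-overlap scoring stands in for a real embedding search, and the prompt layout is only one possible choice. The point is that the existing model stays untouched; local knowledge is retrieved and passed in as context.

```python
# Minimal RAG sketch: retrieve relevant local documents, then hand them
# to an existing model as context. The documents, the scoring rule, and
# the prompt layout are all hypothetical.

def tokenize(text):
    """Lowercased whitespace tokens as a set (a crude stand-in for embeddings)."""
    return set(text.lower().split())

def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query and return the top k."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents, k=2):
    """Assemble the text that would be sent to an off-the-shelf model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Invented example corpus, standing in for national or company materials.
documents = [
    "The national tax office requires VAT returns by the 25th of each month.",
    "Public libraries are open on Saturdays from 9 to 15.",
    "Company vacation policy grants 26 days of paid leave per year.",
]

prompt = build_prompt("When are VAT returns due?", documents, k=1)
print(prompt)
```

No training run is involved: swapping the corpus changes what the model can answer about, and the weights you were given stay as they are.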