by lolinder 607 days ago
People still chose to compile emacs from scratch rather than modify the binary. The source code was the preferred form for modifications.

The same is not true of these models. To my knowledge, no company has retrained a model from scratch to make a modification to it. They make new models, but these are fundamentally different works with different parameter counts and architectures. When they want to improve on a model that they already built, they fine-tune the weights.

If that's what the companies that own all the IP do, that tells me that the weights themselves are the preferred form for making modifications, which makes them source code under the GPL's definition.

1 comment

The problem with the hype around LLMs is that people without much experience in the field can't think of anything else.

So much so that they forget the basics of the discipline.

What do you think cross validation is for?

To compare the different weights obtained from different initializations, different topologies, different hyper-parameters... all trained on the same training dataset.
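
To make that concrete, here is a minimal sketch of cross validation, assuming a toy 1-D regression dataset and a k-nearest-neighbour model where the neighbour count stands in for "different hyper-parameters". The names make_dataset, knn_predict and cross_validate are made up for this illustration and the data is synthetic.

```python
import random


def make_dataset(n=60, seed=0):
    """Synthetic (x, y) pairs: y = 2x + 1 plus Gaussian noise (illustrative only)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 1.0)) for x in xs]


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def cross_validate(data, k_neighbors, folds=5):
    """Mean squared error averaged over `folds` held-out splits of the same data."""
    errors = []
    for i in range(folds):
        held_out = data[i::folds]
        train = [p for j, p in enumerate(data) if j % folds != i]
        mse = sum(
            (knn_predict(train, x, k_neighbors) - y) ** 2 for x, y in held_out
        ) / len(held_out)
        errors.append(mse)
    return sum(errors) / len(errors)


data = make_dataset()
# Every candidate hyper-parameter is scored against the same training data.
scores = {k: cross_validate(data, k) for k in (1, 3, 5, 15, 40)}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

Each candidate has to be refit on the training folds before it can be scored, so none of this comparison can be run with only one set of final weights in hand.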

Even for LLMs, have you ever tried to reduce the size of the vocabulary of, say, Llama?

No?

Yet it's a totally reasonable modification.

What's the preferred form for making modifications like this?

Can you do it by fine-tuning the Llama weights?

No.

You need training data.

That's why training data are the preferred form for making modifications: whatever the AI (hyped or not), it's the only form that lets you make every modification you want.