

jncfhnb 888 days ago

No, it’s not. You have everything you need to modify the models to your own liking. You can explore how it works. This analogy is bad. Models are unlike code bases in this way.


tonyarkles 888 days ago

> You have everything you need to modify the models to your own liking.

What if I wanted to train it using only half of its training set? If the inputs that were used to generate the released weights are not available, I can’t do that. I have a set of weights and the model structure, but without the training dataset there is no way to rerun the training.

To riff on the parent post, I have:

    Source + Compiler => Binaries

For the vast majority of open source models I have:

    [unavailable inputs] + Model Structure => Weights

They’re not exactly the same as the source code/binary scenario because I can still do this (which isn’t generally possible with binaries):

    Model Structure + Weights + [my own training data] => New Weights

Another way to look at it is that with source code I can modify the code and recompile it from scratch. Maybe I think the model author should have used a deeper CNN layer in the middle of the model. Without the inputs I can’t do a comparison.
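
To make the last formula above concrete, here is a minimal sketch of "Model Structure + Weights + [my own training data] => New Weights". Everything in it is a stand-in: the "model" is a hypothetical one-variable linear function, `released_weights` plays the role of the published weights, and `my_data` is invented data. It only illustrates that training can continue from released weights without the original inputs; it is not how any particular model is fine-tuned.

```python
# Hypothetical tiny "model": y = w * x + b.
# The structure is the predict() function; the weights are the pair (w, b).
# Nothing here comes from a real released model.

def predict(weights, x):
    w, b = weights
    return w * x + b

def mse(weights, data):
    """Mean squared error of the model on a list of (x, y) pairs."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def fine_tune(weights, data, lr=0.01, steps=500):
    """Plain gradient descent on MSE, starting from the given weights."""
    w, b = weights
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

# Stand-in for the published weights (assumption: trained on data I never see).
released_weights = (2.0, 0.0)

# My own training data, following a different relationship: y = 3x + 1.
my_data = [(float(x), 3.0 * x + 1.0) for x in range(-5, 6)]

new_weights = fine_tune(released_weights, my_data)
print("error before:", mse(released_weights, my_data))
print("error after: ", mse(new_weights, my_data))
```

Note what the sketch cannot do: it never reproduces `released_weights` from scratch, because the data that produced them is not part of the program.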


jncfhnb 888 days ago

> Maybe I think the model author should have used a deeper CNN layer in the middle of the model. Without the inputs I can’t do a comparison.

You can fine tune into a different model architecture.

You’re right that you can’t retrain the model from scratch on half its data without that data, but that’s likely pointless.


tonyarkles 888 days ago

I’d be happy to be wrong about this, but my understanding is that changing the architecture of the last few layers is feasible with fine-tuning, while changing middle layers isn’t likely to work very well without the full original input set.

> likely pointless

It doesn’t take too much creativity to come up with ideas about why someone might want to do that:

- researchers who want to investigate how much the dataset can be reduced (and thus training cost) and what the accuracy penalty is

- someone who wants, for religious or ethical reasons, to minimize the probability that the model was trained on pornography

- someone who’s curious about whether there’s significant redundancy in the existing input datasets

- someone who’s curious about whether there is a much smaller subset of images in the input dataset that can quickly help the first few CNN input layers converge before training the middle and output layers on the larger dataset
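
The first idea in the list (measuring the accuracy penalty of a reduced dataset) can be sketched as follows, assuming the training set were available. The data here is synthetic and the linear "model" is a hypothetical stand-in; the point is only the shape of the experiment: train from scratch on the full set, train again on a random half, and compare both on the same held-out data.

```python
import random

def make_dataset(n, rng):
    """Synthetic (x, y) pairs from y = 3x + 1 plus a little Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.1)))
    return data

def train_from_scratch(data, lr=0.1, steps=500):
    """Fit y = w * x + b by gradient descent, starting from zero weights."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def mse(weights, data):
    """Mean squared error on a list of (x, y) pairs."""
    w, b = weights
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

rng = random.Random(0)                  # fixed seed so the run is repeatable
train_set = make_dataset(200, rng)      # stand-in for the original training set
held_out = make_dataset(100, rng)       # evaluation data, never trained on
half_set = rng.sample(train_set, len(train_set) // 2)

full_error = mse(train_from_scratch(train_set), held_out)
half_error = mse(train_from_scratch(half_set), held_out)
accuracy_penalty = half_error - full_error

print(f"full: {full_error:.4f}  half: {half_error:.4f}  "
      f"penalty: {accuracy_penalty:+.4f}")
```

The line that builds `half_set` is exactly the step that is impossible with only the released weights: it needs `train_set` itself.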

Edit: I suspect the real reason they don’t want to share the input dataset is purely because a high-quality annotated dataset is a valuable commodity. While I don’t do ML work myself day-to-day, I do work with a team that does in a very niche field and I can only imagine how much effort they had to go through to get the annotated dataset that they’ve put together. Even just collecting the images for it involved many hours of drone flights in different locales around North America in varying weather and lighting.


jncfhnb 888 days ago

Original input set is irrelevant.

You will, of course, need some data of your own to fill in the blanks.

Edit: conversely, you can also splice layers out of one model into another, original model. It’ll take some retraining, but this works!
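
As a rough illustration of what splicing layers could look like, here is a minimal sketch that treats a model as a plain dict from layer name to weight matrix. The layer names, shapes, and the `splice_layers` helper are all hypothetical; in a framework such as PyTorch the rough equivalent would be copying selected entries between state dicts. The sketch only does the copy. The retraining mentioned above would still have to follow, using your own data.

```python
import copy

def shape(matrix):
    """(rows, columns) of a weight matrix stored as a list of lists."""
    return (len(matrix), len(matrix[0]))

def splice_layers(donor, target, layer_names):
    """Return a copy of `target` with the named layers taken from `donor`."""
    spliced = copy.deepcopy(target)
    for name in layer_names:
        if name not in donor or name not in target:
            raise KeyError(f"layer {name!r} must exist in both models")
        if shape(donor[name]) != shape(target[name]):
            raise ValueError(f"layer {name!r} has mismatched shapes")
        spliced[name] = copy.deepcopy(donor[name])
    return spliced

# Hypothetical pretrained model: weights are made-up numbers.
donor = {
    "early": [[0.5, -0.2], [0.1, 0.9]],
    "middle": [[1.0, 0.0], [0.0, 1.0]],
    "head": [[0.3], [0.7]],
}

# Hypothetical new model with a different middle section, not yet trained.
target = {
    "early": [[0.0, 0.0], [0.0, 0.0]],
    "middle_a": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    "middle_b": [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],
    "head": [[0.0], [0.0]],
}

spliced = splice_layers(donor, target, ["early"])

# Layers that were not copied still hold their initial values and need training.
needs_retraining = [name for name in spliced if name != "early"]
print(needs_retraining)
```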


anothernewdude 888 days ago

You can do the same with binaries. You can modify those all you want.

Models are the compiler + makefiles. Dataset is the code.


hn_acker 887 days ago

I don't know about the OSI's open source definition [1] in general, but specific licenses might consider makefiles and build scripts to be part of the source code. (For what it's worth, the free software definition from the FSF does consider makefiles and build scripts to be part of the source code [2].)

[1] https://opensource.org/osd/

[2] https://www.gnu.org/philosophy/free-sw.html


jncfhnb 888 days ago

No, it’s not the same. Yes, you can technically modify binaries, but it’s not at all the preferred way to modify the program.


anothernewdude 887 days ago

Congratulations. You've almost finished understanding my comment.


jncfhnb 887 days ago

Well you’ve failed and managed to be a dick
