| HN Mirror



	by okaram 973 days ago
	Notice you are creating your own arbitrary definition of 'truly open', which IMHO corresponds more with 'reproducible'. We already have a definition of open source. I don't see any reason to change it.

3 comments

TeMPOraL 973 days ago

Problem is, the literal/default definition of "open source" is meaningless/worthless in this context. It's the weights, training data and methodology that matter for those models - NOT the inference shell.

It's basically like giving people a binary program and calling it open source because the compiler and runtime used are open source.


jerpint 973 days ago

The weights are the inference and result of training. I can give you all the training details and you might not be able to reproduce what I did (Google does this all the time). As a dev, I’d much rather have an open model than an open recipe without weights. We can all agree having both is the best-case scenario, but having openly licensed weights is, for me, the bare minimum of open source.


losteric 973 days ago

The inference runtime software is open, the weights are an opaque binary. Publishing the training data, hyperparameters, process, etc - that would make the whole thing "open source".


magicalhippo 973 days ago

The quake engine is still open source even though it doesn't come with the quake game assets, no?

It seems unreasonable to require the training data just to be called open source, given it has similar copyright challenges as game assets.

Of course, this wouldn't make the model reproducible. But that's different from open source.


darkwater 973 days ago

Good example. And in fact you are calling the "engine" open source, not the whole Quake game. The "assets" in most "open source" AI models are not available.


EGreg 973 days ago

Imagine if the Telegram client was open source but not the backend.

Imagine if Facebook open-sourced their front-end libraries like React but not the back-end.

Imagine if Twitter or Google didn’t publish their algorithms for how they rank things to display to different people.

You don’t need to imagine. That’s exactly what’s happening! Would you call them open source because their front end is open source? Could you host your own back end on your choice of computers?

No. That’s why I even started https://qbix.com/platform


darkwater 973 days ago

I completely agree with you (and the examples you mention are singled out in the "antifeatures" list in F-Droid, for instance).


torginus 973 days ago

It's a bit different - here most of the value lies in the weights.

A better analogy would be some graphics card drivers which ship a massive proprietary GPU firmware blob, and a small(ish) kernel shim to talk with said blob.


magicalhippo 973 days ago

Well, perhaps we can consider this a kind of short-sightedness on Stallman's part. His point with the GPL and the free software movement, as I understand it, was to ensure the user could continue to use the software regardless of what the software author decided to do.

Sometimes though the software alone can be near useless without additional assets that aren't necessarily covered by the code license.

Like Quake, having the engine without the assets is useless if what you wanted was to play Quake the game. Neural nets are another prime example, as you mention. Simulators that rely on measured material property databases for usable results also fall into this category, and so on.

So perhaps what we need are new open source licenses that include the assets needed for the user to be able to reasonably use the program as a whole.


ekianjo 973 days ago

Weights are like binaries. They are not code. It would make more sense to put them under a Creative Commons license.
