Hacker News
by mav3ri3k 701 days ago
I am not deep into LLMs, so I ask this. From my understanding, their last model was open source, but only in the sense that you could use it while the inner workings were "hidden"/not transparent.

With the new model, I am seeing a lot about how open source it is and how it can be built upon. Is it now completely open source, or similar to their last models?

2 comments

It's intrinsic to transformers that the inner workings are largely inscrutable. This is no different, but it does not mean they cannot be built upon.

Gradient descent works on these models just like the prior ones.

They give you the code and they give you the model it runs, and you can customise and redistribute both. It's all open source in that respect.

What people are complaining about (totally unreasonably in my view) is that Meta is obviously not "open sourcing" all the training data, so nobody can retrain the model from scratch themselves. This argument to me is just silly. The whole point of these models is that they distil pre-training on massive data sets you wouldn't have access to otherwise. If you insist on them releasing the data set, they will have to cut it down to 0.1% of the size, and you will be getting what you already had access to in the first place.

That's not the only thing that people are complaining about. Even the code is not open source, despite being called open source.
They could release the code that gathers and curates the data, which would give a reproducible system for getting the pre-training data. And presumably they own the post-training RLHF stuff, so they could open that too.

Without those you're locked in to them in terms of licensing of future versions.