by tsimionescu
262 days ago
The point of FOSS is control. You want access to the source, including build instructions and everything else, so that you can meaningfully change the program and understand what it actually does (or pay an expert to do this for you). You also want to make sure that the company that made it doesn't have a monopoly on fixing it for you, so that they can't ask you for exorbitant sums to address an issue you have. An open weight model addresses the second part of this, but not the first. And even an open weight model with all of the training data available doesn't fix the first problem. Even if you somehow got access to enough hardware to train your own GPT-5 from the published data, you still couldn't meaningfully fix an issue you have with it, not even if you hired Ilya Sutskever and Yann LeCun to do it for you: these are black boxes that no one can actually understand at the level of a program or device.
I have also seen people train "jailbreaks" of popular open source LLMs (e.g. Google Gemma) that remove the condescending ethical guidelines and just let you talk to the thing normally.
So, all in all, I am skeptical of the claim that there would be no value in having access to the training data. Clearly there is some ability to steer the output these models produce.