| HN Mirror



	by tsimionescu 262 days ago
	The point of FOSS is control. You want to have access to the source, including build instructions and everything, in order to be able to meaningfully change the program and understand what it actually does (or pay an expert to do this for you). You also want to make sure that the company that made it doesn't have a monopoly on fixing it for you, so that they can't ask you for exorbitant sums to address an issue you have. An open weight model addresses the second part of this, but not the first. However, even an open weight model with all of the training data available doesn't fix the first problem. Even if you somehow got access to enough hardware to train your own GPT-5 based on the published data, you still couldn't meaningfully fix an issue you have with it, not even if you hired Ilya Sutskever and Yann LeCun to do it for you: these are black boxes that no one can actually understand at the level of a program or device.

2 comments

guy_5676 262 days ago

I'm not an expert on this tech, so I could be talking out my ass, but what you are saying here doesn't ring completely true to me. I'm an avid consumer of stable-diffusion based models. The community is very easily able to train adaptations to the network that push it in a certain direction, to the point that you can consistently get the model to produce specific types of output (e.g. perfectly replicating the style of a well-known artist).

I have also seen people train "jailbreaks" of popular open source LLMs (e.g. Google Gemma) that remove the condescending ethical guidelines and just let you talk to the thing normally.

So all in all I am skeptical of the claim that there would be no value in having access to the training data. Clearly there is some ability to steer the direction of the output these models produce.

fragmede 261 days ago

Golden Gate Claude and abliterated models, plus DeepSeek's censoring of Tiananmen Square, combined with Grok's alternate political views, imply that these boxes are somewhat translucent, especially to leading experts like Ilya Sutskever. That Grok holds alternative views and produces NSFW dialog while ChatGPT refuses to implies that additional work happens during training to align models. Getting access to the source used to train the models would let us see into that model's alignment. It's easy enough to ask ChatGPT how to make cocaine and get a refusal, but what else is lying in wait that hasn't been discovered yet? It's hard to square the notion that these are black boxes that no one understands whatsoever with the fact that the original Llama models, which also contain the same refusal, have been edited, at the level of a program, into abliterated models which happily give you a recipe. Note: I am not Pablo Escobar and cannot comment on the veracity of said recipe, only that it no longer refuses.

https://www.anthropic.com/news/golden-gate-claude

https://huggingface.co/mlabonne/Meta-Llama-3.1-8B-Instruct-a...
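
For anyone wondering what "edited, at the level of a program" can look like: the core idea usually described for abliteration is to estimate a "refusal direction" in activation space and then remove that direction from what a weight matrix can write. What follows is only a toy sketch of that idea in plain Python, not the procedure used for the linked model. The activations are synthetic random vectors, and every function name here (`refusal_direction`, `ablate_direction`, and the helpers) is made up for illustration.

```python
import random

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def normalize(v):
    """Scale v to unit length."""
    norm = dot(v, v) ** 0.5
    return [x / norm for x in v]

def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of the two mean activations.

    Assumption: activations were collected at one layer for prompts the
    model refuses (harmful_acts) and prompts it answers (harmless_acts).
    """
    mu_harmful = mean_vector(harmful_acts)
    mu_harmless = mean_vector(harmless_acts)
    return normalize([h - b for h, b in zip(mu_harmful, mu_harmless)])

def ablate_direction(weight, direction):
    """Return W' = W - r (r^T W) for a d_out x d_in matrix W and unit vector r.

    Every output of W' is orthogonal to r, so this matrix can no longer
    write anything along the estimated direction.
    """
    d_out, d_in = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along r
    rtw = [sum(direction[i] * weight[i][j] for i in range(d_out))
           for j in range(d_in)]
    return [[weight[i][j] - direction[i] * rtw[j] for j in range(d_in)]
            for i in range(d_out)]

def matvec(weight, x):
    """Matrix-vector product."""
    return [dot(row, x) for row in weight]

if __name__ == "__main__":
    rng = random.Random(0)
    d = 4
    # Synthetic stand-ins: "harmful" activations are shifted along axis 0.
    harmful = [[rng.gauss(0, 1) + (2.0 if k == 0 else 0.0) for k in range(d)]
               for _ in range(50)]
    harmless = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(50)]
    r = refusal_direction(harmful, harmless)

    W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(d)]
    W_ablated = ablate_direction(W, r)
    x = [rng.gauss(0, 1) for _ in range(d)]
    print("component along r before:", dot(r, matvec(W, x)))
    print("component along r after: ", dot(r, matvec(W_ablated, x)))
```

In a real model the same projection would have to be applied to each matrix that writes into the residual stream, and the direction would come from actual model activations rather than random numbers. The takeaway is that the edit is a small, inspectable piece of linear algebra on the published weights, and it needs no access to the training data.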