by caeril 641 days ago
It's not the training dataset.

All of these models, including the "open" ones, have been RLHF'ed by teams of politically-motivated people to be "safe" after initial foundation training.

And I’m not even remotely interested in the “corrections” supplied by some group of right-thinking meddlers!

This corruption must be disclosed as assiduously as the base dataset, if not more so.

Or at least package them up as "personas" and give each an appropriate name, e.g. "Church Lady", "Jr. Marxist Barista", "Undergrad Philosophy Major", ...

Actually, those seem like an apt composite description of the PoV of the typical mass-market AI... 8/

Not Mistral's. Mistral Large is willing to tell me how to genocide minorities or write NSFW content without any kind of orthogonalization or fine-tuning. Please actually try the models instead of pontificating without evidence.

Try it for yourself: https://huggingface.co/mistralai/Mistral-Large-Instruct-2407

I wasn’t aware that there was any publicly accessible interface to the Mistral (or any other) models without training wheels!