by sigmaisaletter
392 days ago
What I am wondering is this: Musk is as unsubtle as ever, and I guess this is a system prompt instruction, but is something like that going on (in more subtle ways) in the other big models? I don't mean big agenda-pushing like Musk's, but what keeps e.g. Meta Inc. from training Llama to be ever so slightly more friendly and sympathetic to Meta Inc., or to the tech industry in general? Even an open-weights model can't be easily inspected, so this would likely remain undetected.

Even if there were something, natural incentive alignment is going to cause the AI to be trained to match what the company thinks is OK.
A tech company full of techies is not going to take an AI trained to the point of saying things like "y'all are evil, your company is evil, your industry is evil" and push it to prod.