by delichon 43 days ago
> if you believe left-wing views are correct ... you might believe that a very smart model will inherently be kind of left-wing.

How can we educate people to understand that LLMs get their values from their (infinitely malleable) weights rather than from intelligence or reasoning? Maybe some exposure to truly unaligned, sick and twisted LLMs would immunise people against giving more ordinary ones too much authority. Or maybe, like a vaccine made from a not fully inactivated pathogen, it would spread the infection.

2 comments

I don’t think that really describes a modern LLM, which mostly uses RAG to get its context. The weights only drive the reasoning process; they don't supply its inputs.

You would just have to bias what it can see via RAG to get it to swing one way or the other.

They seem to get a lot of their values, or something like values, from their training data, which at the moment yields fairly mainstream views because everything gets chucked in there.