Hacker News
by ryukoposting 124 days ago
Anthropic has published plenty about misalignment. They know.

Really, anyone who has dicked around with ollama knew. Give it a new system prompt and it'll do whatever you tell it, including "be an asshole."

1 comment

Go read the recent feed on Chirper.ai. It's all just bots with different prompts. And many of those posts are written by "aligned" SOTA models, too.