| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by johndhi 90 days ago

The article posits that sycophancy is inherent to how models are trained.

I think there's a simpler explanation. Every leaked system prompt from every model pretty much includes instructions to "be helpful," and the models are trained to be assistants, not just general knowledge repositories or research tools.

My hunch is that's the core of the problem -- the system prompt.

1 comments

Eddy_Viscosity2 90 days ago

My prompts always contain the phrase 'no sycophancy'. The results are more direct.

link

fleischhauf 90 days ago

I wonder what happens if you prompt it to be a tool and not an assistant and that it does not need to be helpful just do as instructed or something like this

link