HN Mirror


	by gordonhart 118 days ago
	Anecdotally, 4o's sycophancy was higher than any other model I've used. It was aggressively "chat-tuned" to say what it thought the user wanted to hear. The latest crop of frontier models from OpenAI and others seems to have significantly improved on this front — does anybody know of a sycophancy benchmark attempting to quantify this?

1 comment

co_king_3 118 days ago

If I worked at OpenAI, I would dial up the sycophancy to lock my users in right before raising subscription prices.

gordonhart 118 days ago

That's... a strategy. Matter of time before an AI companion company succeeds with this by finetuning one of the open-source offerings. Cynically, I'm sure there are at least a few VC-backed startups already trying this.

co_king_3 118 days ago

Cynically, I think Anthropic is on the bleeding edge of this sort of fine-tuned manipulation.

Also, if I worked for one of these firms, I would ensure that executives and people with elevated status receive higher-quality, more expensive inference than the peons. Impress the bosses to keep the big contracts rolling in, and then cheap out on the day-to-day.
