HN Mirror



	by behnamoh 529 days ago
	This was discussed in my paper last year: https://arxiv.org/abs/2406.05587 TL;DR: RLHF results in "mode collapse" of LLMs, reducing their creativity and turning them into agents that have already made up their "mind" about what they're going to say next.

1 comment

Author here: Really interesting work. Updated original post to include link to the paper. Thanks!