Hacker News
by f0e4c2f7 669 days ago
There have been some papers showing that RLHF makes models more palatable to use but reduces performance on evals and in various other ways.

I couldn't find the one I was looking for, but this is one of them.

https://arxiv.org/abs/2310.06452

Edit:

This tweet also has a screenshot showing evals degrading after RLHF relative to the base model.

https://x.com/KevinAFischer/status/1638706111443513346?t=0wK...