Hacker News
by d0mine 5 days ago
As I understand it, RL makes foundation models stupider (less capable, not more) but better at following instructions.