by armanboyaci 1616 days ago
> Other learning paradigms are about minimization; reinforcement learning is about maximization.

I don't see why this is important.

3 comments

I think they wanted to express that learning to predict the correct output ("error minimization") puts a ceiling on the achievable performance, whereas ranking (not just RL, really) allows a model to improve beyond the current best-known answer.
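A toy sketch of that distinction, not taken from the article: the candidate set, the `reward` function, and the `BEST_KNOWN` label are all made up for illustration. Minimizing error against a label can at best reproduce the label, while maximizing a score over the same candidates can land on something better.

```python
# Hypothetical toy setting: answers are the integers 0..9 and a scoring
# function says how good each one is. Nothing here comes from the article.
CANDIDATES = range(10)
BEST_KNOWN = 4  # the label an annotator would provide (assumed suboptimal)


def reward(answer: int) -> float:
    # Assumed score: the true optimum is 7, which the annotator does not know.
    return -float((answer - 7) ** 2)


def fit_by_error_minimization() -> int:
    # Supervised view: choose the answer with the smallest squared error
    # against the label, so the result can never beat the label itself.
    return min(CANDIDATES, key=lambda a: (a - BEST_KNOWN) ** 2)


def fit_by_reward_maximization() -> int:
    # Ranking/RL view: choose the answer that scores highest, regardless
    # of what the best-known answer was.
    return max(CANDIDATES, key=reward)


imitated = fit_by_error_minimization()
optimized = fit_by_reward_maximization()
print(imitated, reward(imitated))
print(optimized, reward(optimized))
```

In a real system the score would itself be learned or noisy, so this only shows where the ceiling comes from, not that it is easy to beat.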
Also, the next point

> It should have (and has shown to have) better scaling laws

is a statement based on two anecdotes, and I don't see a compelling reason why it should hold in general.

Active learning approaches are not mentioned, even though they allow incorporating human feedback during fine-tuning, and this can be done with a purely supervised approach.

IMO the last point is the only compelling one: having, for example, agents that can browse the web during learning could open up a lot of possibilities. It would have been interesting to develop this point further: what are the current difficulties in training such agents?

It's important in the sense that, in this framing, RL does not have a performance ceiling.