Hacker News new | ask | show | jobs
by jdub 27 days ago
Reinforcement learning perturbs the model such that the token prediction process (inference) tends towards the desired result.