Hacker News
by zora_goron 498 days ago
Does anyone know how "reasoning effort" is implemented technically? Does it involve differences in the pre-training, RL, or prompting phases (or all of them)?