Hacker News
by TheArcane 508 days ago
I'm confused as to how you haven't found R1 to be much better. My experience has been exactly like the OP's.
1 comment

What type of prompts were you feeding it? My limited understanding is that reasoning models will outperform non-reasoning LLMs like GPT-4/Claude at certain tasks but not others. On prompts whose answers are fuzzier and less deterministic (e.g. soft sciences), reasoning models tend to underperform, because their training revolves around RL with rewards, and rewards are easiest to define where an answer can be checked.