Hacker News
by marxplank 499 days ago
That would help with decidable problems, but it still would not generalise to problems with non-trivial rewards, or to ones with no reward at all.
1 comment

Reasoning seems to generalize, insofar as o1 and DeepSeek-R1 are better at answering questions than their base models.