Hacker News
by
marxplank
499 days ago
That would help with decidable problems, but it still wouldn't be generalisable to problems with non-trivial rewards, or to ones with none.
1 comment
astrange
499 days ago
Reasoning seems to generalize, insofar as o1 and DeepSeek-R1 are better at answering questions than their base models.