Hacker News
by
rosstaylor90
487 days ago
RL has more than two steps...
1 comment
bossyTeacher
486 days ago
The point is that reasoning is about more than the conclusions. If your steps are wrong, your reasoning is wrong regardless of the conclusion. Poor reasoning is what could make an LLM conclude that 1 + 2 = 3 but that 2 + 1 = [some number other than 3].