by not_a_toaster
283 days ago
It seems interesting to let LLMs design their own reasoning traces instead of being constrained by human labelers. I could imagine self-consistency approaches that surface common, high-quality reasoning traces. It feels like a bitter-lesson moment for reasoning traces. PDF of the paper: https://arxiv.org/pdf/2509.06160
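
To make the self-consistency idea a bit more concrete, here is a rough sketch of what I have in mind. It is not taken from the paper. It assumes you have already sampled several (trace, final answer) pairs for one prompt, and the function name `select_consistent_traces` and the toy samples are made up for illustration. It majority-votes on the final answer and keeps only the traces that agree with it.

```python
from collections import Counter


def select_consistent_traces(samples):
    """Pick the majority answer and the traces that reached it.

    `samples` is a list of (trace, answer) pairs for a single prompt.
    Hypothetical helper for illustration, not the paper's method.
    """
    if not samples:
        raise ValueError("need at least one sample")
    # Count how often each final answer appears across the samples.
    counts = Counter(answer for _, answer in samples)
    majority_answer, _ = counts.most_common(1)[0]
    # Keep only the traces whose final answer matches the majority.
    kept = [trace for trace, answer in samples if answer == majority_answer]
    return majority_answer, kept


# Toy, hand-written samples (not real model output).
samples = [
    ("12 * 3 = 36, minus 6 is 30", "30"),
    ("3 * 12 = 36; 36 - 6 = 30", "30"),
    ("12 * 3 = 32, minus 6 is 26", "26"),
]
answer, traces = select_consistent_traces(samples)
print(answer, len(traces))
```

Agreement on the final answer is only a proxy for trace quality, so a real pipeline would probably want some extra check on the traces themselves.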