by sachaa
73 days ago
Fair points, especially on GSM8K saturation and Qwen possibly already sitting close to the solution. That said, even if this is mostly "last-mile alignment", the fact that it can be done with such a tiny signal is still interesting: it suggests the gap between capability and behavior might be much smaller (and cheaper to bridge) than we assume.

Can you elaborate a bit on what you mean by the gap?