by doginasuit
22 days ago
That's interesting. If you have a source showing that RLVR was primarily responsible for model improvement, I'd be interested to see it. In any case, it sounds like RLVR has its own set of limitations, and there are applications where it does not help at all.