|
|
|
|
|
by dag100
77 days ago
|
|
This is basically a violation of the robustness principle ("be liberal in what you accept, be conservative in what you produce"), but I doubt there will be much improvement on this front seeing as tokens are fed back into the model. A succinct phrase is a compressed form of a longer sentence that expresses the same idea, so from the perspective of having to feed the model's output back into it, more tokens presumably work better by providing a greater of surface area for processing, so to speak. This is just my intuition, however. |
|