by whiteandnerdy
803 days ago
I remember hearing that beam search doesn't work well for LLMs because it leads to repetitive, generic output. The majority-vote sampling technique in this paper sounds like it would give output similar to beam search, since it's sampling sequences of tokens from a joint distribution. So why doesn't it produce repetitive output the way beam search does? What am I missing?
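
To make the question concrete, here's a toy sketch of how I picture the voting step. This is my own reading, not the paper's code: `majority_vote` and the sample data are made up for illustration. As I understand it, each completion is sampled independently and only the final answers are tallied, whereas beam search ranks whole token sequences by joint probability.

```python
from collections import Counter

def majority_vote(samples):
    """Pick the most frequent final answer among sampled completions.

    `samples` is a list of (reasoning_text, final_answer) pairs; the
    reasoning text is ignored by the vote (hypothetical format).
    """
    counts = Counter(answer for _, answer in samples)
    return counts.most_common(1)[0][0]

# Made-up samples: the reasoning paths differ, but two agree on the answer.
samples = [
    ("16 - 3 - 4 = 9; 9 * 2 = 18", "18"),
    ("3 + 4 = 7; 16 - 7 = 9; 9 * 2 = 18", "18"),
    ("16 - 3 = 13; 13 * 2 = 26", "26"),
]
winner = majority_vote(samples)
```

If that picture is right, the vote never selects a single "best" sequence, so maybe that's part of the difference, but I'm not sure.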