by kcarnold
519 days ago
This was the subject of https://arxiv.org/abs/2412.03719. (I suspect you can get away with something simpler than the paper's solution if you're only interested in the top-k.) A related topic is "token healing", although some implementations (unfortunately including the one in HuggingFace Transformers) make some big assumptions that aren't always true, like treating spaces as special.
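
For anyone who hasn't run into token healing: the basic idea, as I understand it, is to drop the last prompt token and only allow next tokens whose text starts with the dropped token's text, so the model can pick a longer token that spans the prompt boundary. Here's a rough sketch with a toy vocabulary and made-up scores. `heal_candidates` and `pick_healed_token` are hypothetical names of mine, not anything from HuggingFace Transformers or the paper, and backing up exactly one token is itself a simplifying assumption.

```python
from typing import Dict, List, Tuple


def heal_candidates(prompt_tokens: List[str], vocab: List[str]) -> Tuple[List[str], List[str]]:
    """Drop the last prompt token; return (trimmed prompt, allowed next tokens).

    A token is allowed if its text starts with the dropped token's text.
    """
    if not prompt_tokens:
        return [], list(vocab)
    *trimmed, last = prompt_tokens
    allowed = [tok for tok in vocab if tok.startswith(last)]
    return trimmed, allowed


def pick_healed_token(allowed: List[str], scores: Dict[str, float]) -> str:
    """Pick the highest-scoring allowed token (scores stand in for model log-probs)."""
    return max(allowed, key=lambda tok: scores.get(tok, float("-inf")))


# Toy example: the prompt "http:" was tokenized as ["http", ":"], but the
# model would much rather see the single token "://" after "http".
vocab = ["http", ":", "://", ":)", "//", "www"]
trimmed, allowed = heal_candidates(["http", ":"], vocab)
scores = {":": -3.0, "://": -0.1, ":)": -5.0}  # made-up log-probs
healed = pick_healed_token(allowed, scores)
print(trimmed, allowed, healed)
```

Note that nothing here treats spaces specially; the only constraint is the prefix match against the dropped token.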