by danielmarkbruce
855 days ago
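Of course they do. Beam search is a thing. The reason it's not used as much as you might expect is cost. With a greedy search you run through the model x times, where x is the number of tokens generated. Branch into the top k candidates at every step without pruning and the number of runs grows like k^x, which gets astronomical quickly. Beam search caps that by keeping only a fixed number of candidates per step, but it still costs roughly that many times more than greedy.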
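To put rough numbers on that, here is a minimal sketch that counts model runs under some simplifying assumptions: one "run" means scoring one candidate prefix for one next token, with no batching or KV caching taken into account. The function names are made up for illustration and don't come from any library.

```python
def greedy_passes(num_tokens: int) -> int:
    """Greedy decoding: one prefix is extended per generated token."""
    return num_tokens


def beam_passes(num_tokens: int, beam_width: int) -> int:
    """Beam search: the first step scores the single prompt prefix,
    every later step scores beam_width surviving prefixes."""
    if num_tokens == 0:
        return 0
    return 1 + beam_width * (num_tokens - 1)


def exhaustive_topk_passes(num_tokens: int, k: int) -> int:
    """Unpruned top-k branching: step t has k**t prefixes to score,
    so the total is 1 + k + k**2 + ... + k**(num_tokens - 1)."""
    return sum(k ** t for t in range(num_tokens))


if __name__ == "__main__":
    x, k = 20, 5  # assumed values, chosen only to show the growth
    print("greedy:    ", greedy_passes(x))
    print("beam:      ", beam_passes(x, k))
    print("exhaustive:", exhaustive_topk_passes(x, k))
```

Greedy and beam search both grow linearly in x, while the unpruned version grows exponentially. In practice beam candidates can be batched, so wall-clock cost may be less than the raw count suggests, but memory and compute still scale with the beam width.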