by ndand
451 days ago
I understand it differently: LLMs predict distributions, not specific tokens. A decoding algorithm, such as beam search, is then used to select the tokens. So the LLM predicts something like 1. ["a", "an", ...] 2. ["astronomer", "cosmologist", ...], and "an astronomer" is selected as the most likely result.
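
To make that concrete, here is a minimal sketch of the idea, not how any real LLM or library implements it. The `TOY_MODEL` table, its probabilities, and the function names are all made up for illustration; a real model would compute the next-token distribution from the prefix. It compares picking the single most likely token at each step with a small beam search over the same distributions.

```python
import math

# Hypothetical toy "model": maps a prefix (tuple of tokens) to a
# next-token probability distribution. All numbers are invented.
TOY_MODEL = {
    (): {"a": 0.55, "an": 0.45},
    ("a",): {"cosmologist": 0.6, "physicist": 0.4},
    ("an",): {"astronomer": 0.9, "engineer": 0.1},
}


def next_token_probs(prefix):
    """Return the distribution over the next token for a given prefix."""
    return TOY_MODEL[tuple(prefix)]


def greedy_decode(steps):
    """Pick the single most likely token at every step."""
    seq, logp = [], 0.0
    for _ in range(steps):
        probs = next_token_probs(seq)
        token = max(probs, key=probs.get)
        seq.append(token)
        logp += math.log(probs[token])
    return seq, math.exp(logp)


def beam_search(steps, beam_width):
    """Keep the beam_width most likely partial sequences at every step."""
    beams = [((), 0.0)]  # (token sequence, log-probability)
    for _ in range(steps):
        candidates = []
        for seq, logp in beams:
            for token, p in next_token_probs(seq).items():
                candidates.append((seq + (token,), logp + math.log(p)))
        # Sort by total log-probability and keep only the best few.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    best_seq, best_logp = beams[0]
    return list(best_seq), math.exp(best_logp)


if __name__ == "__main__":
    print("greedy:", greedy_decode(2))
    print("beam:  ", beam_search(2, beam_width=2))
```

With these invented numbers, greedy selection commits to "a" first and ends on "a cosmologist", while beam search with width 2 keeps "an" alive and ends on "an astronomer", which has the higher overall probability. The model only supplies the distributions; the decoding step decides which tokens come out.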