by froobius
248 days ago
(Just to expand on that, it's true not just for the first token. There's a lot of computation, potentially including planning ahead, before each token is output.) That's why saying "it's just predicting the next word" is a misguided take.