by somenameforme
17 days ago
It poses a simple problem. Go back not that far into humanity's past and language didn't even exist: our expressed token base was practically 0. We went from that to discovering the secrets of the atom, putting a man on the Moon, and more. If you put an LLM at that starting point, it's going to do nothing but endlessly cycle over basically nothing. Giving it an infinite amount of time and processing wouldn't change that. This same issue also demonstrates how humans are not anything at all like token predictors. No matter how much time you spend remixing the tokens of primitive man, you don't get 'and here is how you land on the Moon' from it.
|
It also outputs vectors that are coerced into tokens for human consumption.
Yes, it goes through tokens, but the possible internal meanings assigned to those tokens (when surrounded by other tokens) are infinite.
That's how humans got from caves to where we are now: by associating new meanings with the same old sound clumps.