Can the LLM have a runloop? Can the LLM be situated in a world like you and I are?
If the LLM is just a file on a hard disk in a drawer not connected to anything, then obviously it can't discover novel tokens on its own.
If, on the other hand, the LLM has a runloop, sensors, and basic instructions to make observations, run thought experiments, find new combinations of concepts, and name them with tokens, then sure, why wouldn't it be able to?
You might say you define LLMs as "LLMs as they exist today in a human prompt-driven system", but that would be an artificial limitation, given how little programming (even simple bash scripting would do) it takes to give an LLM a runloop, access to sensors, and basic instructions to discover new stuff.
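To make "trivial level of programming" concrete, here is a rough sketch of what such a runloop might look like. Everything in it is hypothetical: `run_loop`, `fake_sensor`, and `fake_model` are names I made up, the "sensor" just returns a random temperature reading, and the "model" is a stub that echoes its input where a real LLM call would go.

```python
import random
from typing import Callable, List


def run_loop(
    model: Callable[[str], str],
    sensor: Callable[[], str],
    instructions: str,
    steps: int,
) -> List[str]:
    """Repeatedly feed a sensor reading and recent memory to the model."""
    memory: List[str] = []
    for _ in range(steps):
        observation = sensor()
        # Only the last few outputs are fed back, to keep the prompt short.
        prompt = (
            f"{instructions}\n"
            f"Memory: {memory[-5:]}\n"
            f"Observation: {observation}"
        )
        memory.append(model(prompt))
    return memory


def fake_sensor() -> str:
    # Hypothetical stand-in for a real sensor (camera, microphone, API, ...).
    return f"temperature={random.uniform(15.0, 25.0):.1f}"


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for an LLM call; it echoes the observation line.
    return "noted: " + prompt.splitlines()[-1]


if __name__ == "__main__":
    for line in run_loop(fake_model, fake_sensor, "Observe and name new patterns.", 3):
        print(line)
```

The loop itself is a dozen lines; swapping the two stubs for a real model call and a real sensor is where the actual work would be, but none of it requires a new kind of system.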
Can you make a novel sound? One that's not part of any human language?
Perhaps you can, using a tool. However, if we're allowing tools, I bet GPT-4 could also write a program that would produce a novel token, by whatever definition you might give.
I don't think GPT-4 is AGI. But this is not a good test. (And it does mean something that coming up with a good test is increasingly nontrivial.)