by jokethrowaway
536 days ago
What can you learn from something parroting data we already have? Similarly, we are now finding that training on synthetic data is not helpful. What would have happened if we had invested 1/100 of what we spent on LLMs in the rule-based approach?
This has been tried many times before, and so far there has been no indication of a breakthrough.
The fundamental problem is that we don't know the actual rules. We have some theories, but no coherent "unified theory of language" that actually works. Chomsky in particular is notorious for some very strongly held views that have lacked supporting evidence for a while.
With LLMs, we're solving this problem by brute-forcing it: making the models learn those universal structures by throwing a lot of data at a sufficiently large neural net.