by jokethrowaway 536 days ago
What can you learn from something parroting data we already have?

Similarly, we are now finding that training on synthetic data is not helpful.

What would have happened if we had invested 1/100 of what we spent on LLMs in the rule-based approach?

2 comments

There is an old joke that AI researchers came up with several decades ago: "quality of results is inversely proportional to the number of linguists involved".

The rule-based approach has been tried many times before, and so far there has been no indication of a breakthrough.

The fundamental problem is that we don't know the actual rules. We have some theories, but no coherent "unified theory of language" that actually works. Chomsky in particular is notorious for some very strongly held views that have been lacking supporting evidence for a while.

With LLMs, we're solving this problem by brute-forcing it: we throw a lot of data at a sufficiently large neural net and let it learn those universal structures on its own.

> What can you learn from something parroting data we already have?

You can learn that a neural network with a simple learning algorithm can become proficient at language. This runs counter to what people believed for many years, and those who worked on neural networks during that time were ridiculed. Now we have working language software based on learning, while the formal rules required to generate language are nowhere to be seen. This isn’t just a question of what will lead to AGI; it’s a question of understanding how the human brain likely works, which has always been the goal of the people pioneering these approaches.