HN Mirror



	by int_19h 632 days ago
	Saying "it's just autocomplete" is not really saying anything meaningful, since it doesn't specify the complexity of the completion. When the completion is a correct answer to a question that requires logical reasoning, for example, "just autocomplete" needs to be able to do exactly that if it is to complete anything outside of its training set.

2 comments

HarHarVeryFunny 632 days ago

It's just a shorthand way of referring to how transformer-based LLMs work. It should go without saying that there are hundreds of layers of hierarchical representation, induction heads at work, etc, under the hood. However, with all that understood (and hopefully not needing to be explicitly stated every time anyone wants to talk about LLMs in a technical forum), at the end of the day they are just doing autocomplete - trying to mimic the training sources.

The only caveat to "just autocomplete" (which again hopefully does not need to be repeated every time we discuss them), is that they are very powerful pattern matchers, so all that transformer machinery under the hood is being used to determine what (deep, abstract) training data patterns the input pattern best matches for predictive purposes - exactly what pattern(s) it is that should be completed/predicted.


consteval 632 days ago

> question that requires logical reasoning

This is the hard part to tell - are there any such questions that have not already been asked?

The reason ChatGPT works is its scale. To me, that raises the question of how "smart" it is. Even the most idiotic idiot could be pretty decent if he had access to the entire works of mankind and infinite memory. It doesn't matter if his IQ is 50, because you ask him something and he's probably seen it before.

How confident are we this is not just the case with LLMs?


HarHarVeryFunny 632 days ago

I'm highly confident that we haven't learnt everything that can be learnt about the world, and that human intelligence, curiosity and creativity are still being used to make new scientific discoveries, create things that have never been seen before, and master new skills.

I'm highly confident that the "adjacent possible" of what is achievable/discoverable today, leveraging what we already know, is constantly changing.

I'm highly confident that AGI will never reach superhuman levels of creativity and discovery if we model it only on artifacts representing what humans have done in the past, rather than modelling it on human brains and what we'll be capable of achieving in the future.


int_19h 632 days ago

Of course there are such questions. When it comes to even simple puzzles, there are infinitely many permutations possible with respect to how the pieces are arranged, for example - hell, you could generate such puzzles with a script. No amount of precanned training data can possibly cover all such combinations, meaning that the model has to learn how to apply the concepts that make a solution possible (which includes things such as causality or spatial reasoning).
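
To make the "generate such puzzles with a script" point concrete, here is a minimal sketch. It assumes a 3x3 sliding-tile puzzle as the puzzle type, which is purely my choice for illustration; the function names (`generate_puzzle`, `is_solution`, etc.) are hypothetical too. It scrambles the solved board with random legal moves, so every generated instance is solvable and comes with a known solution to check an answer against.

```python
import random

# Assumption for illustration: a 3x3 sliding-tile puzzle, with 0 as the blank.
SOLVED = (1, 2, 3, 4, 5, 6, 7, 8, 0)
# Each move shifts the blank by this offset in the flattened 3x3 grid.
MOVES = {"up": -3, "down": 3, "left": -1, "right": 1}
OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}


def legal_moves(state):
    """Return the directions the blank can move without leaving the grid."""
    row, col = divmod(state.index(0), 3)
    moves = []
    if row > 0:
        moves.append("up")
    if row < 2:
        moves.append("down")
    if col > 0:
        moves.append("left")
    if col < 2:
        moves.append("right")
    return moves


def apply_move(state, move):
    """Swap the blank with the neighbouring tile in the given direction."""
    blank = state.index(0)
    target = blank + MOVES[move]
    tiles = list(state)
    tiles[blank], tiles[target] = tiles[target], tiles[blank]
    return tuple(tiles)


def generate_puzzle(steps, rng):
    """Scramble the solved board; return (scrambled state, one known solution)."""
    state, path = SOLVED, []
    for _ in range(steps):
        move = rng.choice(legal_moves(state))
        state = apply_move(state, move)
        path.append(move)
    # Undoing the scramble in reverse order is a valid (not necessarily shortest) solution.
    return state, [OPPOSITE[m] for m in reversed(path)]


def is_solution(state, solution):
    """Check whether a proposed move list takes the state back to SOLVED."""
    for move in solution:
        if move not in legal_moves(state):
            return False
        state = apply_move(state, move)
    return state == SOLVED
```

Calling `generate_puzzle(30, random.Random())` repeatedly yields fresh instances, and `is_solution` can grade whatever move list a model proposes, so nothing depends on the instance having appeared anywhere before.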


consteval 632 days ago

Right, but typically LLMs are really poor at this. I can come up with some arbitrary systems of equations for it to solve and odds are it will be wrong. Maybe even very wrong.
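
For anyone who wants to test this rather than argue about it, here is a rough sketch of how one might produce arbitrary systems with a known answer. Everything in it is an assumption for illustration: small square linear systems with integer coefficients, and hypothetical helper names (`random_system`, `solve`, `check_answer`). It picks the solution first, derives the right-hand side from it, and uses exact fractions so a model's answer can be checked without rounding issues.

```python
import random
from fractions import Fraction


def solve(a, b):
    """Gauss-Jordan elimination with exact fractions; returns None if singular."""
    n = len(a)
    # Build the augmented matrix [A | b].
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None  # no unique solution
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [rv - factor * cv for rv, cv in zip(m[r], m[col])]
    return [row[n] for row in m]


def random_system(n, rng, lo=-9, hi=9):
    """Return (A, b, x) with integer entries where x is the unique solution of A x = b."""
    while True:
        x = [rng.randint(lo, hi) for _ in range(n)]
        a = [[rng.randint(lo, hi) for _ in range(n)] for _ in range(n)]
        b = [sum(c * v for c, v in zip(row, x)) for row in a]
        if solve(a, b) is not None:  # retry if A happened to be singular
            return a, b, x


def check_answer(a, b, candidate):
    """True if the candidate values satisfy every equation exactly."""
    return all(
        sum(c * v for c, v in zip(row, candidate)) == rhs
        for row, rhs in zip(a, b)
    )
```

Feeding a few hundred of these to a model and scoring the replies with `check_answer` would give an actual error rate instead of "odds are it will be wrong".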


int_19h 627 days ago

That is more indicative of the quality of their reasoning than their ability to reason in principle, though. And maybe even quality of their reasoning specifically in this domain - e.g. it's not a secret that most major models are notoriously bad at tasks involving things like counting letters, but we also know that if you specifically train a model to do that, it does in fact drastically improve its performance.

On the whole I think it shouldn't be surprising that even top-of-the-line LLMs today can't reason as well as a human - they aren't anywhere near as complex as our brains. But if it is a question of quality rather than a fundamental disability, then larger models and better NN designs should be able to gradually push the envelope.
