HN Mirror

by adam_arthur 1189 days ago

Approximating continuous functions is likely quite the same as what people do too. You think there isn’t some mathematical model under the hood of how the brain works too? That it doesn’t break down into functions with interpretable results? Is it spiritual or mystical in your mind?

These takes are so bad and pervasive on here, honestly. This is what I mean by grandiose thinking.

A machine that approximates functions but is otherwise indistinguishable from a human is effectively intelligent like a human. Incentives, wants, desires, and the ability to conduct our own training are the only differences at that point.

YeGoblynQueenne 1189 days ago

>> Is it spiritual or mystical in your mind?

No, they're just saying there are continuous functions, and then there are discrete functions, and neural nets can't approximate discrete functions, while humans certainly can (e.g. integer addition). And that even when it comes to approximating any continuous function, neural nets can do that in principle, but we don't know how to do it in practice, just like we know time travel, stable wormholes and the Alcubierre drive are feasible in principle, but we can't realise them in practice.

So please don't say it's "spiritual and mystical" in the other person's mind just because it's not very clear in yours.

Also, what the OP didn't say is that a Transformer architecture is not the kind of architecture used to show the universality of neural nets. That was shown for a multi-layer perceptron (MLP) with one hidden layer, not a deep neural net like a Transformer, and certainly not a network with attention heads. If you wanted to be all theoretical about it and claim that because there's that old proof, someone will eventually find out how to do it in practice, then the Transformer architecture has already taken a wrong turn and is moving away from the target.

There aren't any universality results for Transformers. I mean, that would be the day! The reason that proof was derived for an MLP with one hidden layer is that this makes the proof much, much easier than if you wanted to show the same for another architecture.
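To make the "in principle" half of that argument concrete, here is a minimal sketch of a one-hidden-layer network approximating a continuous function. It rests on assumptions that are not in the comment above: it uses ReLU units rather than the sigmoidal units of the classic proofs, the target is `sin` on [0, 2π], and the weights are constructed by hand rather than learned. The names `build_net` and `max_error` are hypothetical. It says nothing about whether gradient descent would find such weights in practice.

```python
import math

def relu(z):
    return z if z > 0.0 else 0.0

def build_net(f, a, b, n_hidden):
    """Hand-construct a one-hidden-layer ReLU net that interpolates f at
    n_hidden + 1 evenly spaced knots on [a, b]. No training is involved."""
    h = (b - a) / n_hidden
    knots = [a + i * h for i in range(n_hidden + 1)]
    values = [f(x) for x in knots]
    # Slope of the piecewise-linear interpolant on each segment.
    slopes = [(values[i + 1] - values[i]) / h for i in range(n_hidden)]
    # Output weight of hidden unit i is the change in slope at knot i.
    out_weights = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, n_hidden)]
    thresholds = knots[:-1]

    def net(x):
        hidden = [relu(x - t) for t in thresholds]  # the single hidden layer
        return values[0] + sum(w * u for w, u in zip(out_weights, hidden))

    return net

def max_error(f, net, a, b, samples=2000):
    """Largest absolute gap between f and net on an evenly spaced grid."""
    xs = [a + (b - a) * j / samples for j in range(samples + 1)]
    return max(abs(f(x) - net(x)) for x in xs)

a, b = 0.0, 2.0 * math.pi
errors = {n: max_error(math.sin, build_net(math.sin, a, b, n), a, b) for n in (8, 64)}
for n, err in errors.items():
    print(f"{n} hidden units: max error {err:.5f}")
```

The error shrinks as hidden units are added, which is the existence claim. The weights here come from a closed-form construction, so the sketch leaves the "in practice" question untouched.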

adam_arthur 1189 days ago

I can ask an LLM what 2+2 is and it can answer with 4. That's a discrete result. So how is this different from human thinking? Where is your evidence that this is not a similar mechanism?

It gets some math wrong because it doesn't understand the "systemic" aspect of math, but who's to say that with minor training tweaks, or a larger dataset, it wouldn't be able to infer the system? Humans infer systems from language all the time. To say you need some specialized form of training beyond language inference is obviously wrong when you view how humans train, learn and understand. All of life is ingestion of information via language which produces systemic understanding.

I can play digital audio that's indistinguishable from acoustic, despite it not being a smooth function in practice. Similarly, a sufficiently advanced neural net can produce intellect-like results, even if there are aspects of the structure you say may not make it so.

Honestly, the perception you and many others seem to hold is that if something is mathematically explainable in such a way that you can "trivialize" its operation, then it is not intelligence. But you hold "intelligence" in too high a regard.

YeGoblynQueenne 1189 days ago

>> I can ask an LLM what 2+2 is and it can answer with 4. That's a discrete result. So how is this different from human thinking? Where is your evidence that this is not a similar mechanism?

A language model can match "2+2" with "4" because it's approximating the distribution of token collocations in a large text corpus, not because it's approximating integer addition.

We know this because we know that language models are trained on token collocations (word embeddings) and not arithmetic functions. We know how language models are trained because we know how they're made, because they're made by humans and they're made following principles that are widely shared in academic textbooks and scholarly articles all over the place.
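The distinction being drawn can be shown with a deliberately crude toy, offered only as an illustration and not as a description of how any real language model works. The sketch below assumes a tiny made-up corpus and a "model" that does nothing but count which answer string follows which prompt string; `train`, `complete`, and `add` are hypothetical names. Real language models interpolate far beyond a lookup table, so this caricature only separates "matching co-occurrences" from "computing the function".

```python
from collections import Counter, defaultdict

# Made-up corpus; one line is deliberately wrong to show that frequency decides.
corpus = ["2+2=4", "2+2=4", "1+1=2", "2+3=5", "2+2=5"]

def train(lines):
    """Count how often each answer string follows each prompt string."""
    counts = defaultdict(Counter)
    for line in lines:
        prompt, answer = line.split("=")
        counts[prompt][answer] += 1
    return counts

def complete(counts, prompt):
    """Return the most frequent continuation, or None for an unseen prompt."""
    if prompt not in counts:
        return None
    return counts[prompt].most_common(1)[0][0]

def add(prompt):
    """Actual integer addition, for contrast."""
    left, right = prompt.split("+")
    return str(int(left) + int(right))

model = train(corpus)
print(complete(model, "2+2"), add("2+2"))
print(complete(model, "17+25"), add("17+25"))
```

The counting model answers "2+2" correctly because that pairing dominates its corpus, and it has nothing to say about a sum it never saw, while `add` handles both.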

>> Humans infer systems from language all the time.

Humans are not neural nets, and neural nets are not humans. Does that suffice? I don't know if I can do any better than that. Humans do human things, neural nets do neural net things, and humans can do things that neural nets can't even get close to. Like, dunno, inventing arithmetic? Or axiomatizing it? Or proving that its axiomatization is incomplete. That sort of stuff. Things for which there are no training examples, not of their instances, but of their entire concept class.

>> But you hold "intelligence" in too high a regard

Where does that stuff come from, I wonder? Of course I hold intelligence in high regard. What do you hold in high regard, stupidity?

tsimionescu 1188 days ago

> Approximating continuous functions is likely quite the same as what people do too.

In a very broad sense, if you just mean "the human brain also just approximates some class of functions", sure. However, human brains can surely represent many classes of non-continuous functions as well (tan, lots of piecewise functions, etc.). And, crucially, some of these are necessary for our physical models of the world. So, if neural networks are limited to only representing continuous functions, that is a strong indication that they are fundamentally unable to mimic the human mind.
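A small numerical sketch of what "limited to continuous functions" means, under assumptions chosen for illustration: the discontinuous target is a unit step (one example of a piecewise function), and the continuous approximator is a sigmoid with steepness `k`. Neither choice comes from the comment above. The worst-case error at the jump never improves however steep the sigmoid gets, although the average error does shrink, so how much this limitation matters depends on which notion of error is relevant.

```python
import math

def step(x):
    """Discontinuous target: jumps from 0 to 1 at x = 0."""
    return 1.0 if x >= 0.0 else 0.0

def sigmoid(z):
    """Numerically stable logistic function (avoids overflow for large |z|)."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Evenly spaced grid on [-1, 1] that includes the jump at x = 0.
xs = [j / 1000.0 for j in range(-1000, 1001)]

def errors(k):
    """Return (worst-case error, mean error) of sigmoid(k * x) against step(x)."""
    diffs = [abs(step(x) - sigmoid(k * x)) for x in xs]
    return max(diffs), sum(diffs) / len(diffs)

results = {k: errors(k) for k in (1.0, 10.0, 1000.0)}
for k, (worst, mean) in results.items():
    print(f"k={k:g}: worst-case error {worst:.3f}, mean error {mean:.4f}")
```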

> You think there isn’t some mathematical model under the hood of how the brain works too?

Of course there is. I do believe that the mind is simply a program running on the physical computer that is our brain. And I am sure that some day we will be able to create an AI that is human-like, and probably much better at it, running on silicon.

That doesn't mean that we should believe every program running on silicon, despite somewhat obvious fundamental limitations, is going to be the next AGI any day now. That's all I'm trying to point out: neural networks are not a great model for AGI, and backpropagation/gradient descent as a training algorithm even less so.
