by twotwotwo 15 days ago
We have a lot of synapses, but (agreeing with you) I don't find that sufficient to explain why humans (or animals!) do what we do. If you throw zillions of parameters at a problem with a weak architecture, you get really high-fidelity memorization, and we're not awesome at memorization compared to machines.

Humans can do an impressive amount of generalization from one error or surprise, and as is often rightly noted, don't need trillions of words to get going. And it all seems to happen in some 'forward-only' way, without backpropagation -- we don't have AdamW or MuonClip helpfully nudging our synaptic connections towards whatever would have scored well on our most recent test. It is relevant that we're creatures with goals -- reinforcement learning is the only stage where there's a taste of that for neural nets -- but the learning differences seem at least partly independent of that.

I suppose it could turn out that, even if not sufficient, the large number of synapses is necessary to all this, like we're effectively buying a lot of lottery tickets that give us a shot at fishing interesting hypotheses out of the experiences flowing by. But I'm still awfully suspicious that we don't have the right mathematical model for learning messy ideas all worked out yet.


There is actually a way to get really amazing sample efficiency out of a learning setup, and that's engineering in a load of appropriate inductive biases, which I am personally convinced evolution has done for us. That explains a big chunk of the "how are brains so sample efficient" problem really easily, but unfortunately without handing us an easy way to replicate it, which makes it unpopular. Also, it's something that we don't really want to do in the same way evolution has, as all those biases further reduce sample efficiency for all the things for which they are not appropriate.
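
To make the trade-off above concrete, here is a toy sketch (my own illustration, not anything from the thread or from a real benchmark). It assumes two hypothetical "worlds", `linear_world` and `step_world`, and compares a learner with a strong built-in bias (a least-squares line) against one with almost none (1-nearest-neighbour, which just memorises samples). All names and numbers are made up for illustration.

```python
import random

def fit_line(xs, ys):
    """Least-squares line: a strong inductive bias ("the world is linear")."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_nearest(xs, ys):
    """1-nearest-neighbour: almost no bias, it only memorises the samples."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mean_sq_error(model, target, grid):
    """Average squared error of the model against the true target on a grid."""
    return sum((model(x) - target(x)) ** 2 for x in grid) / len(grid)

def experiment(target, n_train, seed=0):
    """Train both learners on n_train noise-free samples from [0, 10]."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n_train)]
    ys = [target(x) for x in xs]
    grid = [i / 10 for i in range(101)]  # evaluation points 0.0 .. 10.0
    return (mean_sq_error(fit_line(xs, ys), target, grid),
            mean_sq_error(fit_nearest(xs, ys), target, grid))

# Hypothetical target functions, chosen only for illustration.
linear_world = lambda x: 2.0 * x + 1.0            # the line's bias is appropriate
step_world = lambda x: 1.0 if x > 5.0 else 0.0    # the line's bias is wrong

few_line, few_nn = experiment(linear_world, n_train=3)
many_line, many_nn = experiment(step_world, n_train=200)

print("linear world,   3 samples: line =", few_line, " 1-NN =", few_nn)
print("step world,   200 samples: line =", many_line, " 1-NN =", many_nn)
```

With an appropriate bias, three samples are enough for the line to nail the linear world while the memoriser is still far off. In the step world the same bias is a liability: no amount of data fixes the line, whereas the memoriser keeps improving as samples pile up.
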
In a nutshell, this is what statistical learning theory says. For any dataset there is an optimum given a prediction task. It follows from entropy. As the commenter pointed out, “evolution has this baked in”. There once was a research direction of evolutionary distribution estimation algorithms, but basically we know nothing about evolution, and scaling them to multidimensional data is much harder than optimising objectives and trying to squeeze the inductive bias. For what it’s worth, I think much of the current AI research is focused entirely on the wrong questions. Can machines learn? Sure, inductive bias FORCES them to learn. Given basically unlimited data, can computers pick appropriate inductive biases to do anything useful, “survive” if you want to call it that? Probably not; at least no one has really asked these questions for a couple of decades.