by feral 1444 days ago
> artificial neural networks aren't like actual neural networks or brains

Just to zoom right in on neural networks:

People often say this, and I never see a solid argument.

I know very little about biological neural networks.

Clearly they are very different in some respects, for example, meat vs silicon.

But I never see a good argument that there's no perspective from which the computational structure is similar.

Yes, the low-level structure and the optimization are different, but so what? You can run quicksort on a computer made of water and wood, or vacuum tubes, or transistors, and it's still quicksort.

Are we sure there aren't similarities in terms of how the various neural networks process information? I would be interested in an argument for this claim.

After all, artificial neural networks are achieving useful high-level functionality, like recognizing shapes.

2 comments

There are many ways one can argue for or against this comparison. This is mostly a matter of terminology. However, the problem is that the field of AI has for many decades consistently shaped its language to evoke human-like connotations in order to boost hype. This article's title is yet another example of that.
There are a few places where artificial neural networks conceptually diverge from biological ones for computational reasons.

One is the notion of time and connectivity loops - overwhelmingly, ANNs use a feed-forward architecture where the network is a directed graph without loops, some input is transformed to some output in a single pass, and the weights can be adjusted in a single reverse pass, which is very practical for training. We do know that biological brains have some behavior that relies on signals "looping through" the neurons, and that is fundamentally different from, for example, running some network iteratively (like generating text word-by-word via GPT-3). We have artificial neural network simulations that do things like this, and also simulations of "spike-train" networks (which can model other time-related aspects that glorified perceptrons can't), but we don't use them in practice: the computational overhead means that for most common ML tasks we get better performance from an architecture that's easy to compute and allows us to use a few orders of magnitude more parameters, as size matters more.
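
To make the feed-forward vs. looping distinction concrete, here is a rough sketch in plain Python. Everything in it is made up for illustration (the function names `dense`, `feed_forward` and `recurrent`, the tanh activation, the weight layout), and it is a toy, not a model of biological neurons or of any particular library. The point is only the shape of the computation: one version visits each layer once, the other feeds its own state back in at every time step.

```python
import math

def dense(weights, biases, inputs):
    """One fully connected layer with a tanh activation (arbitrary choice)."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def feed_forward(layers, inputs):
    """Single pass over an acyclic graph: each layer is visited exactly once."""
    activation = inputs
    for weights, biases in layers:
        activation = dense(weights, biases, activation)
    return activation

def recurrent(w_in, w_rec, biases, input_sequence):
    """Looping connectivity: the hidden state from one step feeds the next."""
    hidden = [0.0] * len(biases)
    for inputs in input_sequence:
        combined = [
            sum(wi * x for wi, x in zip(row_in, inputs))
            + sum(wr * h for wr, h in zip(row_rec, hidden))
            for row_in, row_rec in zip(w_in, w_rec)
        ]
        hidden = [math.tanh(c + b) for c, b in zip(combined, biases)]
    return hidden
```

In the feed-forward version the output depends only on the current input. In the recurrent one, the same input produces a different result depending on what came before it, because the earlier state is still circulating. That dependence on history is also what makes it more awkward to train with a single reverse pass.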