by cbarrick 3190 days ago
We say ANNs are "based on how the brain works" because the original mathematical model was an attempt by McCulloch and Pitts to explain how complex behavior arises from networks of simple neurons.

A neuron is either activated or not, and each of its many inputs can be either excitatory (encourages activation) or inhibitory (discourages activation). McCulloch and Pitts formalized this as a weighted sum of the inputs that is then thresholded to 0 or 1. They proved some basic theoretical results about this model that gave it credibility as an account of how intelligence can arise from neurons. Essentially, they said behavior can be described as a classifier.
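To make the thresholded unit concrete, here is a minimal sketch in Python. It is a simplification, not the original formalism: I'm assuming signed weights (positive for excitatory, negative for inhibitory), and the names `threshold_unit`, `logical_and`, `logical_or`, and `a_and_not_b` are just illustrative.

```python
def threshold_unit(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


# Positive weights play the role of excitatory inputs,
# negative weights the role of inhibitory ones (an assumption of this sketch).
def logical_and(a, b):
    # Both excitatory inputs must be active to reach the threshold.
    return threshold_unit([a, b], [1, 1], threshold=2)


def logical_or(a, b):
    # A single active excitatory input is enough.
    return threshold_unit([a, b], [1, 1], threshold=1)


def a_and_not_b(a, b):
    # The second input is inhibitory: it cancels the first one.
    return threshold_unit([a, b], [1, -1], threshold=1)
```

Each unit is a tiny binary classifier over its inputs, which is the sense in which the model treats behavior as the output of classifiers.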

AFAIK, they didn't go much into how the weights were actually learned. Different strategies were tried, but we ultimately started to soften the threshold function into the logistic function (to make the network differentiable) and solve for the weights by gradient descent.
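As a rough illustration of that last step, here is a sketch of a single logistic unit whose weights are found by gradient descent. The cross-entropy loss, the learning rate, the epoch count, and the function names are all my own assumptions for the example, not anything from the original papers.

```python
import math


def sigmoid(z):
    """Logistic function: a smooth, differentiable stand-in for the hard threshold."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Output of one logistic unit for input vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic_unit(samples, lr=0.5, epochs=2000):
    """Fit weights by per-sample gradient descent on the cross-entropy loss (assumed)."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            # For sigmoid + cross-entropy, dLoss/d(pre-activation) = output - target.
            error = predict(weights, bias, x) - target
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Learn logical OR from examples instead of hand-picking a threshold.
or_samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
or_weights, or_bias = train_logistic_unit(or_samples)
```

Because the logistic function has a gradient everywhere, the error signal can be pushed back to the weights, which a hard 0/1 threshold does not allow.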

Modern Deep Learning makes the additional assumption that neurons in the same layer are not interconnected. This assumption, along with the fact that we're just dealing with weighted sums, allows us to describe networks in matrix form, compute the gradients with backprop, and simulate them efficiently on the GPU. The assumption is more practical than biological.
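Here is a small sketch of what "matrix form" buys you, written in plain Python so it runs without dependencies. The names `layer_forward` and `network_forward` are hypothetical, and a real implementation would hand the same matrix-vector product to a numerical library or the GPU.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(weight_matrix, biases, inputs):
    """Apply one layer: each row of weight_matrix holds one neuron's input weights.

    Neurons in the layer only read from `inputs`, never from each other,
    so the whole layer is a single matrix-vector product plus an activation.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weight_matrix, biases)
    ]


def network_forward(layers, inputs):
    """Feed the input through a list of (weight_matrix, biases) layers in order."""
    activations = inputs
    for weight_matrix, biases in layers:
        activations = layer_forward(weight_matrix, biases, activations)
    return activations


# Two inputs -> three hidden neurons -> one output neuron (arbitrary example weights).
example_layers = [
    ([[1.0, -1.0], [0.5, 0.5], [0.0, 0.0]], [0.0, 0.0, 0.0]),
    ([[1.0, 1.0, 1.0]], [0.0]),
]
example_output = network_forward(example_layers, [1.0, 1.0])
```

If neurons within a layer fed into each other, a layer could no longer be evaluated as one independent product per row, which is the practical reason for the assumption.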

> show me how it relates to how [...] humans will behave?

[This page][1] attempts to connect the dots between the McCulloch and Pitts model, the resulting classifiers, and behavior. Essentially, the theory was that neurons can be formalized into classifiers, and behavior is just the output of these classifiers. I don't know too much about modern neuroscience, but given the amazing results we are seeing these days in vision, language, and planning, I'd say the central ideas of the theory are still credible.

[1]: http://www.mind.ilstu.edu/curriculum/modOverview.php?modGUI=...


First of all, neurons don't have just one activation function. Each dendrite has its own, so a neuron has anywhere from dozens to thousands. Second, that definition doesn't cover the entire issue of multiple feedback loops. Third, it doesn't cover memory effects at the structural (cytoskeleton) and local (vesicle) levels, much less at generic levels (RNA and your genes). And then we haven't even gotten into metabolomic and epigenetic wiring in your neurons ...

Calling those chained regressions similar to the brain is about as correct as saying that a 3-year-old's drawing of a car is similar to a real Tesla...

I mean, doi.

McCulloch and Pitts published in 1943. Of course we know more about the brain now.

If I were to ask you, "How does intelligence arise from a network of activations?", would you genuinely say that it has nothing to do with the McCulloch and Pitts theory?

I would honestly say we really have no clue, and maybe add that as far as we can tell, activations play as much of a role in intelligence as a myriad of other factors.

But more generally, I am just so tired of this "brain metaphor" for deep learning. It is a fun way to wake up your students (well, at least 10 years ago it was...), but trying to stretch the metaphor much further than that is just painful. Heck, even the activation "functions" (plural, as we now know) in a neuron aren't really a set (!) of (singular, independent) functions. The term is just a top-level name for a mind-boggling number of things happening as neurons "fire", with a mathematical formalism to approximate what's going on. In fact, calling an activation a "function" probably belittles the biological processes behind it.