Hacker News
by jakobson14 970 days ago
That's later backfill, a retroactive change to give a manufactured "biological" origin story. Whether they're real or not, researchers love a good "we took this from nature, isn't nature wonderful!" explanation.

The C in CNN isn't "Convolution" for no reason. It came from work with convolutional filters (yay Sobel kernels!), which at its height became filter banks and Gabor filters and so on, before neural networks pretty much killed off handcrafted feature development. Every explanation of how CNNs work still falls back to the original convolutional kernel intuition.
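
For anyone who hasn't met a Sobel kernel, here is a minimal sketch of that kernel intuition in plain Python. The names `filter2d`, `SOBEL_X`, and `IMAGE` are made up for illustration, and the toy image is an assumption rather than anything from a paper.

```python
# Hand-crafted vertical-edge detector (the Sobel x kernel).
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def filter2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products.

    Strictly speaking this is cross-correlation (the kernel is not flipped),
    which is what CNN layers compute in practice.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy image (hypothetical): dark on the left, bright on the right.
IMAGE = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

edges = filter2d(IMAGE, SOBEL_X)
print(edges)
```

The response is zero over the flat region and nonzero wherever the window straddles the edge. A convolutional layer runs the same sliding-window loop; the difference is that the kernel weights are learned instead of written by hand.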


> The C in CNN isn't "Convolution" for no reason.

The first N in CNN is "Neural" for a reason.

Can you explain that reason?

Decision trees are called 'trees' for, more or less, the same reason.

i.e., the diagrammed shape of a decision tree looks a little like the branches of a real one.

Likewise, in the 50s, when diagramming the earliest networks, they were aiming to imitate a similar real-world structure.

Better that they had called them 'Variable Activation Networks' or some such, and none of this superstition would have started.

> Better that they had called them 'Variable Activation Networks' or some such

But that's the thing: they didn't. Instead, they called them "neural networks". It wasn't random.

It feels like part of the field now wants to pretend it was never about how to make a machine think. "No, we're only doing abstract maths, only going on self-contained explorations of CS theory." Yeah, right. That feels like a reaction to the new wave of AI hype in business. Now that the rubes are talking about thinking machines again, better distance themselves from them, lest we be confused for those loonies.

Thing is, the field was always driven in big part by trying to catch up with nature. It took inspiration from neuroscience, much like neuroscience borrowed some language from CS, both for legitimate reasons. A brain is a computer. That's precisely where CS and neuroscience overlap: they're studying the same thing, just from opposite directions. It's just silly to play the "oh my field is better and your field doesn't know shit" game.

> Decision trees are called 'trees' for, more or less, the same reason.

Decision trees are named after the data structure, which is a way to express a mathematical object, which is older than CS and got that name from... who knows, but my money is on "genealogical tree", which itself is called a "tree" because people back then liked to tie everything to trees (symbol of growth) and flowers and cute animals (symbols of making babies).

The field inherited "trees" from the past. "Networks", too. But "neural" - that was a modern analogy the field itself is responsible for.

Yep! Trees, tree structures, tree diagrams have been regularly in use since the 1700s as a way of defining relationships. https://en.wikipedia.org/wiki/Tree_structure

There’s also a pretty strong link to the formal representation of language using syntax trees, which linguists and programming language developers were formalizing around the same time: https://en.wikipedia.org/wiki/Formal_language?wprov=sfti1

You can use that argument for anything you disagree with. Do you have a source or anything?

Have a read through the first paper describing a convolutional neural network, from 1998: http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf

There's absolutely no mention of biological inspiration whatsoever. At the same time, one can point to a long and rich history of convolutional filters being used in signal processing. And then there's the name, Convolutional Neural Network. The entire concept of a CNN is framed as a series of learned filters.

That is definitely not the first paper describing a CNN. That is not even the first paper by Le Cun describing CNNs (he was already on them as early as 1989[1]).

Regardless, Le Cun is not the first to describe CNNs, merely one of the first to use them for OCR (specifically for hand-written text).

The first neural network arch to use convolutions instead of matmuls was this[2], from the year of our lord 1988. This in turn is based on Fukushima's "neocognitron"[3] (1980), which is based on the visual cortex of felines (from work done by Hubel and Wiesel in the 50s/60s).

I guess it is not super surprising you might be confused – Le Cun seems a bit more reticent than average to cite the work he's building on top of, and when he does it is frequently in reference to his own prior work. So if that is where you're getting your picture of artificial neural network history, your skewed perception makes sense.

[1] https://ieeexplore.ieee.org/abstract/document/41400

[2] https://proceedings.neurips.cc/paper/1987/file/98f1370821019...

[3] https://www.cs.princeton.edu/courses/archive/spr08/cos598B/R...

Thanks, I was looking for something to do with early work and saccades. I didn't find that, but found this:

"The most influential of these early discussions was probably the 1943 paper of Warren McCulloch and Walter Pitts in which activity in neuronal* networks was identified with the operations of the propositional calculus. Actual simulations of recognition automata based on networks were carried out by Frank Rosenblatt before 1958 but the theoretical limitations of his "perceptrons" were soon pointed out by Marvin Minsky and Seymour Papert"

excerpt from a 1998 paper, "Real Brains and Artificial Intelligence" (https://www.jstor.org/stable/20025142)

"Walter Harry Pitts, Jr. (23 April 1923 – 14 May 1969) was an American logician who worked in the field of computational neuroscience.[1]"

'https://en.wikipedia.org/wiki/Walter_Pitts'

I don't know why I'm still responding to this thread 24 hours later, but just thought I'd add this tweet from Le Cun: "Neuroscience greatly influenced me (there is a direct line from Hubel & Wiesel to ConvNets) and Geoff Hinton. And the whole idea of neural nets and learning by adjusting synaptic weights clearly comes from neuroscience."

https://x.com/ylecun/status/1583872918634655744?s=20

Surely you are trolling me now. There is a very clear biological inspiration mentioned in this paper: they literally define a CNN as having “receptive fields” and then cite the same Hubel & Wiesel research mentioned before multiple times. LeCun mentions their research in papers even earlier, in the 80s, the decade in which they were awarded the Nobel prize for their research on the visual system. Of course there was also a lot of computational and mathematical research going on simultaneously, but to say that there is “no inspiration whatsoever” is pretty far from the truth.

From some time around mid-1995 until basically now, it has been out of fashion to explain the motivation for a new model as inspired by biology, as that was often handwaving with little understanding of the actual neuroscience. That is why people stopped writing it in papers. Just let the actual performance numbers speak for themselves. Either you get good performance, and then it doesn't really matter where the idea was inspired from, or it does not work well, and then it also does not matter. In machine learning, it mostly matters whether it works well or not.

That's funny. I had a book on "neural nets" in the 1980s, and it mentioned the analogy to brain neurons.