by dahart 1254 days ago
> My fascination with these sequences began in 1964 when I was a graduate student at Cornell University in Ithaca, NY, studying neural networks. I had encountered a sequence of numbers, 1,8,78,944,13800,..., and I badly needed a formula for the n-th term, in order to determine the rate of growth of the terms (this would indicate how long the activity in this very simple neural network would persist). I will say more about this sequence in Section 2.1.

It’s really fascinating to bump into mentions of NNs from the 60s & 70s. They seem to have been quite hot at the time. The paper on the Medial Axis Transform mentions neural networks too, in a way that makes it seem like it was the cool thing to do. By the time I was in college, NNs were very out of fashion.

Here’s the NN problem Neil was working on, and the first sequence in the database: https://oeis.org/A000435
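
For anyone who wants to play with the growth rate, here is a minimal sketch that assumes the closed form a(n) = (n-1)! * Sum_{k=0..n-2} n^k/k!. That formula reproduces the terms quoted above when indexing starts at n = 2. The function name `a000435` is just an illustrative choice, not anything from the paper.

```python
from math import factorial


def a000435(n: int) -> int:
    """Assumed closed form: (n-1)! * sum_{k=0}^{n-2} n^k / k!.

    (n-1)! / k! is an exact integer for every k <= n-2, so the whole
    computation stays in integer arithmetic.
    """
    return sum(factorial(n - 1) // factorial(k) * n**k for k in range(n - 1))


# Terms for n = 2..6, to compare against the sequence in the quote.
terms = [a000435(n) for n in range(2, 7)]
print(terms)
```

Printing the ratio of consecutive terms for larger n gives a quick feel for how fast the sequence grows.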

1 comments

Yeah, neural networks were actually invented in the 40s by Warren McCulloch and Walter Pitts in Chicago. They had a few isolated results until GPUs and distributed computation really kicked them into high gear, which prompted the change in terminology to “deep learning”. Now GPT-3 and other networks are heavily parameterized neural networks with millions to billions of parameters.

I was part of a research group that extensively trained small neural networks for image processing in 2001; the high-energy physics community had been using them for many years by that time.

Furthermore, I believe that the PalmPilot's handwriting-recognition engine also had a neural-network component.

Agreed that the usage has increased radically in the last twenty years, but even before the GPU-based revolution, it felt like neural networks were already broadly known and in use across the sciences and engineering. They were just slower :).

True, but scaling has its own problems. It was necessary to find better optimisers, activation functions, regularisers, weight sharing schemes, architectures and many other ingredients to make it work. And to prepare the large datasets, and invent the whole stack of frameworks, from CUDA to HuggingFace.

We have had 250,000 ML papers written since 2012. That's a lower bound on the number of distinct experiments necessary to find the winning tickets of today. Inventing the step-activated neuron formula was less than 1% of the way here.
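
For reference, the step-activated neuron is small enough to fit in a few lines. This is a rough sketch of a weighted threshold unit in the spirit of that early model, not a faithful reproduction of the original formulation. The name `step_neuron` and the weights and threshold below are illustrative assumptions.

```python
def step_neuron(inputs, weights, threshold):
    """Output 1 when the weighted input sum reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


# With unit weights and a threshold of 2 (assumed values), the unit
# behaves like a logical AND over two binary inputs.
and_gate = [step_neuron((a, b), (1, 1), 2) for a in (0, 1) for b in (0, 1)]
print(and_gate)
```

Everything else in the list above (optimisers, regularisers, architectures, datasets, frameworks) is what sits between this and a modern network.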