Hacker News
by periheli0n 1850 days ago
This is about achieving deep learning on neuromorphic hardware. Large research teams have been working on it for decades. Billions of dollars/euros/pounds must have been poured into it. Still, their devices and algorithms get blown out of the water by an off-the-shelf GPU plus TensorFlow, PyTorch, what have you.

Hats off to the authors for their achievement; this is no small feat and something that has been tried for years. But IMHO it's time the field moved on from running after matrix accelerators and focused on the real advantages of event-based computing: asynchronous, low-latency signal processing.


It's not quite correct to say this is only for achieving deep learning. Gradient-based parameter optimisation is still a useful tool, even for small shallow networks that would be ideal for event-based signal processing.

Even for small-network tasks, training spiking networks has been non-trivial. This paper provides a way to get exact gradients, which probably means faster optimisation than with surrogate gradients or other approximation methods for SNNs.
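
For readers who have not met surrogate gradients: a minimal sketch of the idea, not taken from the paper. The spike nonlinearity is a hard threshold whose derivative is zero almost everywhere, so the backward pass substitutes a smooth stand-in. The function names, the threshold of 1.0 and the steepness `beta` are all illustrative assumptions.

```python
def spike(v, threshold=1.0):
    """Forward pass: hard threshold; its true derivative is zero almost everywhere."""
    return 1.0 if v >= threshold else 0.0


def surrogate_grad(v, threshold=1.0, beta=10.0):
    """Backward-pass stand-in, shaped like the derivative of a fast sigmoid.

    beta (an assumed value) controls how sharply the pseudo-derivative
    is peaked around the threshold.
    """
    return 1.0 / (1.0 + beta * abs(v - threshold)) ** 2


# The pseudo-derivative is largest at threshold and decays on both sides.
at_threshold = surrogate_grad(1.0)
far_above = surrogate_grad(2.0)
```

The stand-in is an approximation by construction, which is why a method that yields exact gradients is interesting.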

You are totally right. The algorithm itself is a potential game-changer. I guess I was carried away by the pitch in the abstract that starts off with deep learning.

Personally I think that way too many resources were wasted on trying to make better deep networks with spikes. In my opinion it is much more promising to apply spiking networks to problems that are inherently event-based.

Having a functional backpropagation algorithm such as the one provided can help with that, obviously.

Based on reading just the abstract so far, it seems to me the event-based application of this algorithm makes absolute sense. Temporal importance can be effectively characterized in memristors. At the risk of making a comparison similar to Andrew Ng's a decade ago, I think this approach paired with something like a ReRAM crossbar is quite an effective rough analogue to the voltage potentials across a group of neurons in the brain.

I applaud this team's efforts. A real breakthrough.

interesting. where would one start reading about all this?
You could start with Intel’s Loihi press release: https://www.intel.com/content/www/us/en/research/neuromorphi...

There you get the full dose of hype for neuromorphic computing, but without any critical reflection (naturally, since it’s a press release advertising a product).

Unfortunately I am not aware of literature that provides critical review of neuromorphic computing. You have to read between the lines of the research papers to find out that the field has failed to live up to the promise of lower-energy deep learning (which was a misguided promise from the outset, IMHO).

Could you elaborate on why you think low energy deep learning was a misguided promise for SNNs? Just came across them for the first time last week and the low energy promise seemed like their most interesting aspect!
Deep learning is fundamentally linear algebra. Spiking networks are fundamentally event-based processors. The two concepts don’t play well together.

Many researchers have been trying hard to shoe-horn deep ANNs into spiking networks for the last 10 years. But this doesn’t change the fact that linear algebra is best accelerated by linear algebra accelerators (i.e. GPUs/TPUs).

Generally, spiking networks will likely have an edge when the signals they are processing are events in time, for example when processing signal streams from event-based sensors like silicon retinas. There’s also evidence that event-based controllers have advantages over their periodically sampling equivalents.

If you bring activation sparsity into the mix, the advantage of SNN processors over GPUs/TPUs becomes clearer. Loss-gradient-based optimisation approaches are great because they give you a tool to include e.g. sparsity regularisation in the loss. Encouraging sparse activity makes simple linear algebra a poor fit for network activation, and SNN processors a much better fit.
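
A minimal sketch of what sparsity regularisation in the loss could look like, assuming a simple penalty on the mean spike count per neuron. The function name, the penalty form and the weight `sparsity_weight` are illustrative assumptions, not something from the paper.

```python
def regularised_loss(task_loss, spike_counts, sparsity_weight=0.01):
    """Task loss plus a penalty proportional to the mean spike count per neuron.

    spike_counts: number of spikes each neuron emitted for one input.
    sparsity_weight: assumed trade-off between accuracy and activity.
    """
    mean_activity = sum(spike_counts) / len(spike_counts)
    return task_loss + sparsity_weight * mean_activity


# Same task loss, different activity levels: the quieter network scores better.
busy = regularised_loss(0.30, [12, 9, 15, 11])
quiet = regularised_loss(0.30, [1, 0, 2, 0])
```

With a gradient available for the spike counts, the optimiser can trade a little accuracy for far fewer events.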
But is sparse activation sufficient to motivate the use of SNNs? In my opinion one needs a temporal component as well.

Sparse activations that are sparse only in space, without also being sparse in time, can be implemented very well without events.

Granted, SNN processors can handle sparse activations better than matrix accelerators. But then again, SNN accelerators might carry lots of SNN overhead that is not required for sparse activations alone.

Edit: A good example of a non-spiking sparse activation accelerator is the NullHop architecture [1].

[1] https://ieeexplore.ieee.org/abstract/document/8421093

I agree with these points. However, the main advantage of the method presented in the paper is precisely that both the forward propagation and the backward propagation can be seen as being performed by a network operating on temporally sparse events. We absolutely had event-based sensors and control in mind as a motivation. The fact that you can write down the connectivity of the neurons in terms of a weight matrix does not mean that it can't be sparse. Since you are actually processing one spike at a time (potentially asynchronously), you don't need to implement any matrix multiplication. Current neuromorphic hardware either achieves at least some degree of sparsity in its synaptic crossbars (BrainScaleS-2, SpiNNaker) or largely eliminates them, like Loihi.
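
To illustrate the point about not needing a matrix multiplication, here is a rough sketch (not the paper's implementation, and ignoring neuron dynamics entirely). It compares a dense matrix-vector product with a per-spike update that only touches the outgoing synapses of the neuron that fired. All names and the toy weights are assumptions made up for the example.

```python
from collections import defaultdict


def dense_input(weights, spiked):
    """Matrix-vector product: touches every weight, whether or not the source spiked."""
    n_pre, n_post = len(weights), len(weights[0])
    return [sum(weights[pre][post] * spiked[pre] for pre in range(n_pre))
            for post in range(n_post)]


def to_adjacency(weights):
    """Keep only non-zero synapses, as a list of (post, weight) per source neuron."""
    synapses = defaultdict(list)
    for pre, row in enumerate(weights):
        for post, w in enumerate(row):
            if w != 0.0:
                synapses[pre].append((post, w))
    return synapses


def event_driven_input(synapses, spike_events, n_post):
    """Process one spike at a time, in time order, touching only the firing neuron's synapses."""
    current = [0.0] * n_post
    for _time, pre in sorted(spike_events):
        for post, w in synapses.get(pre, []):
            current[post] += w
    return current


# Toy network: 4 source neurons, 3 targets, mostly zero weights.
weights = [[0.5, 0.0, 0.0],
           [0.0, 0.25, 0.0],
           [0.0, -0.25, 1.0],
           [0.0, 0.0, 0.0]]
events = [(0.002, 2), (0.001, 0)]  # (spike time, source neuron)
dense = dense_input(weights, [1, 0, 1, 0])
sparse = event_driven_input(to_adjacency(weights), events, n_post=3)
```

Both give the same accumulated input here, but the event-driven version does work proportional to the number of spikes times the fan-out rather than to the size of the weight matrix.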
Ultra-low-power neuromorphic processors such as DynapSE [1] have been crossbar-free for several years now, making them a perfect fit for sparse networks (both weight and activity sparsity). [1] https://arxiv.org/abs/1708.04198
Yes, the algorithm you proposed is impressive and has the potential to become a game-changer.

However, I think the MNIST and Yin-Yang datasets, using latency coding, are not ideal examples to demonstrate its performance.

These datasets are useful to demonstrate nonlinear classification, and it's certainly great to see that the spiking network performs competitively. However, the transformation into a latency code costs time, in terms of computation, and also in terms of representation, before even one item is classified. Perceptron-based ANNs with continuous outputs don't require this step and will always have an edge over spiking networks in such scenarios.
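
To make the cost concrete, a minimal sketch of a latency code, assuming a simple linear mapping from normalised intensity to spike time. The paper's actual encoding may differ; `latency_encode` and `t_max` are names invented for this example.

```python
def latency_encode(intensities, t_max=1.0):
    """Map normalised intensities in [0, 1] to spike times: stronger input spikes earlier."""
    return [t_max * (1.0 - x) for x in intensities]


# The brightest input fires at t = 0, the darkest only at t = t_max.
spike_times = latency_encode([1.0, 0.5, 0.0], t_max=2.0)

# The full input is only available once the last spike has arrived.
encoding_window = max(spike_times)
```

Under this assumed mapping the network has to wait up to `t_max` just to see the whole input, on top of the per-sample conversion work, whereas a perceptron-based ANN consumes the intensities directly.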

I think what the field is really lacking is an ML problem that can leverage spiking networks directly, that does not require costly conversion of data into a representation that is suitable for spiking networks.

Does that include rain.ai?
I don’t know too much about their technology and the website isn’t giving away too much detail. It doesn’t look like they are using spiking networks, so no event-based neuromorphic tech, but perhaps good old linear algebra/ANN ML. They’re using analog computation which is attractive power-wise, but in the past has always suffered from variability due to device mismatch. Unless they have some really revolutionary process or algorithm that magically makes the downsides of mismatch disappear, they’ll have a hard time going beyond what has been tried in analog computing before (and which had its heyday in the 70s).
Looks like a different approach. Intel's chip is based on digital circuits. Rain is trying an analog approach.
Here is a paper from the same group which includes actual results of an algorithm running on the neuromorphic chip: https://arxiv.org/abs/1912.11443
The EU BrainScaleS project built a wafer that runs 10k times faster than real time:

https://electronicvisions.github.io/hbp-sp9-guidebook/pm/pm_...