| HN Mirror


by zopf 2927 days ago

I still think the defining moment for ML inference (and maybe even training!) on embedded devices will come when there are viable special-purpose, low-power ML chips.

As much as I hate to do this, I'm going to make a comparison to Bitcoin mining.

Mining is all about optimizing hashes/joule to get the best ROI. We watched it go from CPU -> GPU -> FPGA -> ASIC in the quest for efficiency.

In some ways, we're seeing the same thing in ML model training and inference: CPU -> GPU -> TPU. We're even seeing some special-purpose coprocessors deployed, as in the iPhone X. (https://www.wired.com/story/apples-neural-engine-infuses-the...)

But I think the final leap will come by going from digital execution to application-specific analog computing. If you don't need high precision, you can compute extremely quickly and efficiently using properly-configured analog circuits.
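To make the "you don't need high precision" point concrete, here is a rough sketch in plain Python. It is only a digital stand-in: it emulates reduced precision by rounding values to a fixed-point grid, which is an assumption made for illustration and not a model of any real analog circuit. The function names (`quantize`, `normalized_error`) are hypothetical.

```python
import random

def quantize(values, bits):
    """Round each value onto a signed fixed-point grid of the given bit width."""
    levels = 2 ** (bits - 1) - 1                      # e.g. 127 for 8 bits
    scale = (max(abs(v) for v in values) / levels) or 1.0  # avoid a zero scale
    return [round(v / scale) * scale for v in values]

def dot(a, b):
    """Plain dot product, the core operation of a neural-network layer."""
    return sum(x * y for x, y in zip(a, b))

def normalized_error(bits, n=256, seed=0):
    """Error of a low-precision dot product, relative to the sum of |w*x|."""
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1) for _ in range(n)]        # made-up weights
    x = [rng.uniform(0, 1) for _ in range(n)]         # made-up activations
    exact = dot(w, x)
    approx = dot(quantize(w, bits), quantize(x, bits))
    magnitude = sum(abs(a * b) for a, b in zip(w, x))
    return abs(approx - exact) / magnitude

if __name__ == "__main__":
    for bits in (4, 8, 16):
        print(bits, "bits -> normalized error", normalized_error(bits))
```

The rounding errors on individual terms are small and partly cancel in the sum, which is the intuition behind tolerating imprecise arithmetic for inference.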

IBM is working on this kind of system with their TrueNorth line (https://techcrunch.com/2017/06/23/truenorth/)

It hasn't been proven yet, but I think there is huge potential.

2 comments

alfalfasprout 2927 days ago

I remain unconvinced we'll see ASICs dominating inference. Part of the problem is that even if we're just talking about neural networks, there's a variety of architectures, activation functions, etc. to consider. At this stage, from my own benchmarking, Nvidia's V100 card is close enough to the TPU while allowing much more flexibility in the software stack used.

For inference, GPUs are also pretty damn efficient since it's an embarrassingly parallel task w/ minimal synchronization (no gradient updates needed). In this case, FPGAs are a far better choice since you can push updates to accommodate new network architectures, activation functions, etc. The TPU instead relies on a matrix-multiplier unit which supports more use cases but won't be as performant on something like an RNN.


cbHXBY1D 2927 days ago

I think Microsoft's experience with FPGAs for inference would agree with you.

Currently, they are only allowing external customers to use ResNet-50 with their FPGA-enabled Azure ML.


p1esk 2925 days ago

TrueNorth is 100% digital.


zopf 2919 days ago

After some investigation, it turns out you are correct! Knowing that some of TrueNorth's creators previously worked on mixed-mode systems, I made the assumption that this one was too.

It seems TrueNorth is indeed fully digital, but takes advantage of an event-driven architecture and peer-to-peer communication between many tiny cores to keep things low-power.

( http://paulmerolla.com/merolla_main_som.pdf for some details )
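As a loose illustration of why event-driven processing can save work, here is a toy Python sketch. It is not TrueNorth's actual design: the function names and the tiny weight matrix are made up, and "ops" is just a count of synapse updates used as a crude proxy for energy.

```python
def clock_driven_update(weights, inputs):
    """Touch every synapse on every tick, whether or not the input fired."""
    ops = 0
    out = [0] * len(weights[0])
    for i, x in enumerate(inputs):
        for j, w in enumerate(weights[i]):
            out[j] += w * x
            ops += 1
    return out, ops

def event_driven_update(weights, spike_indices):
    """Only do work for inputs that actually spiked."""
    ops = 0
    out = [0] * len(weights[0])
    for i in spike_indices:
        for j, w in enumerate(weights[i]):
            out[j] += w          # a spike is a binary event, so just add the weight
            ops += 1
    return out, ops

if __name__ == "__main__":
    # Hypothetical 4-input, 3-output layer with integer weights.
    weights = [[1, -2, 3], [0, 4, -1], [2, 2, 2], [-3, 1, 0]]
    inputs = [0, 1, 0, 0]                      # only input 1 fires this tick
    spikes = [i for i, x in enumerate(inputs) if x]
    print(clock_driven_update(weights, inputs))
    print(event_driven_update(weights, spikes))
```

Both functions produce the same output vector; the event-driven one simply skips the rows whose input did not fire, so its cost scales with activity rather than with layer size.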

Thank you for the correction!
