by Nevermark 1656 days ago
I didn't say anything about the singularity.

My entire career has been in AI. So you are right, machine learning today is dominated by gradient-based and line-based searches.

And I understand what you are saying about distributed computing's inherent limitations.

But, it's not all about increasing the amount of computing (although that will continue to be a big factor for many years). Better organized computation is continually producing better results with less computing too.

Keep in mind that our brains take about the same effort to learn as to operate. Machine learning models operate with incredible efficiency, using a minuscule amount of computing compared to what training them takes. Models trained on massive cloud resources can be run on embedded processors, phones or smart watches.

Improvements to gradient/line searches accrue across virtually all of today's machine learning, so will continue to be researched and improved.

In the past, "simple" things like convolution, the right way to stage layers, etc., have dramatically improved the results and reduced model complexity in ways our neural circuits are unable to match. (Convolution reuses weight values across many virtual neurons. In our brain all those neurons must be real and independently learn to behave similarly.)
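To make the weight-reuse point concrete, here is a minimal sketch in plain Python. The function names (`conv1d`, `param_counts`) and the toy numbers are made up for illustration. It compares the number of weights a shared kernel needs against what the same layer would need if every output unit had to hold its own copy.

```python
def conv1d(signal, kernel):
    """Valid 1D cross-correlation: the same kernel weights are reused at every position."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def param_counts(input_len, kernel_len):
    """Weights needed with sharing vs. one independent copy per output unit."""
    outputs = input_len - kernel_len + 1
    shared = kernel_len               # one kernel serves every output position
    unshared = outputs * kernel_len   # each output unit learns its own local weights
    return shared, unshared


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 0.0, -1.0]             # hypothetical learned kernel

result = conv1d(signal, kernel)
shared, unshared = param_counts(len(signal), len(kernel))
print(result, shared, unshared)
```

The gap grows with input size: the shared count stays at the kernel length, while the unshared count scales with the number of output positions.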

These days novel ways of multi-target training have not just expanded the types of problems machine learning is good at, but also reduced model sizes in ways our brain's networks are unlikely to be able to do. There is no limit to how many performance derivatives, from different trained outputs, can go through a machine "neuron" either changing that neuron's weights or going on to change other weights.
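A rough sketch of what "derivatives from different trained outputs going through one neuron" looks like, assuming a toy setup that is not from any particular model: a single shared weight `w` feeds two output heads (`a` and `b`), each with its own squared-error target. The gradient on the shared weight is simply the sum of the gradients from each target.

```python
def forward(w, a, b, x):
    h = w * x              # shared "neuron" used by both outputs
    return a * h, b * h    # two heads, each trained toward its own target


def total_loss(w, a, b, x, t1, t2):
    y1, y2 = forward(w, a, b, x)
    return (y1 - t1) ** 2 + (y2 - t2) ** 2


def shared_weight_grad(w, a, b, x, t1, t2):
    """Chain rule by hand: each target contributes its own derivative to w."""
    y1, y2 = forward(w, a, b, x)
    g1 = 2 * (y1 - t1) * a * x   # derivative arriving from target 1
    g2 = 2 * (y2 - t2) * b * x   # derivative arriving from target 2
    return g1, g2, g1 + g2       # contributions add up on the shared weight


# Hypothetical toy values
w, a, b, x, t1, t2 = 0.5, 2.0, -1.0, 3.0, 1.0, 0.0
g1, g2, g = shared_weight_grad(w, a, b, x, t1, t2)

# Finite-difference check on the combined loss
eps = 1e-6
numeric = (total_loss(w + eps, a, b, x, t1, t2)
           - total_loss(w - eps, a, b, x, t1, t2)) / (2 * eps)
print(g1, g2, g, numeric)
```

Adding a third or tenth target only adds more terms to the sum, which is the sense in which there is no built-in limit.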

Generative Adversarial Networks use multiple target training. There is no end in sight yet on the kinds of things that having multiple performance targets operating in different subsets of weights can do. It is a massive booster for many problems that would be difficult or unattainable otherwise today.

Model reuse will be a massive saving in training time. Standard blocks can be trained through to other blocks. They don't make sense as long as every major retraining effort produces a better model, but at some point a lot of models or parts of models will be pretrained.
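As a sketch of the simplest form of reuse, assume a hypothetical pretrained block whose weights (`FROZEN_W`, `FROZEN_B`) are kept fixed, with only a small new head `v` trained on top of it. This is a simpler variant than training through a block to reach other blocks, and all names and numbers are invented for illustration.

```python
FROZEN_W, FROZEN_B = 2.0, 1.0   # hypothetical weights from an earlier training run


def pretrained_block(x):
    """Reused block: evaluated as-is, never updated."""
    return FROZEN_W * x + FROZEN_B


def train_head(xs, targets, lr=0.01, steps=100):
    """Gradient descent on a single head weight v; the block stays frozen."""
    v = 0.0
    feats = [pretrained_block(x) for x in xs]   # computed once, since the block is fixed
    for _ in range(steps):
        # Mean squared error gradient with respect to v only
        grad = sum(2 * (v * f - t) * f for f, t in zip(feats, targets)) / len(feats)
        v -= lr * grad
    return v


xs = [0.0, 1.0, 2.0]
targets = [3.0 * pretrained_block(x) for x in xs]   # toy task: the ideal head weight is 3
v = train_head(xs, targets)
print(v)
```

Only one weight is trained here, and the block's features are computed a single time, which is where the training-time saving comes from.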

Finally, the value of improvements to machine learning is now colossal. So the resources put into improving it are colossal. A trained system can be used throughout a company, or sold as a product to any number of customers. A trained human ... not so much. So where a machine can match a human, the machine version is far more valuable.

Well, we will see.