by cr0sh 2609 days ago

Part 2:

So lastly, AI. Something to know off hand is that AI encompasses many things historically, but if we are talking about today, then AI is basically focused on two areas:

* Machine Learning

* Neural Networks (Deep Learning)

Machine Learning can be thought of as "applied statistics" - a gross simplification, but a somewhat accurate one. Statistics has various known methods and algorithms that can be "trained" on a set of data; once that training is complete, these algorithms can decide on an output for completely new data fed into them. Most of these methods can only easily be fed a few data points, and they produce one of two kinds of output. The first is a "continuous function" (if you will): a single value, usually a floating point number, that represents some needed output (ie, if the method were trained on two data points, current temperature and current humidity, it might output a value representing the speed of a fan). This is usually called regression. The second is one of 2 or more (but usually only a few) "categories" indicating that the input means a particular thing (taking our earlier example, maybe instead of fan speed, it would output whether to turn a fan on or off, or whether to open or close a window). This is usually called classification.
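To make the fan example a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption on my part for illustration: the four training rows are made up, the model is a plain linear fit trained by gradient descent (one of many possible "classical" methods), and the names `fan_speed` and `fan_on` are hypothetical.

```python
# Hypothetical training data: (temperature in deg C, relative humidity in %) -> fan speed in 0..1.
TRAINING_DATA = [
    ((20.0, 30.0), 0.1),
    ((25.0, 50.0), 0.4),
    ((30.0, 60.0), 0.7),
    ((35.0, 80.0), 1.0),
]

def scale(temperature, humidity):
    """Squash both inputs into roughly 0..1 so gradient descent stays stable."""
    return (temperature / 40.0, humidity / 100.0)

def predict(params, features):
    """Linear model: w_temp * temp + w_hum * hum + bias."""
    w_temp, w_hum, bias = params
    return w_temp * features[0] + w_hum * features[1] + bias

def mean_squared_error(params, data):
    errors = [predict(params, scale(*inputs)) - target for inputs, target in data]
    return sum(e * e for e in errors) / len(errors)

def train(data, learning_rate=0.1, epochs=5000):
    """Fit the three parameters with plain batch gradient descent."""
    params = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grads = [0.0, 0.0, 0.0]
        for inputs, target in data:
            x = scale(*inputs)
            error = predict(params, x) - target
            grads[0] += error * x[0]
            grads[1] += error * x[1]
            grads[2] += error
        params = [p - learning_rate * g / len(data) for p, g in zip(params, grads)]
    return params

params = train(TRAINING_DATA)

def fan_speed(temperature, humidity):
    """Regression: a continuous output."""
    return predict(params, scale(temperature, humidity))

def fan_on(temperature, humidity):
    """Classification: the same model squeezed into two categories (on/off)."""
    return fan_speed(temperature, humidity) > 0.5
```

Calling `fan_speed(28.0, 55.0)` on a reading the model has never seen gives back a number, while `fan_on(28.0, 55.0)` gives back one of two categories. The 0.5 cut-off is arbitrary; a real classifier would normally be trained on on/off labels directly.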

Neural networks, on the other hand, work closer to how "biological thinking" (well, more like the neurons and the networks they form) actually works. Basically, you have a bunch of different nodes, arranged in layers; think of a simple three-layer network...

The first layer would be considered the "input layer" - it could consist of a few nodes, to hundreds of thousands or more, depending on what is being input into the network. The middle layer can be much, much bigger than the input layer (though it doesn't have to be). It is also termed "hidden", in that it isn't directly interacted with. Each individual "input node" in the first layer is connected to every individual neuron in the hidden layer. So, input node 0 is connected to hidden layer neurons 0...n, input node 1 is connected to hidden layer neurons 0...n, and so on for every input node i.
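The "every input node connects to every hidden neuron" wiring can be sketched as a small table of numbers, one per connection. This is only an illustration under my own assumptions: the random starting weights, the per-neuron bias, and the sigmoid "squashing" function are common choices that I haven't described above, and `make_layer` / `forward` are hypothetical names.

```python
import math
import random

def make_layer(n_inputs, n_neurons, rng):
    """One weight per (neuron, input) pair, plus one bias per neuron."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases

def sigmoid(x):
    """Squash any number into the range 0..1 (an assumed activation function)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(layer, inputs):
    """Each neuron sums ALL of the inputs, each scaled by its own weight."""
    weights, biases = layer
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

rng = random.Random(0)
hidden_layer = make_layer(n_inputs=3, n_neurons=5, rng=rng)
activations = forward(hidden_layer, [0.2, 0.7, 0.1])

# 3 input nodes x 5 hidden neurons = 15 connections in the "web".
connection_count = sum(len(row) for row in hidden_layer[0])
```

Each row of `weights` belongs to one hidden neuron and holds one number for every input node, which is exactly the tangled-but-organized web between the two layers.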

As you can see, it's a very tangled, but organized web that forms between those first two layers.

The third layer is similar, except it is formed of a scant few neurons; it's known as the output layer (a brief note here - it is possible to have more than one hidden layer of neurons. From what I understand, in theory one sufficiently large hidden layer can approximate the same functions as multiple layers, but in practice stacking several layers tends to get there with far fewer neurons - that's just my understanding).

The output layer could have only a single neuron; its value could again be that "continuous function" I mentioned earlier. Or it could be multiple neurons, each representing a "classification" of some sort; thus, the layer could have anything from one neuron to potentially hundreds, depending on what you are training the neural network for.

Let's say you're training a network to take a simplified image of a road, and transform that into an output to be fed into the steering system of a vehicle. Your input node layer might consist of say 10000 nodes (for a 100 x 100 b/w pixel image). Your middle layer might consist of 10x that number of neurons, and your output layer could consist of a single neuron (outputting a continuous function representing the steering wheel angle) or a set of discrete neurons representing a class of various steering wheel positions (hard left, soft left, left, straight ahead, right, soft right, hard right).
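To get a feel for the sizes involved, here is a rough count of the numbers such a network would hold, plus how a set of output neurons would be read as a steering class. This is a sketch under assumptions: I'm counting one weight per connection and one extra "bias" number per neuron, and `parameter_count` and `pick_class` are hypothetical helper names.

```python
def parameter_count(layer_sizes):
    """Weights (one per connection) plus one bias per neuron, for fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# 100 x 100 pixel input, a hidden layer 10x that size, one steering-angle output.
single_output = parameter_count([10_000, 100_000, 1])   # 1,000,200,001

# Same network, but with seven output neurons, one per steering class.
seven_classes = parameter_count([10_000, 100_000, 7])   # 1,000,800,007

STEERING_CLASSES = ["hard left", "soft left", "left", "straight ahead",
                    "right", "soft right", "hard right"]

def pick_class(output_values):
    """The class whose output neuron fired the strongest wins."""
    best = max(range(len(output_values)), key=lambda i: output_values[i])
    return STEERING_CLASSES[best]

decision = pick_class([0.1, 0.0, 0.2, 0.9, 0.3, 0.1, 0.0])  # "straight ahead"
```

Almost all of the roughly one billion numbers sit between the input and hidden layers (10,000 x 100,000 connections); the output layer barely registers.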

You'd train this network on a variety of images, and it would output (hopefully) the correct answer for driving the vehicle around based on images and what was done in response. Essentially, the images it is given to train on would consist of individual black and white images of a roadway, and what should happen at that point (keep going straight, or turn in some manner). If, during training, it does the wrong thing, an error amount is calculated, and that is used to update the numbers (the "weights") within the neurons that make up each layer (output and hidden - there are no neurons in the first layer), to make their calculation the next time more accurate. This process is called "back-propagation".
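Here is a minimal sketch of that error-then-update loop on a toy network, far smaller than the steering example. The assumptions are mine: a sigmoid on the hidden neurons, a plain weighted sum on the single output neuron, a squared-error measure, a fixed learning rate, and a made-up 2 x 2 "image" with a made-up target angle. `TinyNetwork` is a hypothetical name, not any library's API.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyNetwork:
    """Inputs -> sigmoid hidden layer -> one linear output neuron."""

    def __init__(self, n_inputs, n_hidden, rng):
        self.w_hidden = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
                         for _ in range(n_hidden)]
        self.b_hidden = [0.0] * n_hidden
        self.w_out = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
        self.b_out = 0.0

    def forward(self, inputs):
        hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
                  for row, b in zip(self.w_hidden, self.b_hidden)]
        output = sum(w * h for w, h in zip(self.w_out, hidden)) + self.b_out
        return hidden, output

    def train_step(self, inputs, target, learning_rate=0.1):
        hidden, output = self.forward(inputs)
        error = output - target  # how wrong the output neuron was

        # Push the error backwards: each hidden neuron's share of the blame is the
        # error times its outgoing weight times the sigmoid slope h * (1 - h).
        hidden_deltas = [error * w * h * (1.0 - h)
                         for w, h in zip(self.w_out, hidden)]

        # Nudge the output neuron's weights against the error.
        self.w_out = [w - learning_rate * error * h
                      for w, h in zip(self.w_out, hidden)]
        self.b_out -= learning_rate * error

        # Nudge each hidden neuron's weights against its share of the error.
        for j, delta in enumerate(hidden_deltas):
            self.w_hidden[j] = [w - learning_rate * delta * x
                                for w, x in zip(self.w_hidden[j], inputs)]
            self.b_hidden[j] -= learning_rate * delta
        return error

rng = random.Random(0)
net = TinyNetwork(n_inputs=4, n_hidden=3, rng=rng)
pixels = [0.0, 1.0, 1.0, 0.0]   # hypothetical 2 x 2 black and white "image"
target_angle = 0.25             # hypothetical correct steering value
errors = [abs(net.train_step(pixels, target_angle)) for _ in range(200)]
```

The list `errors` shrinks as the same example is shown over and over. A real training run would cycle through many different images rather than one, so that the network learns the general pattern and not just a single answer.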

Interestingly, at a very base level, the computations being performed by a neural network aren't much different from those of "classical machine learning" algorithms. But because their inner workings are extremely complex (and, for the most part, fairly opaque to study), they allow for things which the classical algorithms couldn't touch, namely the ability to feed in very large numbers of inputs, and get back out very large numbers of outputs.

Given enough "labeled" training examples, this process works extremely well. But to be useful for a variety of tasks, such networks need to be composed of hundreds of thousands to millions of neurons, each connected to the layers above and below, and it takes an immense amount of computational hardware (in the form of GPUs, usually) and power to do so. It also needs a boatload of training examples. All of these needs are why neural networks, despite being played with in various ways for well over 60 years, didn't really take off with promising and useful results until very recently, when both the amount of data to be had and the computational ability to process that vast amount of data became available (again, GPUs). Basically a perfect storm. This is all known as "deep learning".

Now, I'm going to leave you with something to ponder about:

We are doing something wrong. For one thing, as far as anyone has been able to determine, the idea and workings of "back-propagation" have no biological analog. Back-propagation is something that only happens in the realm of these artificial neural networks, and does not occur (as far as we know) in a natural neural network.

It is also very, very computationally intense - ie, it sucks a lot of power. We haven't even scratched the surface of building a very large scale artificial neural network, and already what we currently have takes a ton of power, well beyond the very meager power consumption of a single human brain.

It should be noted, if it wasn't apparent earlier, that the model of the neurons used in an ANN (artificial neural network) is grossly simplified compared to the actual workings of real neurons. There are studies and systems out there that seek to implement and study ANNs using models which more closely represent how actual neurons work (ie - spike trains, things that happen at the synapse level, etc) - but they take even greater amounts of power to run, and can't be used for much more than research.

You can take this and still be "skeptical of AI", but that would also dismiss the vast strides we have made in the last decade or two in the AI/ML field. I also hope I've been able to show, or at least allude to, where and how there is a correlation between AI (well, the ANN part of it) and "biological thinking".

I hope this helps in some manner to explain things and to lessen the confusion as to what everything is and what it means...