| HN Mirror


by eggy 3725 days ago

I started reading about ANNs in the 1980s, and had similar confusion to those here, since it was just for fun. I suggest reading a basic book or online material that goes over the basics [1]. I struggled through $200 textbooks, and jumped from one to the other as an autodidact. I am now studying TWEANNs (Topology and Weight Evolving Artificial Neural Networks), which are basically what you see here, except that they can change not only their weights but also their topology, that is, how many neurons and layers there are and where they sit. ANNs (Artificial Neural Networks, as opposed to biological ones) can be a lot of fun, and are very relevant to machine learning and big data nowadays. It was exploratory for me. I used them for generative art and music programs. Be careful: soon you'll be reading about genetic algorithms, genetic programming [2], and artificial life ;) Genetic programming can be used to evolve neural networks as well as to generate computer programs that solve a problem in a specified domain. Hint: You'll probably want to use Lisp/Scheme for genetic programming!

  [1] http://natureofcode.com/book/chapter-10-neural-networks/
  [2] http://www.genetic-programming.com
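
To make the "weights and topology" distinction concrete, here is a minimal sketch of the two kinds of mutation a TWEANN might apply. Everything in it is an assumption for illustration: the dict-based genome layout, the function names, and the choice to add a neuron by splitting an existing connection are hypothetical, not taken from any particular TWEANN system.

```python
import random

def make_genome(n_inputs, n_outputs, rng):
    """Toy genome (hypothetical layout): node ids plus a (src, dst) -> weight map."""
    inputs = list(range(n_inputs))
    outputs = list(range(n_inputs, n_inputs + n_outputs))
    conns = {(i, o): rng.uniform(-1.0, 1.0) for i in inputs for o in outputs}
    return {"inputs": inputs, "outputs": outputs, "hidden": [], "conns": conns}

def mutate_weights(genome, rng, sigma=0.1):
    """Weight mutation: nudge every connection weight with Gaussian noise."""
    for key in genome["conns"]:
        genome["conns"][key] += rng.gauss(0.0, sigma)

def mutate_add_node(genome, rng):
    """Topology mutation: replace one connection with src -> new node -> dst."""
    src, dst = rng.choice(sorted(genome["conns"]))
    old_weight = genome["conns"].pop((src, dst))
    new_id = len(genome["inputs"]) + len(genome["outputs"]) + len(genome["hidden"])
    genome["hidden"].append(new_id)
    genome["conns"][(src, new_id)] = 1.0          # assumed initial weight
    genome["conns"][(new_id, dst)] = old_weight   # keep the old weight on the way out
    return new_id
```

An evolutionary loop would apply mutations like these to a population of genomes, score each resulting network on the task, and keep the better ones. The scoring and selection parts are left out here.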

4 comments

argonaut 3725 days ago

As far as the recent deep learning boom is concerned, genetic programming is really out of favor. I don't really see it in any of the deep learning (or even machine learning, for that matter) literature/successes/research groups.

"Neural networks" are a really really overloaded term. A ton of stuff referred to as "neural networks" has little to do with the "neural networks" that are used in the machine learning community.

eggy 3724 days ago

You're spot on about genetic programming. I am a self-taught person who plays with anything that strikes my fancy; I learn by playing. I read all three volumes of the Artificial Life series from the Santa Fe Institute at the time (now there are more), and went in many directions in the 1990s: Fuzzy Logic, Expert Systems, ANNs, Evolutionary Computation (GA (Genetic Algorithms) and GP (Genetic Programming)), and AL (Artificial Life), all fascinating. I found, and still find, genetic programming attractive even if it has not found its niche in the ML community. I think the CI (Computational Intelligence) community at large will eventually develop well-fitted uses for it. I was trying to use an FPGA and Koza's modified GP code to have the FPGA re-program itself as GP evolved a better program than the one I originally wrote to kickstart it. I didn't get too far. This was 1996-97 though. I was pretty much on my own then, without much of an Internet to find information, especially esoteric information, or cheap many-gated FPGAs. Outside of ML, GP has found moderate success. One example is this paper (sorry, behind a paywall, so only the paper title here), which started with expert data, tried ANNs, then ANNs and statistics, and finally used a GP approach:

"A Computational Intelligence-Based Genetic Programming Approach for the Simulation of Soil Water Retention Curves"

I also use the term ANNs over just NNs to keep it to the silicon, and not wetware ;) Although, they did hook up a small ANN to a cockroach once, IIRC...

extrapickles 3725 days ago

It has its niche applications. The only non machine vision application that comes to mind is one[1] that takes a pile of data, and evolves a model that fits it.

Generally, where it's actually being used, they are a bit quiet on how they go about getting the results they do. While the genetic bit is easy, the secret sauce is in guiding the learning/evolution so that it works for the particular problem domain.

[1]: http://www.nutonian.com/products/eureqa/

argonaut 3725 days ago

Yes, but all of the algorithmic advances in academia, and most of the advances at Google/Facebook, have been out in the open.

eggy 3724 days ago

Yes, it is a shame people don't share their advances in science and technology, usually for fear of losing market share. Sharing grows the market, and then there's more pie for everyone, and more work gets done to advance the field. Still, neither that point nor how successful GP is in the ML community measures its current or future potential. The book I am working my way through now, in LFE (Lisp Flavored Erlang vs. Erlang, or Elixir), is "The Handbook of Neuroevolution Through Erlang" by Gene Sher [1].

Gene covers a lot of ground. Somebody has done some transliteration to Elixir too; I use LFE, since staying with Lisp bridges the gap between my GP work and what Gene has done with Erlang and ANNs and EC. For GP, you really want to be able to create new forms with macros; at least, that is more in line with GP. To quote an excerpt from Robert Virding, co-designer of Erlang and creator of LFE, addressing Elixir's macros or messing with Erlang's modules vs. LFE's or Lisp's macros on HN before:

  "There is syntactic support for making the function calls look less like function calls but the macros you define are basically function calls.

In Lisp you are free to create completely new syntactic forms. Whether this is a feature of the homoiconicity of Lisp or of Lisp itself is another question as the Lisp syntax is very simple and everything basically has the same structure anyway. Some people say Lisp has no syntax." [2]

  [1] http://www.erlang-factory.com/upload/presentations/536/ErlangConferencePresentation_2012.pdf

  [2] https://news.ycombinator.com/item?id=7623991

ylem 3725 days ago

Just curious then, how are people optimizing network topology?

argonaut 3725 days ago

GSD, also known in the literature as "Graduate Student Descent."

I'm not even joking. Trial and error. Having good "intuition" about past ideas and the basic building blocks to guide that trial and error. Reading research papers, seeing what other people did well with, and using that.

As an aside, this is the principal reason I am skeptical of grandiose claims about deep learning.

samscully 3724 days ago

Regularisation methods like dropout are often good enough that you can build a network with too many parameters (for the amount of data you have) and rely upon the regularisation to find the subset of that network that is actually useful. People have recently got good results from also randomly dropping weights, or even whole layers.
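
For readers who have not seen dropout written out, here is a minimal sketch in plain Python. It uses the "inverted dropout" variant (survivors are rescaled during training so nothing changes at test time), which is an assumption on my part rather than something the comment specifies; the function name and signature are hypothetical.

```python
import random

def dropout(activations, p_drop, rng, training=True):
    """Inverted dropout on a list of activations.

    During training each unit is zeroed with probability p_drop and the
    survivors are scaled by 1 / (1 - p_drop), so the expected activation
    is unchanged. At test time the input is passed through untouched.
    """
    if not training or p_drop == 0.0:
        return list(activations)
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

Because a different random subset of units is silenced on every training step, the oversized network is effectively trained as many overlapping smaller networks, which is the regularising effect described above.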

haddr 3724 days ago

Probably also through some grid search. I've read (but don't remember where) that random search gives very good results, even better than grid search (in less time).
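
A rough sketch of why random search can compete with grid search on the same budget. The objective below is made up for illustration (a hypothetical score that depends strongly on one hyperparameter and weakly on the other), and the names `toy_score`, `grid_search`, and `random_search` are assumptions, not any library's API.

```python
import itertools
import random

def toy_score(lr, momentum):
    """Hypothetical validation score: lr matters a lot, momentum very little."""
    return -(lr - 0.37) ** 2 - 0.01 * (momentum - 0.5) ** 2

def grid_search(points_per_axis):
    """Evaluate every combination on an evenly spaced grid over [0, 1] x [0, 1].

    Needs points_per_axis >= 2. Costs points_per_axis ** 2 evaluations but
    only ever tries points_per_axis distinct values of each hyperparameter.
    """
    axis = [i / (points_per_axis - 1) for i in range(points_per_axis)]
    return max(itertools.product(axis, axis), key=lambda p: toy_score(*p))

def random_search(budget, rng):
    """Evaluate `budget` uniformly random points; each one is a fresh lr value."""
    trials = [(rng.random(), rng.random()) for _ in range(budget)]
    return max(trials, key=lambda p: toy_score(*p))
```

With `grid_search(4)` and `random_search(16, rng)` both spend 16 evaluations, but the grid probes only 4 distinct values of `lr` while random search probes 16. When only a few hyperparameters really matter, that extra coverage along the important axis is where the advantage comes from. Whether random search actually wins on a given run depends on the draw.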

wjnc 3724 days ago

Any thoughts on why genetic programming is not 'in fashion'? Does it have anything to do with complexity of the calculations?

I can imagine that the advanced models use many, many machines and only deliver results after a large training time. Genetic programming is not feasible then, if you cannot get a quick grasp of the potential results of a model.

argonaut 3724 days ago

At least for deep learning, most models take more than a week to train, often on multiple GPUs. Some of the extremely deep, huge-dataset models can take multiple weeks on multiple GPUs. Google trained AlphaGo's nets for months (on god knows how many GPUs/CPUs). Suffice it to say, people don't even bother touching most hyperparameters, let alone trying to do something more exhaustive.

DavidSJ 3724 days ago

If your program is a neural network with N parameters, or a program tree with N nodes, then testing against data takes O(N) time. With evolutionary computation, what you get for your trouble is a single real number -- the loss: how bad it did. With neural networks, backpropagation gives you N real numbers: the gradient of loss with respect to each parameter.

Put another way: with evolution you have to stumble around blindly in parameter space and rely on selection to keep you moving in the right direction. With the gradient descent that neural networks use, you get, essentially for free, knowledge of the (locally) best direction to move in parameter space.

The bigger the models, the more this matters. Modern neural networks have millions or even billions of parameters, and that's been crucial to their expressive power. Good luck learning a program tree with a billion nodes using evolution. It might take 4.54 billion years.
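
The contrast above can be sketched on a toy problem. This is only an illustration under stated assumptions: the loss is a simple quadratic with a known analytic gradient (standing in for what backpropagation would deliver), and the evolutionary side is a bare-bones mutate-and-keep-if-better loop, not a full genetic algorithm or GP system. All names are hypothetical.

```python
import random

def loss(w, target):
    """One real number: how far the parameters are from the target."""
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))

def gradient(w, target):
    """N real numbers: d(loss)/d(w_i) for every parameter."""
    return [2.0 * (wi - ti) for wi, ti in zip(w, target)]

def gradient_descent(target, steps, lr=0.1):
    """Move every parameter in its locally best direction on each step."""
    w = [0.0] * len(target)
    for _ in range(steps):
        g = gradient(w, target)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return loss(w, target)

def evolve(target, steps, rng, sigma=0.1):
    """Mutate blindly; selection keeps a child only if its loss is lower."""
    w = [0.0] * len(target)
    best = loss(w, target)
    for _ in range(steps):
        child = [wi + rng.gauss(0.0, sigma) for wi in w]
        child_loss = loss(child, target)
        if child_loss < best:
            w, best = child, child_loss
    return best
```

Given the same number of steps on, say, a 50-parameter target, the gradient version closes in on the target in every coordinate at once, while the evolutionary version has to guess a good direction in all 50 dimensions simultaneously. The gap widens as the number of parameters grows.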

daveguy 3724 days ago

> It might take 4.54 billion years.

And then only if you have a system powerful enough to accurately simulate a planet full of molecules.

Although I do think there is a balance between GA and structured NN which will lead to faster and better results than the deep NN alone. We already see some of the best deep NNs incorporating specific structures.

eggy 3723 days ago

I think neural networks and forms of evolutionary computation will merge, as I have been writing in my other replies in this thread. TWEANNs incorporate EC into evolving ANNs. In the other article I cited above on soil mechanics, GP beat out expert systems, ANNs, and statistics. MEP (Multi-Expression Programming) for GP allows more than one solution to be put into a gene without increasing processing time, thereby overcoming the inefficiencies of 1990s-era GP. Here is a recent article using it that is not behind a paywall or via sci-hub.io [1]. It needs better editing, but there are other references if you search for Multi-expression Genetic Programming.

  [1] http://benthamopen.com/ABSTRACT/TOPEJ-9-21

eggy 3724 days ago

First, the right tool for the job. ANNs are general function approximators and, with sufficient training, a cost-effective choice to implement. Second, ANNs have been around about 35 years longer than GP. The TWEANNs I am studying, and that I already mentioned in a previous reply in this thread, hybridize ANNs and EC (GAs and GP), so if you include neural networks that use evolutionary computation techniques to modify weights or topology, then GP is being used to an extent. Replication as a variable in EC is the key force in biology, and I only see more use of EC techniques to enhance the general function approximators that are ANNs. Further, there are also hybridized computing machines that have been made, and are being made, with FPGAs and GPUs. Finance and supercomputing are just two areas that are looking to utilize them. In some, the FPGAs are simply there for updating special computation programs that feed the GPUs. There is some research with a GP optimizer updating the FPGAs and then using the GPUs for the massive parallelization of the computations.

wjnc 3723 days ago

Thanks eggy, awesome replies. You should write some of your experiences down if you find the time.

nabla9 3724 days ago

Evolutionary algorithms and genetic programming are global optimization techniques, basically random search with some memory. They're not "out of fashion" any more than simulated annealing or Monte Carlo methods. They have limited usability, that's all.

AgentME 3724 days ago

>Topology and Weight Evolving Artificial Neural Networks

I brainstormed for a while about using genetic algorithms to decide the network topology. I'm glad someone else invented that already! Less work for me to do now.

matheweis 3725 days ago

Okay, that is straight up awesome. I've been toying with neural networks just enough to get a basic understanding of what they are and how they work, and it occurred to me that something like this might be possible.

Of course, I wasn't up-to-speed enough to know the right terms to look for, so thanks for sharing. :)

I am curious though... it seems like it would take orders of magnitude more computing power to not only train but evolve and re-train the networks. Is this practical with today's hardware?