Hacker News new | ask | show | jobs
by thom 775 days ago
This makes me wonder what you could achieve if instead of iteratively growing the grid, or worrying about pruning or regularization, you governed network topology with some sort of evolutionary algorithm.
2 comments

You can do much better by growing an AST with memoization and non-linear regression. So much so, the EVO folks gave a best paper to a non-EVO, deterministic algorithm at their conference

https://seminars.math.binghamton.edu/ComboSem/worm-chiu.pge_... (author)

Code for the curious: https://github.com/verdverm/pypge
Interesting, the use of grammar production rules reminds me of Grammatical Evolution[0], which has shown some promise in constraining the search space when using EAs for e.g. symbolic regression.

[0]: https://en.wikipedia.org/wiki/Grammatical_evolution

Much of what I did in my work was to reduce or constrain the search space.

1. Don't evolve constants or coefficients, use regression to find

2. Leverage associativity and commutativity, simplify with SymPy, sort operands to add/mul

So much effort in GP for SR is spent evaluating models which are effectively the same, even though their "DNA" is different. Computational effort, and algorithmic effort (to deal with loss of population diversity, i.e. premature convergence)

I've seen a few papers since pick up on the idea of local search operators, the simplification, and regression, trying to maintain the evolution aspect. Every algo ends up in local optima and works of effectively the same form by adding useless "DNA". I could see the PGE algo doing this too, going down a branch of the search space that did not add meaningful improvement. With the recent (~5y) advancements in AI, there are some interesting things to try

Believe there is a Google paper out there that tried that
1000s, there is a whole field and set of conferences. You can find more by searching "Genetic Programming" or "Symbolic Regression"

KAN, with the library of variables and math operators, very much resembles this family of algos, problems, and limitations. The lowest hanging fruit they usually leave on the proverbial tree is that you can use fast regression techniques for the constants and coefficients. No need to leave it up to random perturbations or gradient descent. What you really need to figure out is the form or shape of the model, rather than leaving it up to the human (in KAN)