by esac 1804 days ago
>genetic algorithms combined with RL where the genetics determine the reward function.

I have been working on this problem for years (2+ as a researcher, 2 as a PhD student).

The main issue is that evolution is both massively parallel and had plenty of runtime to get to human level intelligence.

The person that pushes this evolution/evolved reward point is Andrew G. Barto and his students/collaborators over the years.

Satinder Singh in particular is actively working on gradient-based algorithms to find rewards (e.g. https://arxiv.org/abs/2102.06741).

> Maybe we need to frame RL goals in much more simple terms, and allow genetic algorithms to evolve their own inputs and reward functions on their own.

I was checking HN while the current iteration of this algorithm is running (it is gradient-based; the genetic version was my master's thesis). The main complexity is figuring out:

1) What the sub-goals are, e.g. grasping things
2) How to solve those sub-goals, e.g. motor control
3) How to do something useful, e.g. surviving

Balancing those three processes is the current hurdle.
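
To make the evolved-reward idea concrete, here is a minimal toy sketch. It is not the algorithm from my thesis or from the linked paper; the corridor environment, the linear reward form, and every name in it (`evolved_reward`, `lifetime`, `evolve`) are assumptions made up for illustration. An outer genetic loop mutates the weights of a reward function, an inner tabular Q-learner trains on that reward, and fitness counts only how often the agent actually reaches food.

```python
import random

N_STATES, START, FOOD = 7, 3, 6   # hypothetical toy corridor, food at the right end
ACTIONS = (-1, +1)                # step left / step right


def step(state, action):
    """Move inside the corridor and report whether food was reached."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, nxt == FOOD


def evolved_reward(genome, state, nxt, ate):
    """Reward seen by the learner: a linear function whose weights are the genome."""
    w_food, w_right, w_left = genome
    return w_food * ate + w_right * (nxt > state) + w_left * (nxt < state)


def lifetime(genome, episodes=30, horizon=12, seed=0):
    """Inner loop: Q-learning on the evolved reward.

    Fitness is the number of episodes in which food was reached. It never
    looks at the evolved reward itself, only at the 'survival' outcome.
    """
    rng = random.Random(seed)     # fixed seed, so fitness is deterministic
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    meals = 0
    for _ in range(episodes):
        s = START
        for _ in range(horizon):
            if rng.random() < 0.1:                      # explore
                a = rng.randrange(2)
            else:                                       # exploit, random tie-break
                a = max(range(2), key=lambda i: (q[s][i], rng.random()))
            nxt, ate = step(s, ACTIONS[a])
            r = evolved_reward(genome, s, nxt, ate)
            target = r + (0.0 if ate else 0.9 * max(q[nxt]))
            q[s][a] += 0.5 * (target - q[s][a])
            s = nxt
            if ate:
                meals += 1
                break
    return meals


def evolve(generations=10, pop_size=12, elite=3, seed=1):
    """Outer loop: keep the fittest reward genomes and mutate them."""
    rng = random.Random(seed)
    pop = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(pop_size)]
    best_per_gen = []
    for _ in range(generations):
        ranked = sorted(pop, key=lifetime, reverse=True)
        best_per_gen.append(lifetime(ranked[0]))
        parents = ranked[:elite]                        # elitism: parents survive
        children = [[w + rng.gauss(0, 0.3) for w in rng.choice(parents)]
                    for _ in range(pop_size - elite)]
        pop = parents + children
    return max(pop, key=lifetime), best_per_gen


if __name__ == "__main__":
    best, history = evolve()
    print("best reward weights:", best)
    print("best fitness per generation:", history)
```

Because the elite genomes are carried over unchanged and fitness is deterministic, the best fitness per generation can never go down in this sketch. The three questions above do not show up in a toy like this: with a single goal and two actions there are no sub-goals to discover or balance.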

For more info my email is delvermm at mila.quebec

2 comments

Also, evolution isn’t trying to get to human-level intelligence. Human-level intelligence is just one out of millions of adaptations that work; it’s recent, and it’s rare. Change Earth’s parameters a little over the past several million years, and maybe we don’t evolve.
It's an extremely open-ended process.
> The main issue is that evolution is both massively parallel and had plenty of runtime to get to human level intelligence.

How many entities are we talking about for substantial evolution? I know that there have been 100 billion "humans" (not that it's so clear-cut) alive, so guessing this is on the order of ~trillions of entities to simulate some evolution for (but maybe I'm really underestimating the early tail of tons and tons of microorganisms and small short-lived life that got us to this point).

Is the bandwidth of evolution that much larger than what we could possibly simulate with computation, especially for a much simpler world/task than "generally survive"?

Given that a single teaspoon of soil probably has about a billion bacterial organisms in it, I suspect you're a couple orders of magnitude short.