| >genetic algorithms combined with RL where the genetics determine the reward function.

I have been working on this problem for years (2+ as a researcher, 2 as a PhD student). The main issue is that evolution is both massively parallel and has had plenty of runtime to reach human-level intelligence. The people who push this evolved-reward idea are Andrew G. Barto and his students and collaborators over the years. Satinder Singh in particular is actively working on gradient-based algorithms to find rewards (e.g. https://arxiv.org/abs/2102.06741).

> Maybe we need to frame RL goals in much more simple terms, and allow genetic algorithms to evolve their own inputs and reward functions on their own.

I was checking HN while the current iteration of this algorithm was running (it is gradient-based; the genetic version was my master's thesis). The main complexity is figuring out:

1) What the sub-goals are, e.g. grasping things
2) How to solve those goals, e.g. motor control
3) How to do something useful, e.g. surviving

Balancing those three processes is the current hurdle. For more info, my email is delvermm at mila.quebec |
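
To make the "genetics determine the reward function" idea concrete, here is a minimal toy sketch. It is not the algorithm described above and not the method from the linked paper. Everything in it is an assumption made up for illustration: the three-action world, the `FEATURES` and `SURVIVAL` tables, and the helper names `evolved_reward`, `lifetime`, and `evolve`. The outer loop is a plain genetic algorithm over reward weights; the inner loop is an epsilon-greedy learner that only ever sees the evolved reward, while selection is based on a survival payoff the learner never observes.

```python
import random

# Hypothetical toy world: each action has two observable features and a
# hidden survival payoff that only the evolutionary loop gets to see.
FEATURES = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
SURVIVAL = [1.0, -1.0, 0.2]


def evolved_reward(genome, action):
    """Reward seen by the agent: a linear function of features, weighted by the genome."""
    return sum(w * f for w, f in zip(genome, FEATURES[action]))


def lifetime(genome, rng, steps=200, eps=0.1, lr=0.1):
    """Inner loop: epsilon-greedy value learning driven only by the evolved reward.

    Returns the mean survival payoff, which serves as evolutionary fitness.
    """
    q = [0.0] * len(FEATURES)
    total = 0.0
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.randrange(len(q))  # explore
        else:
            a = max(range(len(q)), key=q.__getitem__)  # exploit
        q[a] += lr * (evolved_reward(genome, a) - q[a])
        total += SURVIVAL[a]
    return total / steps


def evolve(generations=30, pop_size=20, seed=0):
    """Outer loop: truncation selection plus Gaussian mutation over reward weights."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda g: lifetime(g, rng), reverse=True)
        parents = ranked[: pop_size // 4]
        children = [
            [w + rng.gauss(0, 0.1) for w in rng.choice(parents)]
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=lambda g: lifetime(g, rng))


if __name__ == "__main__":
    best = evolve()
    print("evolved reward weights:", best)
    print("mean survival payoff:", lifetime(best, random.Random(1)))
```

Even in this toy, every fitness evaluation costs a full agent lifetime, which is where the massive parallelism and long runtime of real evolution become the bottleneck. The sketch also sidesteps the three problems listed above, since sub-goals and motor control are collapsed into a single action choice.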