by janalsncm 424 days ago

Today we would typically solve a lot of the same types of problems with reinforcement learning (RL), because it’s more sample-efficient.

In an evolutionary algorithm (EA), if a candidate fails we throw it away. In RL we learn from that experience.

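RL gets harder when rewards are really sparse. OpenAI scaled up evolution strategies (ES) as an alternative to RL, which is a bit of a hybrid: it perturbs parameters like an EA, but it uses every candidate's score, including the bad ones, to estimate a gradient. (ES itself is much older than OpenAI's work.)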
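
To make the "hybrid" point concrete, here is a rough sketch of one ES-style update on a toy problem. It is only an illustration, not OpenAI's actual code: `es_step`, `toy_reward`, `run`, and all the hyperparameter values are names and numbers I made up, and the objective is a hypothetical quadratic. The thing to notice is that below-average candidates are not discarded; they get negative weights and push the parameters away from themselves.

```python
import random


def es_step(theta, reward_fn, rng, n=50, sigma=0.1, lr=0.01):
    """One ES-style update (illustrative sketch; hyperparameters are assumptions)."""
    # Sample n Gaussian perturbations of the current parameters.
    noises = [[rng.gauss(0.0, 1.0) for _ in theta] for _ in range(n)]
    rewards = [
        reward_fn([t + sigma * e for t, e in zip(theta, eps)])
        for eps in noises
    ]
    # Normalize rewards so below-average candidates get negative weight.
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5 or 1.0
    weights = [(r - mean) / std for r in rewards]
    # Every candidate, good or bad, contributes to the gradient estimate.
    grad = [
        sum(w * eps[i] for w, eps in zip(weights, noises)) / (n * sigma)
        for i in range(len(theta))
    ]
    return [t + lr * g for t, g in zip(theta, grad)]


def toy_reward(params):
    # Hypothetical objective: highest (zero) at params == [3.0, -2.0].
    target = [3.0, -2.0]
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def run(steps=300, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(steps):
        theta = es_step(theta, toy_reward, rng)
    return theta
```

Calling `run()` should move the parameters from `[0.0, 0.0]` toward the target. A plain EA with truncation selection would instead keep only the top few candidates and drop the rest, which is the "throw it away" behavior mentioned above.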