

by null000 2587 days ago

The whole point is that the algorithm has no concept of obstacles or success baked into it. Likewise, this is pretty early research, meant to inform and promote

In other words, this isn't meant to be super useful by itself. It seems tailor-made (as many of these things do) to play super-simple '80s video games and literally nothing else, but it's an interesting proof of concept. I'd also be interested in different iterations on this general pattern. For instance, something that didn't translate directly from screen + button -> prediction, and instead had some interstitial systems: translating from screen -> entities, then predicting the state of those entities given button presses. It'd also be interesting to see how this performs with ML algorithms designed to learn on the fly instead of through training on a static set of data (at least, this looked like it learned through backpropagation; I only skimmed).
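To make the screen -> entities -> prediction idea concrete, here's a rough toy sketch. Everything in it is made up for illustration and is not from the research being discussed: the character-grid "screen", the `Entity` type, and the hand-written movement rule standing in for what would really be a learned model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str  # hypothetical label: "P" = player, "#" = obstacle
    x: int
    y: int


def extract_entities(screen):
    """Stage 1 (screen -> entities): collect every non-empty grid cell."""
    return [
        Entity(ch, x, y)
        for y, row in enumerate(screen)
        for x, ch in enumerate(row)
        if ch != "."
    ]


# Assumed hand-written dynamics; a real system would learn this mapping.
MOVES = {"left": -1, "right": 1, "none": 0}


def predict_entities(entities, button, width):
    """Stage 2 (entities + button -> predicted entities)."""
    predicted = []
    for e in entities:
        if e.kind == "P":
            # Only the player reacts to the button; clamp to the screen edges.
            new_x = min(max(e.x + MOVES[button], 0), width - 1)
            predicted.append(Entity(e.kind, new_x, e.y))
        else:
            predicted.append(e)
    return predicted


screen = ["....", ".P.#"]
entities = extract_entities(screen)
predicted = predict_entities(entities, "right", width=4)
print(predicted)
```

The point of the split is that the predictor never sees pixels, only entity state, so the two stages could be swapped out or trained separately.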

But I can see broader practical applications for this in, for instance, recommender systems trying to break users out of the closed feedback loop that people tend to end up in when going down certain rabbit holes (e.g. watch one Flat Earther conspiracy video and suddenly that's all you see for a week, because the recommender system knows that people who watch one will watch more). The point being: the real test comes when this strategy is exposed to more diverse problem spaces. It's just that those are harder to model, and we need to weed out the pointless stuff first.