by xg15
2966 days ago
Sounds like the "classic" problem where someone wants to build a reinforcement-learning system (because "self-improving AI" sounds so cool) but doesn't actually have a suitable reward function that describes their problem. Nevertheless, they don't let this minor obstacle hold them up and use whatever random reward function they can implement with the data they have. The resulting system won't actually learn to solve the original problem - but it will learn something, so, hey, it's self-improving! See also: probably every single recommender system in use. (At least that's my subjective impression.)