by logiduck 876 days ago
Yes, this makes it very difficult to apply ML and RL in non-simulated scenarios.

With simulated scenarios you can just replay and "sweep" across hyperparameters to find the best one.
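
As a rough sketch of what such a sweep can look like (everything here is made up for illustration: `simulate` is a hypothetical stand-in for replaying a simulated episode, and the parameter names, grid values, and toy objective are assumptions, not from any real setup):

```python
import itertools
import random


def simulate(learning_rate, discount, seed=0):
    """Hypothetical stand-in for replaying one simulated episode; returns a score."""
    rng = random.Random(seed)  # fixed seed: every replay sees the same noise
    noise = rng.gauss(0, 0.01)
    # Toy objective (assumed) that peaks at learning_rate=0.1, discount=0.9.
    return -(learning_rate - 0.1) ** 2 - (discount - 0.9) ** 2 + noise


def sweep(grid):
    """Replay the simulation for every combination in the grid; return the best."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = simulate(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Illustrative grid; the values are arbitrary.
grid = {
    "learning_rate": [0.001, 0.01, 0.1, 1.0],
    "discount": [0.5, 0.9, 0.99],
}
best_params, best_score = sweep(grid)
print(best_params, best_score)
```

The point is that each replay is cheap and repeatable, so all 12 combinations can be tried under identical conditions. That is exactly what you don't get outside a simulator.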

In a real-world scenario with limited information, fine-tuning hyperparameters is much harder, as you quickly find yourself stuck in local maxima.
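
A toy illustration of getting stuck, assuming a single integer hyperparameter and a made-up reward landscape (`reward`, `hill_climb`, and the numbers are all hypothetical). When each evaluation is expensive, you tend to nudge the current setting rather than sweep, and a greedy nudge stops at the nearest peak:

```python
def reward(x):
    """Hypothetical reward for an integer hyperparameter setting x in 0..10."""
    # Made-up landscape: a small peak at x=2 and a higher one at x=8.
    landscape = [0, 1, 3, 1, 0, 0, 2, 5, 9, 5, 2]
    return landscape[x]


def hill_climb(start, budget):
    """Greedy local search: step to the better neighbour until no neighbour
    improves or the evaluation budget is spent."""
    x = start
    current = reward(x)
    used = 1  # evaluations spent so far
    while used < budget:
        neighbours = [n for n in (x - 1, x + 1) if 0 <= n <= 10]
        scored = [(reward(n), n) for n in neighbours]
        used += len(scored)
        best_score, best_n = max(scored)
        if best_score <= current:
            break  # no neighbour is better: a local maximum
        x, current = best_n, best_score
    return x, current


stuck_at = hill_climb(start=1, budget=10)
global_best = max(range(11), key=reward)
print(stuck_at, global_best)
```

Starting from x=1, the search settles on the small peak and never sees the higher one, even though it has budget left over.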