by logiduck 876 days ago

Yes, this makes it very difficult to apply ML and RL in non-simulated scenarios.

With simulated scenarios you can just replay and "sweep" across hyperparameters to find the best one.
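To make the "replay and sweep" idea concrete, here is a minimal sketch, not anything from a real project. It assumes a toy 3-armed bandit as the simulator and an epsilon-greedy agent whose only hyperparameter is `epsilon`; the names `run_episode` and `sweep`, the payout probabilities, and the candidate values are all made up for illustration.

```python
import random


def run_episode(epsilon, seed, steps=2000):
    """Run one simulated episode of a toy 3-armed bandit; return total reward."""
    rng = random.Random(seed)        # fixed seed makes every run reproducible
    arm_means = [0.2, 0.5, 0.8]      # hypothetical payout probabilities
    counts = [0] * len(arm_means)
    values = [0.0] * len(arm_means)  # running mean reward per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(arm_means))                       # explore
        else:
            arm = max(range(len(arm_means)), key=values.__getitem__)  # exploit
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return total


def sweep(epsilons, seeds):
    """Average total reward for each candidate epsilon over the same seeds."""
    seeds = list(seeds)
    return {
        eps: sum(run_episode(eps, s) for s in seeds) / len(seeds)
        for eps in epsilons
    }


results = sweep([0.0, 0.01, 0.1, 0.3, 1.0], seeds=range(5))
best = max(results, key=results.get)
print(best, results)
```

The point is that each call to `run_episode` is free and repeatable, so you can try every candidate as many times as you like. A real system gives you one run per setting at best, which is where the sweep stops being an option.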

In a real-world scenario with limited information, tuning hyperparameters is much harder, as you quickly find yourself stuck in local maxima.