Hacker News
by ashish01 1065 days ago
Any good examples or notebooks using RL to solve typical optimization problems?

I'm also curious about this statement. If a problem becomes too complex for linear programming solutions, how easy is it to know that a reinforcement learning solution is actually a global optimum and not just a local one?
> how easy is it to know that a reinforcement learning solution is actually a global optimum and not just a local one?

If the problem is too complex for LP then you're probably not going to get a global optimum from RL either.

Fair enough. I'm not invested in LP or RL, but my background is in agriculture and LP forms the basis of a lot of cropping/feeding decision trees. I'd be curious to explore new techniques, but it sounds like the discussion here is about a different set of linear programming problems than the ones I see most often, and that there's likely limited upside to implementing RL.
If you have a working LP solution without any glaring compromises in the problem formulation, then I'm not sure why one would want to throw out a perfectly good working solution... algorithms are the means, not the ends :)
I'm pretty sure the main reason for throwing out old tools that still function in favor of the new hotness is developer boredom.
This is why my teams always provide explicit opportunities and spaces for professional development. People should have the opportunity to stretch and grow; if you don't provide those opportunities explicitly, then your most motivated employees will find them implicitly. And you can't afford to lose your most motivated employees, so you'll end up paying with tech debt instead of engineer professional development time.
While I disagree with the RL assertion without a source, linear programs are convex, so local optima are global optima.

However, unless there is some aspect of the problem which is not known (e.g., you don’t exactly know the objective or constraints), so you model it as a distribution over LPs, I really don’t know how RL will help you. Gradient-based methods can give you improvements if your problem is very large scale and doesn’t have, e.g., sparse structure, but the above claim is bold.
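To make the "local optima are global optima" point concrete, here is a minimal sketch of a toy cropping LP solved by brute-force vertex enumeration. The crops, profits, and resource limits are all made-up numbers for illustration, and enumerating vertices only scales to tiny problems; a real model would go to a proper LP solver. The sketch relies on the fact that a bounded, feasible LP attains its optimum at a vertex of the feasible region.

```python
from itertools import combinations

# Hypothetical problem: hectares of wheat (x) and corn (y).
# maximize 300*x + 500*y
# subject to x + y <= 100 (land), 2x + 4y <= 320 (labor), x >= 0, y >= 0
PROFIT = (300.0, 500.0)
# Each constraint (a, b, c) means a*x + b*y <= c.
CONSTRAINTS = [
    (1.0, 1.0, 100.0),   # land
    (2.0, 4.0, 320.0),   # labor hours
    (-1.0, 0.0, 0.0),    # x >= 0
    (0.0, -1.0, 0.0),    # y >= 0
]


def objective(point):
    x, y = point
    return PROFIT[0] * x + PROFIT[1] * y


def feasible(x, y, tol=1e-9):
    return all(a * x + b * y <= c + tol for a, b, c in CONSTRAINTS)


def vertices():
    """Intersect every pair of constraint boundaries; keep feasible points."""
    points = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(CONSTRAINTS, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel boundaries never intersect
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y):
            points.append((x, y))
    return points


def solve():
    """Return the feasible vertex with the highest profit."""
    return max(vertices(), key=objective)


if __name__ == "__main__":
    best = solve()
    print(best, objective(best))
```

Because the feasible region is a convex polygon and the objective is linear, the best vertex here is the global optimum; there is no separate "local" answer to worry about, which is the guarantee RL does not give you.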

That's an excellent point. One complex agriculture issue I've encountered that LP would struggle to handle is inventory control of perishable goods, where there's variance in incoming quality, degradation during storage, and variance in shelf life based on age of crop, growing conditions, etc. LP would work great for handling grain storage, where a crop generally maintains a known quality profile over time with limited shrinkage, but I could see a use for a machine learning algorithm that handles the stochastic variables of perishable goods in a more direct way.
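As a rough illustration of the perishable-inventory case, here is a minimal sketch of a simulator with random demand and random shelf life, plus a brute-force search over simple order-up-to policies. This is not RL; it is only the kind of environment an RL agent (or any simulation-based optimizer) could be evaluated against. Prices, demand range, and shelf-life range are all assumed numbers, and the function names are made up for this example.

```python
import random


def simulate(order_up_to, days=365, seed=0):
    """Average daily profit of an order-up-to policy for a perishable good."""
    rng = random.Random(seed)
    price, cost = 3.0, 1.0  # assumed per-unit sale price and purchase cost
    stock = []              # remaining shelf life (days) of each unit on hand
    profit = 0.0
    for _ in range(days):
        # Restock to the target; each new unit gets a random shelf life.
        n_order = max(0, order_up_to - len(stock))
        stock.extend(rng.randint(2, 5) for _ in range(n_order))
        profit -= cost * n_order
        # Random demand; sell the units closest to expiry first.
        demand = rng.randint(5, 15)
        stock.sort()
        sold = min(demand, len(stock))
        stock = stock[sold:]
        profit += price * sold
        # Age what is left and discard anything that has expired.
        stock = [life - 1 for life in stock if life - 1 > 0]
    return profit / days


def best_policy(levels=range(0, 31), seed=0):
    """Pick the order-up-to level with the highest simulated profit."""
    return max(levels, key=lambda level: simulate(level, seed=seed))


if __name__ == "__main__":
    level = best_policy()
    print(level, simulate(level))
```

With only one decision variable, a grid search like this is enough. The argument for something heavier only starts once the policy has to react to state such as observed incoming quality or the age mix of the stock on hand.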