by wtallis
1938 days ago
Consider adding some affordance for parallel computing to Gradient-Free-Optimizers by allowing the user to provide a vectorized objective function instead of one that evaluates only a single point in the search space per function call. That leaves all the hard work of parallelization as an exercise for the user, but it also gives the user the flexibility to parallelize their objective function with whatever mechanism they wish.

I have previously used this approach in a project where the objective function contained a half-hour-long simulation, which was the bottleneck that made estimating a gradient intractable. When the optimization algorithm gave us a batch of several points in the search space to evaluate, our objective function could prepare and run several instances of the simulation in parallel, and return when the whole batch was complete. From there, it was easy for us to also distribute simulation runs across several machines, without needing any changes to the optimizers. We would not have been able to achieve this easily with an optimization framework that tried to manage parallelization directly, because the steps necessary to prepare the input files for the simulation software had to be done serially.

For that project we tried DIRECT, several variants of Nelder-Mead, and an evolutionary strategy. In hindsight, the Nelder-Mead variants worked best; once we had accumulated enough simulation results, it became clear that our objective function was convex and pretty well-behaved in the region of interest. Nelder-Mead was also trivial to extend to trying several extra points per batch, to ensure that each of our several workstations had something to work on. (We didn't have access to a large cluster, and Nelder-Mead wouldn't generalize well to a large degree of parallelization in that manner.)
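To make the idea concrete, here is a rough Python sketch of what a batch-style objective might look like. It is not the Gradient-Free-Optimizers API and not the code from that project: `prepare_input`, `run_simulation`, `batch_objective` and `batched_random_search` are all hypothetical names, the "simulation" is a cheap convex stand-in, and the driver is a plain batched random search rather than Nelder-Mead or DIRECT. The only point is that the optimizer hands over a whole batch and the objective decides how to parallelize it.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def prepare_input(point):
    # Hypothetical stand-in for the preparation step that must run serially
    # (e.g. writing input files for the simulation software).
    return {"x": point[0], "y": point[1]}


def run_simulation(sim_input):
    # Hypothetical stand-in for the expensive simulation: a convex bowl
    # with its minimum at (1, -2).
    return (sim_input["x"] - 1.0) ** 2 + (sim_input["y"] + 2.0) ** 2


def batch_objective(points, max_workers=4):
    # Serial part: prepare every input one after another.
    inputs = [prepare_input(p) for p in points]
    # Parallel part: run the simulations concurrently. pool.map returns
    # results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_simulation, inputs))


def batched_random_search(objective, bounds, batch_size=4, n_batches=50, seed=0):
    # Toy optimizer that only ever calls the objective with a whole batch.
    rng = random.Random(seed)
    best_point, best_score = None, float("inf")
    for _ in range(n_batches):
        batch = [
            tuple(rng.uniform(lo, hi) for lo, hi in bounds)
            for _ in range(batch_size)
        ]
        scores = objective(batch)  # one call evaluates the whole batch
        for point, score in zip(batch, scores):
            if score < best_score:
                best_point, best_score = point, score
    return best_point, best_score


if __name__ == "__main__":
    point, score = batched_random_search(batch_objective, [(-5.0, 5.0), (-5.0, 5.0)])
    print(point, score)
```

Because the optimizer only sees `objective(batch)`, the thread pool could be swapped for processes, a job queue, or several machines without touching the search loop, which is the flexibility I was describing.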