by moritzmeister
2618 days ago
I am working on a small Python framework for efficiently distributing hyperparameter search on a Spark cluster. We haven't released the first version yet, but will do so in the next two weeks. https://github.com/logicalclocks/maggy

A limitation of existing hyperparameter search algorithms is that they are typically stage- or generation-based. For example, if a genetic algorithm is used for hyperparameter search, one has to wait for all models in a generation to finish before a new generation of candidate parameters can be derived from the best-performing individuals. However, some trials in a given iteration will have suboptimal parameters, and it becomes clear early in training that they can be stopped. A worker whose trial was stopped early still can't be given a new set of parameters right away and instead sits idle until the stage ends.

In contrast to stage-based algorithms such as genetic optimization, maggy (the framework) will support asynchronous algorithms, which can provide a new candidate set of parameters as soon as a worker finishes evaluating a combination, without waiting for all models in a stage to finish. To make this possible, we establish communication between the driver and the executors in Spark. The driver collects performance metrics during training, which lets us stop badly performing models early and reassign the executor a new, more promising set of parameters (a new trial) right away, instead of waiting for the stage to finish.
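To make the idle-time argument concrete, here is a minimal sketch that compares the two scheduling styles on a fixed list of trial durations. It is not maggy's API: the function names and the durations are hypothetical, and it only models when workers are busy, not how new parameters are chosen or how the driver and executors talk to each other.

```python
import heapq


def stage_based_makespan(durations, num_workers):
    """Total time when trials run in stages of `num_workers` trials and
    each stage waits for its slowest trial before the next one starts."""
    total = 0.0
    for start in range(0, len(durations), num_workers):
        stage = durations[start:start + num_workers]
        total += max(stage)
    return total


def async_makespan(durations, num_workers):
    """Total time when a worker receives the next trial as soon as it
    becomes free, without waiting for the other workers."""
    # Min-heap holding the time at which each worker becomes free.
    free_at = [0.0] * num_workers
    heapq.heapify(free_at)
    for duration in durations:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + duration)
    return max(free_at)


# Hypothetical trial durations in minutes; the short ones stand in for
# trials that were stopped early because of poor metrics.
durations = [10, 2, 10, 1, 3, 10, 2, 10]

print(stage_based_makespan(durations, num_workers=4))  # 20.0
print(async_makespan(durations, num_workers=4))        # 16.0
```

With four workers, the stage-based schedule takes 20 minutes because each stage is held up by a 10-minute trial, while the asynchronous schedule finishes the same eight trials in 16 minutes by handing freed workers the next trial immediately.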