by bitL
2918 days ago
Try running multiple models or ensemble training across many machines with many GPUs to pick the best-performing model or combination. TensorFlow so far probably has the easiest approach for that. That might be the reason for the attitude that "real deep learning engineers use TensorFlow": other approaches either don't scale as well or can't even express something you need for your bleeding-edge, billion-dollar approach, despite other frameworks being much simpler, more natural, and a joy to use.
See e.g. [0] and [1] linked below.
For model ensembling, it's even easier. After training, in Keras you can simply load your models and create a new Model() object that does nothing but use a merge layer (with mode set to averaging) to average the outputs of the input models, even if the models share layers or have other crazy constraints. Writing that final ensemble takes only a few lines.
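To make the averaging concrete, here's a framework-free sketch of what that final ensemble computes. The "models" are hypothetical stand-in callables with fixed outputs, not Keras objects, and `make_ensemble` is just an illustrative name; in Keras you would express the same averaging as a layer over the models' outputs rather than by hand like this.

```python
from statistics import fmean
from typing import Callable, List, Sequence

# Assumption: a "model" is any callable mapping an input to a list of scores.
Model = Callable[[Sequence[float]], Sequence[float]]


def make_ensemble(models: Sequence[Model]) -> Model:
    """Return a callable that averages the outputs of `models` element-wise."""
    if not models:
        raise ValueError("need at least one model")

    def ensemble(x: Sequence[float]) -> List[float]:
        outputs = [m(x) for m in models]
        if len({len(o) for o in outputs}) != 1:
            raise ValueError("all models must produce outputs of the same length")
        # zip(*outputs) groups the i-th score from every model together.
        return [fmean(scores) for scores in zip(*outputs)]

    return ensemble


# Hypothetical stand-ins for trained models (fixed outputs for illustration).
def model_a(x: Sequence[float]) -> List[float]:
    return [0.5, 0.5]


def model_b(x: Sequence[float]) -> List[float]:
    return [1.0, 0.0]


ensemble = make_ensemble([model_a, model_b])
averaged = ensemble([1.0, 2.0, 3.0])
print(averaged)  # [0.75, 0.25]
```

The ensemble is itself just another "model" with the same call signature, which is why wrapping it up as a single new model object is so convenient.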
In my experience researching and productionizing very deep Keras models for an image processing use case with moderately tight performance constraints, Keras has scaled extremely well and the code has stayed dead simple the whole time.
[0]: < https://blog.keras.io/keras-as-a-simplified-interface-to-ten... >
[1]: < https://www.tensorflow.org/programmers_guide/estimators#crea... >