by junipertea
1986 days ago
I found that training multiple models on the same GPU hits other bottlenecks (mainly memory capacity and bandwidth) fast. I tend to train one model per GPU and just scale the number of computers. Also, if nothing else, we tend to grow the models until they fill the GPU memory anyway, so there is rarely room for a second one.