by hishnash 3307 days ago
Yes, massively. Deep learning is all about learning from data. In the work I have done, I have seen that currently, unless you buy a 2x Xeon server, you can't get enough data through to justify having 4 GPUs in one machine. So it's commonly cheaper to break out to multiple machines, but this then has an impact on how you code your learning, since communication between the GPUs is then much slower.

Think of training a NN to learn cats' expressions and such, plus meme text to go along with pictures of cats. So 1) you collect as many cat memes as you can (easy... 40k cat memes later) and you start training. You need to normalise all the images, you need to extract the meme text and normalise that too, and you need to feed all of it through your TensorFlow (or other system) to train it. This means pushing your small data sample of 40k through the GPU and back out, and possibly doing this many times. There is no way they will all fit on your GPU, and even if they do, you need to get them on there in the first place.
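
To make the data movement concrete, here is a rough sketch of that loop in plain Python. It is not TensorFlow code, just the shape of the problem: normalise each sample, hand the data over one batch at a time, and count how many host-to-GPU copies a full run needs. The batch size of 128, the 10 epochs, and the names normalise, minibatches and count_transfers are my own assumptions for illustration; only the 40k figure comes from the example above.

```python
import random


def normalise(pixels):
    """Scale 8-bit pixel values (0..255) into the range 0.0..1.0."""
    return [p / 255.0 for p in pixels]


def minibatches(samples, batch_size, seed=0):
    """Yield shuffled, normalised batches so only one batch is in flight at a time."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [normalise(samples[i]) for i in order[start:start + batch_size]]


def count_transfers(num_samples, batch_size, epochs):
    """Number of host-to-GPU batch copies needed for the whole training run."""
    batches_per_epoch = -(-num_samples // batch_size)  # ceiling division
    return batches_per_epoch * epochs


# Tiny stand-in dataset: 10 "images" of 4 pixels each (made-up values).
dataset = [[i, 2 * i, 3 * i, 255] for i in range(10)]
batches = list(minibatches(dataset, batch_size=4))

# Assumed settings: 40k samples (from the example), batch size 128, 10 epochs.
transfers = count_transfers(num_samples=40_000, batch_size=128, epochs=10)
```

With those assumed settings the 40k samples cross the bus in thousands of separate copies, which is why the CPU side (decoding, normalising, feeding) has to keep up with the GPUs.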