by jacquesm 3336 days ago
It would be interesting to see exactly how much data is transferred across the bus during training. Of course it would be great if you could fit your whole dataset on the other side (in GPU memory), but typically a GPU during training will max out at batch sizes of anywhere from 8 to 64 images, depending on image size and number of channels. So you'll be moving quite a bit of data.
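
A rough back-of-envelope sketch of the input side of that traffic. The 224x224 RGB image size, float32 values, and 10 batches per second are assumptions picked for illustration, not measurements, and this only counts the input batches going to the GPU, nothing coming back.

```python
def batch_bytes(batch_size, height, width, channels, bytes_per_value=4):
    """Bytes needed to ship one batch of images to the GPU.

    bytes_per_value=4 assumes float32 inputs (an assumption; uint8 would be 1).
    """
    return batch_size * height * width * channels * bytes_per_value


def bus_bytes_per_second(batch_size, height, width, channels,
                         batches_per_second, bytes_per_value=4):
    """Input traffic per second at a given (assumed) training throughput."""
    per_batch = batch_bytes(batch_size, height, width, channels, bytes_per_value)
    return per_batch * batches_per_second


# Hypothetical example: 224x224 RGB images, float32.
small = batch_bytes(8, 224, 224, 3)    # 4816896 bytes, about 4.8 MB
large = batch_bytes(64, 224, 224, 3)   # 38535168 bytes, about 38.5 MB

# At an assumed 10 batches per second with batch size 64:
rate = bus_bytes_per_second(64, 224, 224, 3, batches_per_second=10)
print(small, large, rate)
```

Under these assumptions the larger batch size works out to roughly 385 MB/s of input data, which is why measuring the actual bus traffic would be more informative than guessing.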

I found a tool to monitor this; unfortunately it only works on Xeons and not on what's in my desktop.

https://github.com/opcm/pcm