by lostmsu 2022 days ago
This is a very bad and unprofessional article. Given his test code, dataset size, and training time, I am sure that if he had checked his GPU load, it would have been under 4%, because nearly all the time would be wasted moving data to and from the GPU.

It is my understanding that the M1 chip has unified CPU/GPU memory, which means that Metal, as the underlying framework, might be clever enough to not copy the data at all. Not sure it applies to his use case, though.
I was mostly talking about the RTX 2080 Ti, which he is comparing against.

It's like you're moving just across the street, and for every single box you load it into a car, drive across the street, and unload it, instead of just carrying it over on foot. You need to drive further (bigger networks) and load more boxes at once (bigger batch size) for a car to actually be useful in this scenario.
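
To put rough numbers on the car analogy, here is a toy cost model in Python. It is only a sketch: `gpu_utilization` is a made-up helper, and the throughput, bus bandwidth, and fixed per-batch overhead are placeholder assumptions, not measurements of an RTX 2080 Ti, an M1, or the article's benchmark.

```python
def gpu_utilization(batch_size, flops_per_sample, bytes_per_sample,
                    gpu_flops=1e13,         # assumed compute throughput, FLOP/s
                    bus_bytes_per_s=1e10,   # assumed host<->GPU bandwidth, bytes/s
                    fixed_overhead_s=1e-4): # assumed per-batch launch/copy latency, s
    """Fraction of time spent computing under a toy cost model.

    Each batch pays a fixed overhead plus a copy cost proportional to its
    size (loading and unloading the car), then does the actual math
    (the drive).
    """
    compute_s = batch_size * flops_per_sample / gpu_flops
    transfer_s = fixed_overhead_s + batch_size * bytes_per_sample / bus_bytes_per_s
    return compute_s / (compute_s + transfer_s)


if __name__ == "__main__":
    # Hypothetical workloads: a tiny net on tiny inputs, then a bigger
    # batch, then a much bigger network per sample.
    cases = [
        ("tiny net, batch 32", 32, 1e6),
        ("tiny net, batch 4096", 4096, 1e6),
        ("big net, batch 32", 32, 1e9),
    ]
    for label, batch, flops in cases:
        util = gpu_utilization(batch, flops, bytes_per_sample=3136)
        print(f"{label}: {util:.1%} of time spent computing")
```

Under these assumptions, utilization goes up with both batch size and work per sample, which is the "more boxes" and "drive further" part. On a unified-memory system the copy term might shrink or vanish, but this model does not try to capture that.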