Hacker News new | ask | show | jobs
by yobbo 2 days ago
Normal use means loading data into the GPU for each batch. The load is not even, though training might be worse than "production".