Hacker News new | ask | show | jobs
by NaiveBayesian 1056 days ago
If your data loading pipeline grows even slightly complex, then yes, you absolutely need concurrency in order to deliver your samples to the GPU fast enough.

The current workarounds to make this happen in python are quite ugly imho, e.g. Pytorch spawns multiple python processes and then pushes data between the processes through shared memory, which incurs quite some overhead. Tensorflow on the other hand requires you to stick to their Tensor-dsl so that it can run within their graph engine. If native concurrency were a thing, data loading would be much more straightforward to implement without such hacks.