by bmiranda 3322 days ago
Google's hardware is for inference, not training.
3 comments

Volta is for both inference and training, but has an emphasis on inference.
Thanks for clarifying.
It doesn't matter; the operations are the same in forward and backward mode.

"Made for inference" just means "too slow for training" if you are pessimistic or "optimized for power efficiency" if you are optimistic.

Otherwise, training and inference are basically the same.

You can do inference pretty easily with 8-bit fixed point weights. Now attempt doing the same during training.
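To make the inference half of that concrete, here is a minimal sketch of symmetric 8-bit quantization applied to one dot product. The weights, inputs, and the `quantize` helper are made up for illustration and are not taken from any particular hardware or library.

```python
def quantize(weights, bits=8):
    """Symmetric fixed-point quantization: integer codes plus one shared scale."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each weight to the nearest code and clamp to the signed range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical weights and inputs, chosen only for illustration.
weights = [0.42, -1.30, 0.07, 0.88]
inputs = [1.0, -0.5, 2.0, 0.25]

codes, scale = quantize(weights)
exact = dot(weights, inputs)            # full-precision result
approx = scale * dot(codes, inputs)     # result using the 8-bit codes
rel_error = abs(exact - approx) / abs(exact)
print(codes, exact, approx, rel_error)
```

For these values the quantized result lands within a fraction of a percent of the full-precision one, which is why a forward pass tolerates 8-bit weights fairly well.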

Training and inference are only similar at a high level, not in actual application.

... because the gradient that is being followed may have a lower magnitude than can be represented in the lower precision.
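A small sketch of that effect, assuming a hypothetical fixed-point format with 7 fractional bits (step size 1/128) and made-up values for the learning rate and gradient: each update is smaller than half a step, so it rounds away and the low-precision weight never moves.

```python
def to_fixed(value, step):
    """Round a real value to the nearest multiple of `step`."""
    return round(value / step) * step


step = 1.0 / 128      # assumed resolution: 7 fractional bits
lr = 0.01             # hypothetical learning rate
gradient = 0.05       # hypothetical constant gradient

# Each update is lr * gradient = 0.0005, below half a step (about 0.0039).
w_low = to_fixed(0.5, step)   # weight stored in the low-precision format
w_high = 0.5                  # same weight kept in floating point

for _ in range(100):
    w_low = to_fixed(w_low - lr * gradient, step)   # update is rounded away
    w_high = w_high - lr * gradient                 # update accumulates

print(w_low, w_high)
```

After 100 steps the floating-point weight has moved to about 0.45, while the fixed-point weight is still exactly 0.5.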
You also need a few other operations for training, such as transpose, which may or may not be fast in a particular implementation.

(ETA: In case it's not obvious, I'm agreeing with david-gpu's comment, and adding more reasons that training currently differs from inference.)
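For the transpose point, a minimal sketch of a single linear layer y = W x: the forward pass only reads W as stored, while backpropagating to the input multiplies by the transpose of W. The matrix and vectors are made-up values, and `matvec` and `transpose` are illustrative helpers.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


# Hypothetical 2x3 weight matrix and input vector.
W = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x = [1.0, 0.0, -1.0]

# Forward pass (all that inference needs): y = W x
y = matvec(W, x)

# Backward pass (training only), given the gradient of the loss w.r.t. y:
grad_y = [1.0, 0.5]
grad_x = matvec(transpose(W), grad_y)               # W^T grad_y
grad_W = [[gy * xi for xi in x] for gy in grad_y]   # outer product grad_y x^T

print(y, grad_x, grad_W)
```

Hardware that lays W out for fast row-wise reads in the forward pass may or may not be able to read it column-wise just as quickly for the `transpose(W)` step.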