Hacker News
by why_only_15 1495 days ago
The ANE only supports calculations in fp16, int16, and int8, all of which are too small to train with (too much instability). A common approach is to train in fp32 so that small differences and gradients are preserved, and then, once the model is frozen, run inference in fp16 or bf16.
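A minimal sketch of the instability being described, using only Python's standard library. The `fp16` helper is a hypothetical name introduced here; it rounds a value to half precision via the `struct` module's `"e"` format. The specific numbers are illustrative assumptions, not measurements from any real model.

```python
import struct

def fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# A small weight update vanishes: fp16 values near 1.0 are spaced about 0.001 apart.
weight = fp16(1.0)
update = fp16(1e-4)
updated = fp16(weight + update)
print(updated == weight)   # True: the update was rounded away

# A tiny gradient underflows: the smallest positive fp16 value is about 6e-8.
tiny_grad = fp16(1e-8)
print(tiny_grad)           # 0.0
```

Both effects compound over many training steps, which is roughly what "too much instability" refers to.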

With mixed precision training you can do most operations in fp16 and just a few in fp32 where it's needed. This is the norm for NVIDIA GPU training nowadays. For instance, with fastai you can add `.to_fp16()` after your learner call and it happens automatically.
How is the choice between fp16 and fp32 made? Is it like if any gradients in the tensor need the extra range you use fp32?
This article [0] from Nvidia gives a good overview of how mixed precision training works.

Super high level (from section 3):

  1. Converting the model to use the float16 data type where possible.
  2. Keeping float32 master weights to accumulate per-iteration weight updates.
  3. Using loss scaling to preserve small gradient values.
[0] https://docs.nvidia.com/deeplearning/performance/mixed-preci...
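The three steps above can be sketched as a toy, stdlib-only simulation. This is not NVIDIA's or any framework's implementation: `fp16`, `fp32`, and `train` are hypothetical names, the single-weight "model" with a constant gradient is an assumption made for illustration, and the values are powers of two so the rounding is exact.

```python
import struct

def fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def train(steps: int, loss_scale: float, fp32_master: bool) -> float:
    """Toy SGD on one weight with a constant, tiny gradient (assumed setup)."""
    true_grad = 2.0 ** -26    # rounds to zero in fp16 unless it is scaled up
    lr = 0.125
    master = fp32(2.0 ** -6)  # step 2: float32 master copy of the weight
    w16 = fp16(master)        # step 1: float16 working copy used for compute
    for _ in range(steps):
        # Step 3: the loss is scaled, so the fp16 gradient is scaled by the same factor.
        grad16 = fp16(true_grad * loss_scale)
        grad = fp32(grad16 / loss_scale)  # unscale in float32 before the update
        if fp32_master:
            master = fp32(master - fp32(lr * grad))  # accumulate in float32
            w16 = fp16(master)                       # refresh the fp16 copy
        else:
            w16 = fp16(w16 - fp16(lr * grad))        # update directly in fp16
    return w16

start = 2.0 ** -6
print(train(4096, loss_scale=1024.0, fp32_master=True) < start)    # True: weight moved
print(train(4096, loss_scale=1.0, fp32_master=True) == start)      # True: gradient underflowed
print(train(4096, loss_scale=1024.0, fp32_master=False) == start)  # True: updates rounded away
```

In this toy setup, dropping either the loss scaling or the float32 master copy leaves the weight stuck at its starting value.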
The PyTorch docs give a pretty good overview of AMP here https://pytorch.org/tutorials/recipes/recipes/amp_recipe.htm... and an overview of which operations cast to which dtype can be found here https://pytorch.org/docs/stable/amp.html#autocast-op-referen....

Edit: Fixed second link.
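As a rough illustration of why a per-operation dtype policy makes sense, here is a minimal stdlib-only sketch of a reduction going wrong in fp16. It does not use PyTorch and says nothing about how autocast is implemented; `fp16`, `fp32`, and `running_sum` are hypothetical names introduced here.

```python
import struct

def fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def running_sum(values, round_fn):
    """Accumulate values, rounding the running total to the given precision each step."""
    total = 0.0
    for v in values:
        total = round_fn(total + v)
    return total

ones = [1.0] * 4096
print(running_sum(ones, fp16))  # 2048.0: fp16 cannot represent 2049, so the sum stalls
print(running_sum(ones, fp32))  # 4096.0
```

Long accumulations like this are the kind of operation that benefits from float32, while something like a matrix multiply on hardware with fp16 support is usually a good fit for the lower precision.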