Hacker News
by YetAnotherNick 1086 days ago
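bf16 is generally easier to train neural networks with than fp16, because it doesn't need loss scaling: bf16 has the same 8 exponent bits as fp32, so it covers the same dynamic range, while fp16 has only 5 and small gradients underflow to zero. And most model training and inference performs about the same with bf16 as with fp32.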
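A rough sketch of why the scaling issue shows up with fp16 and not bf16, using only the Python standard library. The helper names `to_fp16` and `to_bf16` are made up for illustration, and the bf16 emulation truncates the low 16 bits of the fp32 encoding instead of rounding to nearest, which real hardware typically does, so treat the exact values as approximate.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to IEEE half precision (fp16) and back via struct's 'e' format."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values beyond the fp16 range; treat that as overflow to inf.
        return float("inf") if x > 0 else float("-inf")


def to_bf16(x: float) -> float:
    """Emulate bf16 by keeping only the top 16 bits of the fp32 encoding.

    Assumption: truncation instead of round-to-nearest, for simplicity.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


if __name__ == "__main__":
    tiny_grad = 1e-8      # a small gradient-like value
    big_act = 70000.0     # a value above the fp16 maximum of 65504

    print("fp16 tiny:", to_fp16(tiny_grad))   # underflows to 0.0
    print("bf16 tiny:", to_bf16(tiny_grad))   # stays nonzero
    print("fp16 big: ", to_fp16(big_act))     # overflows to inf
    print("bf16 big: ", to_bf16(big_act))     # stays finite, with coarse precision

    # The trade-off: bf16 keeps fewer mantissa bits (7 vs 10 for fp16).
    print("fp16 1.001:", to_fp16(1.001))
    print("bf16 1.001:", to_bf16(1.001))
```

The small value vanishes in fp16 unless you multiply the loss by some factor first, which is what loss scaling does. In bf16 it survives without that step, at the cost of fewer mantissa bits.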