by rdedev 903 days ago
Ah, my bad. I am using mixed precision training in my previous comment.
You might find this paper interesting:
https://arxiv.org/pdf/2010.06192.pdf