Hacker News new | ask | show | jobs
by jamesk_au 3431 days ago
"It is our goal to train a trillion-parameter model on a trillion-word corpus. We have not scaled our systems this far as of the writing of this paper, but it should be possible by adding more hardware."