by freeqaz 1180 days ago
Taking a paper and turning it into working production code is a non-trivial process, 100%.

Training big models takes a lot of random reads/writes, and those tend to be pretty latency sensitive. There _may_ be a way to train this BitTorrent style with donated compute, but it's hard to say how many orders of magnitude slower that would be. (Do you need 2x more compute to do it distributed? 10x? 100x?)
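
To make the "how many x slower" question concrete, here's a rough back-of-envelope sketch. It assumes a naive setup where every step ships the full gradient over one link with no overlap with compute and no compression. The function name, the cost model, and every number in it are my own illustrative assumptions, not measurements from any real system.

```python
def slowdown_factor(param_count, bytes_per_param, bandwidth_bytes_per_s,
                    latency_s, compute_s_per_step):
    """Ratio of (compute + gradient exchange) time to compute-only time per step.

    Hypothetical model: one full gradient transfer per step, no overlap
    between communication and compute, no compression.
    """
    comm_s = latency_s + (param_count * bytes_per_param) / bandwidth_bytes_per_s
    return (compute_s_per_step + comm_s) / compute_s_per_step


# Assumed inputs: 1B parameters, 2 bytes each, 1 second of compute per step.
home_link = slowdown_factor(10**9, 2, 12.5e6, 0.05, 1.0)    # ~100 Mbit/s, 50 ms latency
datacenter = slowdown_factor(10**9, 2, 12.5e9, 1e-5, 1.0)   # ~100 Gbit/s, 10 us latency
print(f"home link: {home_link:.2f}x, datacenter: {datacenter:.2f}x")
```

Under those made-up inputs the home link lands in the hundreds-of-x range while the datacenter link stays close to 1x. A real design would overlap communication with compute, compress gradients, or sync less often, so treat this as a loose bound for the naive case and not a prediction.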

It's an interesting question, and it would be great to see this space explored more!