Hacker News new | ask | show | jobs
by trust07007707 2443 days ago
Well, the memory limit of a WebGPU process would be the limiting factor for training. In addition, the bandwidth between the nodes and the parameter server, if doing training in data-parallel fashion, is another limiting factor.