Hacker News
by fidotron 1035 days ago
I think the fact that Google curiously ignored all the security problems raised by the WebGPU API suggests they are closer than people think to offloading the GPU inference part of this to end users.

Building as much of the model as you can in the cloud, running inference locally, and pushing the results back is probably the cost-optimal way to run this stuff at scale.