by ijpsud 2192 days ago
I was playing with tf.js last year, when some core components were still being converted to WASM. Switching to the WASM versions gave me some big performance boosts. I also ran into some WebGL-related bugs. It seems that they're currently in the process of writing WebGPU-based back-ends. I'm hoping that once WebGPU is on by default in most browsers, and there's been a year or two to sort out the major teething problems, we'll see some widespread uptake.

Internet speeds are still increasing exponentially, so hopefully the model size problems will become less of an issue, perhaps aided by some CDN-served (and thus cached across domains) "base" models that are fine-tuned with some parameters downloaded from the server. I think I could start playing with it seriously if I could get the model sizes under 30 MB or so. In a few years (with increasing internet speeds) that might be 50 MB. I think Hugging Face's distilled GPT-2 model is a couple of hundred megabytes, for reference, so we're certainly not going to be doing anything revolutionary in the browser, but I have a bunch of neat little ideas that I think would be useful.
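
To put rough numbers on those size thresholds, here is a back-of-the-envelope sketch in TypeScript. The link speeds are purely my own assumptions for illustration, not measurements, and `downloadSeconds` is a hypothetical helper that ignores latency, compression and caching.

```typescript
// Rough download-time estimate for a model file.
// sizeMB: file size in megabytes (10^6 bytes).
// linkMbps: sustained throughput in megabits per second.
function downloadSeconds(sizeMB: number, linkMbps: number): number {
  if (sizeMB < 0 || linkMbps <= 0) {
    throw new RangeError("size must be >= 0 and link speed must be > 0");
  }
  // 1 byte = 8 bits, so megabytes * 8 = megabits.
  return (sizeMB * 8) / linkMbps;
}

// Assumed link speeds (Mbit/s), chosen only for illustration.
const assumedLinkSpeedsMbps = [10, 50, 100];

// Model sizes taken from the thresholds mentioned above.
const modelSizesMB = [30, 50];

for (const mbps of assumedLinkSpeedsMbps) {
  for (const size of modelSizesMB) {
    const secs = downloadSeconds(size, mbps).toFixed(1);
    console.log(`${size} MB at ${mbps} Mbit/s: about ${secs} s`);
  }
}
```

On an assumed 10 Mbit/s link, 30 MB already takes about 24 seconds, which is why I'd want models at or below that size before treating in-browser inference as practical.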

For bringing the "big" stuff to the web we're probably stuck with APIs, like OpenAI's new GPT-3 offering. I think access to SOTA models on hard problems is going to be almost exclusively via APIs for the foreseeable future.