by xenova 796 days ago
We’ve put out a ton of demos that use much smaller models (10-60 MB), including:

- (44MB) In-browser background removal: https://huggingface.co/spaces/Xenova/remove-background-web. (We also put out a WebGPU version: https://huggingface.co/spaces/Xenova/remove-background-webgp...).

- (51MB) Whisper Web for automatic speech recognition: https://huggingface.co/spaces/Xenova/whisper-web (just select the quantized version in settings).

- (28MB) Depth Anything Web for monocular depth estimation: https://huggingface.co/spaces/Xenova/depth-anything-web

- (14MB) Segment Anything Web for image segmentation: https://huggingface.co/spaces/Xenova/segment-anything-web

- (20MB) Doodle Dash, an ML-powered sketch detection game: https://huggingface.co/spaces/Xenova/doodle-dash

… and many, many more! Check out the Transformers.js demos collection for some others: https://huggingface.co/collections/Xenova/transformersjs-dem....

Models are cached on a per-domain basis (using the Web Cache API), meaning you don’t need to re-download the model on every page load. If you would like to persist the model across domains, you can create browser extensions with the library! :)
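For anyone curious what cache-first loading looks like in general, here is a minimal sketch. It is not the library's actual code: `CacheLike`, `MemoryCache`, and `loadWithCache` are hypothetical names made up for illustration, and the in-memory class only stands in for a browser cache so the snippet runs outside a browser too.

```typescript
// Hypothetical minimal shape of the two Cache API methods this sketch relies on.
// A browser Cache obtained from `caches.open(...)` exposes match() and put() too.
interface CacheLike<T> {
  match(key: string): Promise<T | undefined>;
  put(key: string, value: T): Promise<void>;
}

// Illustrative in-memory stand-in for a cache (an assumption for this sketch,
// not part of any library).
class MemoryCache<T> implements CacheLike<T> {
  private store = new Map<string, T>();

  async match(key: string): Promise<T | undefined> {
    return this.store.get(key);
  }

  async put(key: string, value: T): Promise<void> {
    this.store.set(key, value);
  }
}

// Cache-first load: return the cached copy if there is one, otherwise
// download once, store the result, and return it.
async function loadWithCache<T>(
  url: string,
  cache: CacheLike<T>,
  download: (url: string) => Promise<T>,
): Promise<T> {
  const hit = await cache.match(url);
  if (hit !== undefined) {
    return hit; // served from cache, no network request
  }
  const fresh = await download(url);
  await cache.put(url, fresh);
  return fresh;
}
```

In a browser you would pass something like `await caches.open("models")` (the cache name is made up here) plus a `fetch`-based download, and store `response.clone()` rather than the response itself so its body can still be read. Browser cache storage is scoped to the page's origin, which is why the same model gets downloaded again on a different domain.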

As for your last point, there are efforts underway, but nothing I can speak about yet!

2 comments

Why is only one of them on WebGPU? Is it because there are additional tricky steps required to make a model work on WebGPU, or is there a limitation on what ops are supported there?

I'm keen to do more stuff with WebGPU, so very interested to learn about challenges and limitations here.

We have some other WebGPU demos, including:

- WebGPU embedding benchmark: https://huggingface.co/spaces/Xenova/webgpu-embedding-benchm...

- Real-time object detection: https://huggingface.co/spaces/Xenova/webgpu-video-object-det...

- Real-time background removal: https://huggingface.co/spaces/Xenova/webgpu-video-background...

- WebGPU depth estimation: https://huggingface.co/spaces/Xenova/webgpu-depth-anything

- Image background removal: https://huggingface.co/spaces/Xenova/remove-background-webgp...

You can follow the progress for full WebGPU support in the v3 development branch (https://github.com/xenova/transformers.js/pull/545).

To answer your question: while there are certain ops missing, the main limitation at the moment is models with decoders, which are not very fast (yet) due to inefficient buffer reuse and many redundant copies between CPU and GPU. We're working closely with the ONNX Runtime (ORT) team to fix these issues, though!

Thank you for the reply. Seems like all of the links are down at the moment, but it does sound a bit more feasible for some applications than I had assumed.

Really glad to hear the last part. Some of the new capabilities seem fundamental enough that they ought to be in browsers, in my opinion.

Odd, the links seem to work for me. What error do you see? Can you try on a different network (e.g., mobile)?

Error is "xenova-segment-anything-web.static.hf.space unexpectedly closed the connection."

Works on mobile network, though, so might just be my internet connection.