Hacker News new | ask | show | jobs
by mrkn1 33 days ago
I made something very similar 2 weeks ago, re the upcoming OpenAI phone.

https://news.ycombinator.com/item?id=48040327

2 comments

The image processing is neat. Local model ran in the browser?
thank you! actually it's an API call to a VL model on Deepinfra (model is Qwen3-VL-30B-A3B-Instruct)
This is really neat, and disturbing.
thank you and alas yes, the image understand is the only LLM, the rest has been available on browsers through js since the 2000s