Hacker News
by simonw 355 days ago
There are some spectacular local models for generating text descriptions of images now. I suggest starting with Mistral Small 3.2, Gemma 3 and Qwen2.5-VL, all of which are available via Ollama.
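
For anyone who wants to try this, here is a rough sketch of asking a locally running Ollama server to describe an image. It assumes Ollama is listening on its default port (11434) and that a vision-capable model has already been pulled. The model tag `gemma3`, the file name `photo.jpg`, and the helper names `build_payload` and `describe_image` are placeholders of mine, not anything from the comment above; check the Ollama library for the exact tag of the model you want.

```python
import base64
import json
import urllib.request

# Default local Ollama endpoint (assumption: a stock install on this machine).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes, model="gemma3", prompt="Describe this image in detail."):
    """Build the JSON body for /api/generate; images are sent base64-encoded."""
    return {
        "model": model,  # placeholder tag, swap in whichever vision model you pulled
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a stream
    }


def describe_image(path, model="gemma3"):
    """Send one image to the local server and return the generated description."""
    with open(path, "rb") as f:
        payload = build_payload(f.read(), model=model)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    # Hypothetical file name; requires a running Ollama server.
    print(describe_image("photo.jpg"))
```

Swapping the `model` argument is all it should take to compare the models against each other on the same image.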

I expect we will see a Qwen3-VL soon.