Hacker News new | ask | show | jobs
by jauntywundrkind 491 days ago
I'd really like to play with Qwen2.5-VL at some point, perhaps for reading data-sheets for microchips. Nicely for some applications, it's also very good at reporting position of what it finds, which many ML tools are pretty mediocre at. https://qwenlm.github.io/blog/qwen2.5-vl/

Not really this application, but QvQ for visual reasoning is also impressive. https://qwenlm.github.io/blog/qvq-72b-preview/

Meta has used Qwen as the basis for their Apollo research. https://arxiv.org/abs/2412.10360

1 comments

Is Qwen2.5-VL on Ollama? Could give it a try with a few of the schemas we have.

We’ve locally tested with Llama 3.2 11B Vision on Ollama: https://github.com/vlm-run/vlmrun-hub/blob/main/tests/benchm...

FWIW I think Ollama structured outputs API is quite buggy compared to the HF transformers variant.