by kaoD 640 days ago
But Qwen is not multimodal, or is it?
1 comment

https://qwen2.org/vl/

>Qwen2-VL is the latest addition to the vision-language models in the Qwen series, building upon the capabilities of Qwen-VL. Compared to its predecessor, Qwen2-VL offers:

>State-of-the-Art Image Understanding

>Extended Video Comprehension

Besides, it'd have been pretty silly for them to mention it on their slides if it wasn't.