Hacker News
by
famouswaffles
980 days ago
Oh wow. This seems to be the best VLM released so far. The chart/UI understanding on display is particularly superb.
GaggiX
980 days ago
>This is by far the best open source vlm model
LLaVA 1.5 is very good, at least at describing images.
http://llava.hliu.cc/
axiom92
980 days ago
Right, but having no separate image encoder and being half the size could be very helpful for many applications.
GaggiX
980 days ago
The 7B LLaVA model is smaller, even considering the image encoder (CLIP-L).