by brianjking 5 days ago
There is a newer BLIP-2, but it's also fairly old by now. You're better off with one of the many other local models, such as Moondream 3: https://huggingface.co/moondream/moondream3-preview

Moondream is great because it can point at objects, count them, draw bounding boxes, write descriptions, and do visually grounded reasoning.