by unshavedyak 1015 days ago
Well... damn. Is there a framework like this (or this directly?) which can run object detection? People, car types, makes, animals, etc?

Yes, GroundingDINO is an open-set object detector. There are others as well (e.g. DETIC and OWL-ViT).

We’ve been working on using them (often in conjunction with SAM) to auto-label datasets for training smaller, faster models that can run in real time at the edge: https://github.com/autodistill/autodistill

Would this be suitable for labeling images to search by keyword (think Apple Photos-like “car” searches that pull up photos of cars)?
I think you would want to use something like CLIP embeddings for image search.
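
To make the idea concrete, here is a rough sketch of the ranking step only. It assumes you have already run a CLIP model to get one embedding per photo and one embedding for the query text; the 3-d vectors, the file names, and the `search` helper below are made-up placeholders for illustration, not anything from CLIP itself.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def search(query_embedding, image_embeddings, top_k=3):
    """Return the names of the top_k images most similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, emb), name)
        for name, emb in image_embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]

# Toy vectors standing in for real CLIP image embeddings (hypothetical data).
image_embeddings = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "sedan.jpg": [0.1, 0.9, 0.2],
    "truck.jpg": [0.2, 0.8, 0.5],
}

# Pretend this is the CLIP text embedding for the query "car".
query_embedding = [0.0, 1.0, 0.2]

results = search(query_embedding, image_embeddings, top_k=2)
print(results)
```

The point is that nothing needs to be labeled ahead of time: text and images land in the same embedding space, so a keyword search is just a nearest-neighbor lookup. For a large photo library you would likely swap the linear scan for an approximate nearest-neighbor index.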

Really enjoyed using this app for iOS: https://github.com/mazzzystar/Queryable (HN discussion: https://news.ycombinator.com/item?id=34686947)

Or explore the dataset Stable Diffusion was trained on: https://news.ycombinator.com/item?id=32655497