Hacker News
LocateAnything: Fast Vision-Language Grounding with Parallel Box Decoding (research.nvidia.com)
2 points by gmays 7 days ago