
by andblac 659 days ago

Skimming through the source, it seems to run 'car' and 'person' objects through llava with the following prompts:

- "person": "get gender and age of this person in 5 words or less",

- "car": "get body type and color of this car in 5 words or less".

So YOLO gives the bounding box and rough category, while llava describes the object in more detail.
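A rough sketch of that two-stage flow, as I read it, not the project's actual code. The two prompts are the ones quoted above; everything else (`Detection`, `crop`, `describe_detections`, the `describe` callback standing in for the llava call, and cropping to the box before captioning) is a hypothetical name or assumption for illustration. The image is just a nested list of pixel rows to keep it self-contained.

```python
from dataclasses import dataclass

# Prompts quoted from the comment above, keyed by detector class label.
PROMPTS = {
    "person": "get gender and age of this person in 5 words or less",
    "car": "get body type and color of this car in 5 words or less",
}


@dataclass
class Detection:
    """Hypothetical container for one detector output."""
    label: str
    box: tuple  # (x1, y1, x2, y2) in pixel coordinates


def crop(image, box):
    """Cut the bounding box out of an image stored as a list of pixel rows."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def describe_detections(image, detections, describe):
    """Stage 2: send each 'person'/'car' crop to a captioning model.

    `describe(crop, prompt)` is a placeholder for the llava call;
    detections with no matching prompt are skipped.
    """
    results = []
    for det in detections:
        prompt = PROMPTS.get(det.label)
        if prompt is None:
            continue  # class has no prompt, so it is not described
        results.append((det.label, det.box, describe(crop(image, det.box), prompt)))
    return results
```

In use, the detector (YOLO here) would produce the `Detection` list and `describe` would wrap whatever serves llava; swapping either side out doesn't change the loop.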