by andblac
611 days ago
Skimming through the source, it seems to run 'car' and 'person' objects through llava with the following prompts: - "person": "get gender and age of this person in 5 words or less", - "car": "get body type and color of this car in 5 words or less". So YOLO gives the bounding box and rough category, while llava describes the object in more detail.
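
For anyone who wants the shape of that flow, here is a rough Python sketch. It is not the project's actual code: `Detection`, `describe_detections`, and the `describe` callback are hypothetical names, and skipping every label other than 'person' and 'car' is my assumption. Only the two prompt strings come from the source.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple

Box = Tuple[int, int, int, int]


class Detection(NamedTuple):
    """Hypothetical container for one detector result (label plus bounding box)."""
    label: str
    box: Box


# The two prompts quoted above, keyed by detector category.
PROMPTS = {
    "person": "get gender and age of this person in 5 words or less",
    "car": "get body type and color of this car in 5 words or less",
}


def describe_detections(
    detections: Iterable[Detection],
    describe: Callable[[Box, str], str],
) -> List[Tuple[str, Box, str]]:
    """Pair each 'person' or 'car' detection with a short description.

    `describe` stands in for the vision-language model call (llava in the
    comment above); it receives the bounding box and the prompt text.
    """
    results = []
    for det in detections:
        prompt = PROMPTS.get(det.label)
        if prompt is None:
            # Assumption: categories without a prompt are not described.
            continue
        results.append((det.label, det.box, describe(det.box, prompt)))
    return results
```

In practice `describe` would crop the frame to the box and send the crop plus the prompt to the model, while the detector supplies the `Detection` list.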