by tfsh
384 days ago
Perhaps you missed the associated documentation? This is a classification tool limited to a fixed set of labels: it "uses an EfficientNet architecture and was trained using ImageNet to recognize 1,000 classes, such as trees, animals, food, vehicles". The full list [1] doesn't seem to include a human. You can tweak the score threshold to reduce false positives.
[1] https://storage.googleapis.com/mediapipe-tasks/image_classif...
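To illustrate the threshold point, here is a minimal sketch in plain Python. It does not call MediaPipe at all: the `(label, score)` pairs, the `filter_by_score` helper, and the 0.5 cutoff are made up for illustration, and real scores would come from the classifier's output.

```python
from typing import List, Tuple

# (label, confidence score in [0, 1]); hypothetical shape for this sketch
Prediction = Tuple[str, float]

def filter_by_score(predictions: List[Prediction], threshold: float) -> List[Prediction]:
    """Keep predictions scoring at least `threshold`, highest score first."""
    kept = [p for p in predictions if p[1] >= threshold]
    return sorted(kept, key=lambda p: p[1], reverse=True)

# Made-up scores, not real model output
raw = [("suit", 0.12), ("golden retriever", 0.81), ("tennis ball", 0.55)]

# A low-confidence guess like "suit" is dropped once the cutoff is raised
print(filter_by_score(raw, 0.5))
```

Raising the cutoff trades recall for fewer spurious labels, so the right value depends on how costly a false positive is for your use case.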
|
Did you also try it on items from the list?
Even when there is a match (and that is not frequent), the confidence still looks very low to me (like noise or luck).
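A quick way to check whether something is in the label set before testing it, as a rough sketch. It assumes the label list has been saved as plain text with one label per line; the `sample_labels` below are made up and stand in for the real file, and `in_label_set` is a hypothetical helper.

```python
from typing import Iterable

def in_label_set(query: str, labels: Iterable[str]) -> bool:
    """Case-insensitive exact match of `query` against the label list."""
    wanted = query.strip().lower()
    return any(wanted == label.strip().lower() for label in labels)

# Made-up stand-in for the real label file (assumed: one label per line)
sample_labels = ["goldfish", "tennis ball", "Golden Retriever"]

print(in_label_set("golden retriever", sample_labels))
print(in_label_set("person", sample_labels))
```

If the query is not in the list, any prediction for it is necessarily some other class, so a low score is the expected outcome rather than a bug.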
It seems to be a repackaging of https://blog.tensorflow.org/2020/03/higher-accuracy-on-visio...
So it's an old release from 5 years ago (a very long time in the AI world), and AFAIK it has been superseded by YOLO-NAS and other models. MediaPipe feels like a really old tool, except for some specific subtasks like face tracking.
And as a side note, the OKR system at Google is a very serious thing, and there are a lot of people internally gaming the system. That could explain why it is a "new" launch instead of a rather disappointing rebrand of the 2020 version.
I'd recommend building on more modern tools instead, such as: https://huggingface.co/spaces/HuggingFaceTB/SmolVLM-256M-Ins... (runs on iPhone with < 1GB of memory)