|
|
|
|
|
by yavorgiv
1015 days ago
|
|
I am the creator of the repo. Depends on the machine, number of threads selected and the model checkpoint used (Vit-B or Vit-L or Vit-B). The video demo attached is running on Apple M2 Ultra and using the Vit-B model. The generation of the image embedding takes ~1.9s there and all the subsequent mask segmentations take ~45ms. However, I am now focusing on improving the inference speed by making better use of ggml and trying out quantization. Once I make some progress in this direction I will compare to other SAM alternatives and benchmark more thoroughly. |
|