|
|
|
|
|
by warangal
1120 days ago
|
|
For what it is worth, we work on a tool[0] to index all local videos and images and later allowing query just using natural language. It is based on CLIP which has been trained on image-text pairs, but seems to work great for videos after applying some naive heuristics. [0] https://github.com/ramanlabs-in/hachi |
|