by warangal
879 days ago
It is quite possible that the B variant is not enough for some scenarios; I only tested with the B variant. An earlier version also included video search, and the frames used for indexing were sometimes blurry (lacking fine detail). Such frames would generally get a higher score for naive natural-language queries. I resolved that problem up to a point by adding a linear layer trained to discard such frames, which was less costly than running a bigger variant for my use case.