Hacker News
by icyfox 76 days ago
As far as I've seen, local OSS video understanding models just really aren't there yet. I briefly looked at facial recognition models but a good amount of signal was actually in the video's audio instead of the raw video frames. Depends on the accuracy you're looking for at the end of the day.

Thanks for the reply. Let's hope local models catch up.