by rikroots 694 days ago
After playing with the SAM2 demo for far too long, my immediate thought was: this would be brilliant for things like (accessible, responsive) interactive videos. I've coded up such a thing before[1] but that uses hardcoded data to track the position of the geese, and a filter to identify the swans. When I loaded that raw video into the SAM2 demo it had very little problem tracking the various birds - which would make building the interactivity on top of it very easy, I think.

Sadly, my knowledge of how to make use of these models is limited to what I learned playing with some (very ancient) MediaPipe and TensorFlow models. Those models came with some WASM code to run the model in the browser, and I was able to pull the output data from that and pipe it through to my canvas effects[2]. I'd love to get something similar working with SAM2!

[1] - https://scrawl-v8.rikweb.org.uk/demo/canvas-027.html

[2] - https://scrawl-v8.rikweb.org.uk/demo/mediapipe-003.html