by featureofone
215 days ago
The SAM models are great. I used the latest version when building VideoVanish (https://github.com/calledit/VideoVanish), a video-editing GUI for making objects vanish from videos. That used SAM 2, and in my experience SAM 2 was more or less perfect; I didn't really see the need for a SAM 3. Maybe it could have been better at segmenting without input. But the new text prompt input seems nice; it's much easier to automate stuff using text input.
|
I've been considering building something similar but focused on static stuff like watermarks, so just a single mask. From that DiffuEraser page it seems performance is brutally slow, at less than 1 fps on 720p.
For watermarks you can use an ffmpeg blur, which is of course super fast and looks good on content that is mostly uniform, like a sky, but looks terrible and very obvious on most backgrounds. I've gotten really good results on videos shot with static cameras by generating a single inpainted frame and then using that as the "cover", cropped and blurred, over the watermark, or over any object really. The results are even better if you fully stabilize the video first and balance the color when it drifts slightly over time. This of course only works if nothing moving intersects the removed target; if something does, or if the camera is moving, you need every frame inpainted.
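A rough sketch of those two fast paths as ffmpeg invocations built from Python. The function names and file paths are made up for illustration; the `crop`, `boxblur`, and `overlay` filters are standard ffmpeg filters, but treat the exact options as a starting point rather than a tuned recipe. The cover image is assumed to be already cropped to the watermark box.

```python
def blur_region_cmd(src, dst, x, y, w, h, radius=10):
    """Build an ffmpeg command that blurs one fixed rectangle (hypothetical helper)."""
    # boxblur limits the radius to half the smaller plane dimension. Chroma planes
    # are half-size in 4:2:0 video, so clamp conservatively to a quarter of the box.
    radius = max(1, min(radius, min(w, h) // 4))
    graph = (
        f"[0:v]crop={w}:{h}:{x}:{y},boxblur={radius}[patch];"
        f"[0:v][patch]overlay={x}:{y}[out]"
    )
    return ["ffmpeg", "-i", src, "-filter_complex", graph,
            "-map", "[out]", "-map", "0:a?", "-c:a", "copy", dst]


def cover_region_cmd(src, cover_png, dst, x, y):
    """Build an ffmpeg command that pastes one pre-inpainted still over every frame."""
    # The still image has a single frame; overlay keeps repeating it by default,
    # so the same patch covers the box for the whole video.
    graph = f"[0:v][1:v]overlay={x}:{y}[out]"
    return ["ffmpeg", "-i", src, "-i", cover_png, "-filter_complex", graph,
            "-map", "[out]", "-map", "0:a?", "-c:a", "copy", dst]


if __name__ == "__main__":
    # Print the commands instead of running them; pass a list to subprocess.run to execute.
    print(" ".join(blur_region_cmd("in.mp4", "blurred.mp4", 100, 50, 200, 60)))
    print(" ".join(cover_region_cmd("in.mp4", "cover.png", "covered.mp4", 100, 50)))
```

The cover variant only holds up under the conditions above: a static (or stabilized) camera and nothing moving through the box.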
Thus far, all full video inpainting like this has been too slow to be practically useful, for example for casually removing watermarks from videos measured in tens of minutes instead of seconds, where I would really want processing to be close to realtime. I've wondered which knobs, if any, can be turned to sacrifice quality for performance. My main ideas are to automatically detect where that single-frame technique works and apply it to as much of the video as possible, then separately process all the other chunks with diffusion at some really small size like 240p, and finally run AI-based upscaling on those chunks, which seems to be fairly fast these days compared to diffusion.
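One way the chunking idea could look, as a minimal sketch. It assumes you already have a per-frame motion score for the masked region (say, the mean absolute pixel difference to the previous frame around the watermark box); how that score is computed, the threshold, and the names `plan_segments` and `downscaled_size` are all hypothetical.

```python
from itertools import groupby


def plan_segments(motion, threshold=0.02, min_static_len=30):
    """Split frames into 'cover' runs (single inpainted frame) and 'diffusion' runs.

    motion: one score per frame; low means nothing moves through the mask.
    Returns (start, end, mode) tuples with end exclusive.
    """
    labels = ["cover" if m <= threshold else "diffusion" for m in motion]
    runs, start = [], 0
    for label, group in groupby(labels):
        length = len(list(group))
        # Very short static runs are not worth a separate cover frame.
        if label == "cover" and length < min_static_len:
            label = "diffusion"
        if runs and runs[-1][2] == label:
            # Merge with the previous run of the same mode.
            runs[-1] = (runs[-1][0], start + length, label)
        else:
            runs.append((start, start + length, label))
        start += length
    return runs


def downscaled_size(width, height, target_height=240):
    """Pick a low-res working size, keeping aspect ratio and an even width."""
    new_width = int(round(width * target_height / height / 2)) * 2
    return new_width, target_height


if __name__ == "__main__":
    scores = [0.0] * 5 + [0.5] * 3 + [0.0] * 2 + [0.5] + [0.0] * 4
    print(plan_segments(scores, threshold=0.1, min_static_len=3))
    print(downscaled_size(1280, 720))
```

The "diffusion" runs would then be inpainted at the reduced size and upscaled afterwards, while the "cover" runs each need only one inpainted frame.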