| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by sorenjan 809 days ago

First of, ffmpeg is amazing, I'm very thankful to everyone involved in it.

> dnn filter libtorch backend

What's ffmpeg's plan regarding ML based filters? When looking through the filter documentation it seems like filters use three different backends: tensorflow, torch, and openvino. Doesn't seem optimal, is there any discussion about consolidating on one backend?

ML filters need model files, and the filters take a path to a model file as one of their arguments. This makes them really difficult to use, if you're lucky you can find a suitable model and download somewhere, otherwise you need to find a separate model training project and dataset and run that first. Are there any plans on streamlining ML filters and model handling for ffmpeg? Maybe a model file repository with an option of installing these in an official models path on the system?

Most image and video research use ML now, but I don't get the impression that ffmpeg tries to integrate the modern technologies well yet. Being able to do for instance spatial and temporal super resolution using standard ffmpeg filters would be a big improvement, and I think things like automatic subtitles using whisper would be a good fit too. But it should start with a coherent ML strategy regarding inference backend and model management.