Changelog:

- DXV DXT1 encoder
- LEAD MCMP decoder
- EVC decoding using external library libxevd
- EVC encoding using external library libxeve
- QOA decoder and demuxer
- aap filter
- demuxing, decoding, filtering, encoding, and muxing in the ffmpeg CLI now all run in parallel
- enable gdigrab device to grab a window using the hwnd=HANDLER syntax
- IAMF raw demuxer and muxer
- D3D12VA hardware accelerated H264, HEVC, VP9, AV1, MPEG-2 and VC1 decoding
- tiltandshift filter
- qrencode filter and qrencodesrc source
- quirc filter
- lavu/eval: introduce randomi() function in expressions
- VVC decoder (experimental)
- fsync filter
- Raw Captions with Time (RCWT) closed caption muxer
- ffmpeg CLI -bsf option may now be used for input as well as output
- ffmpeg CLI options may now be used as -/opt <path>, which is equivalent to -opt <contents of file <path>>
- showinfo bitstream filter
- a C11-compliant compiler is now required; note that this requirement will be bumped to C17 in the near future, so consider updating your build environment if it lacks C17 support
- Change the default bitrate control method from VBR to CQP for QSV encoders.
- removed deprecated ffmpeg CLI options -psnr and -map_channel
- DVD-Video demuxer, powered by libdvdnav and libdvdread
- ffprobe -show_stream_groups option
- ffprobe (with -export_side_data film_grain) now prints film grain metadata
- AEA muxer
- ffmpeg CLI loopback decoders
- Support PacketTypeMetadata of PacketType in enhanced flv format
- ffplay with hwaccel decoding support (depends on vulkan renderer via libplacebo)
- dnn filter libtorch backend
- Android content URIs protocol
- AOMedia Film Grain Synthesis 1 (AFGS1)
- RISC-V optimizations for AAC, FLAC, JPEG-2000, LPC, RV4.0, SVQ, VC1, VP8, and more
- Loongarch optimizations for HEVC decoding
- Important AArch64 optimizations for HEVC
- IAMF support inside MP4/ISOBMFF
- Support for HEIF/AVIF still images and tiled still images
- Dolby Vision profile 10 support in AV1
- Support for Ambient Viewing Environment metadata in MP4/ISOBMFF
- HDR10 metadata passthrough when encoding with libx264, libx265, and libsvtav1
> dnn filter libtorch backend
What's ffmpeg's plan regarding ML-based filters? Looking through the filter documentation, the filters seem to use three different backends: TensorFlow, Torch, and OpenVINO. That doesn't seem optimal. Is there any discussion about consolidating on one backend?
ML filters need model files, and each filter takes a path to a model file as one of its arguments. This makes them really difficult to use: if you're lucky you can find a suitable model to download somewhere; otherwise you need to find a separate model training project and dataset and run that first. Are there any plans to streamline ML filters and model handling in ffmpeg? Maybe a model file repository, with an option to install models in an official models path on the system?
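To make the "official models path" idea concrete, here is a rough Python sketch of what a wrapper could do today. To be clear, ffmpeg defines no such path: the directories in `DEFAULT_MODEL_DIRS`, the helper names, and the model file name `espcn.pb` are all made up for illustration. The only real parts are the `sr` filter and its `dnn_backend` and `model` options.

```python
from pathlib import Path

# Hypothetical search path: ffmpeg has no official model directory today.
DEFAULT_MODEL_DIRS = [
    Path.home() / ".local" / "share" / "ffmpeg" / "models",
    Path("/usr/share/ffmpeg/models"),
]


def resolve_model(name, search_dirs=None):
    """Return the first existing file called `name` in the search dirs."""
    dirs = DEFAULT_MODEL_DIRS if search_dirs is None else search_dirs
    for directory in dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"model {name!r} not found in {list(map(str, dirs))}")


def sr_filter(model_path, backend="tensorflow"):
    """Build the argument string for ffmpeg's sr (super resolution) filter.

    Note: paths containing ':' or other filtergraph special characters
    would need escaping; this sketch does not handle that.
    """
    return f"sr=dnn_backend={backend}:model={model_path}"


def build_command(src, dst, model_name, search_dirs=None):
    """Return an ffmpeg argv list that upscales `src` into `dst`."""
    model = resolve_model(model_name, search_dirs)
    return ["ffmpeg", "-i", src, "-vf", sr_filter(model), dst]
```

With something like this, `build_command("in.mp4", "out.mp4", "espcn.pb")` would find the model by name and hand the resulting list to `subprocess.run`, instead of every user hardcoding their own path. It only covers lookup, not the harder part, which is where the model files would come from.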
Most image and video research uses ML now, but I don't get the impression that ffmpeg integrates these modern techniques well yet. Being able to do, for instance, spatial and temporal super resolution using standard ffmpeg filters would be a big improvement, and I think things like automatic subtitles using Whisper would be a good fit too. But it should start with a coherent ML strategy regarding inference backend and model management.