by pandaforce 87 days ago
The main targets for this are NLEs like Blender. Performance is a large part of the issue. Most users still just create TIFF files per frame before importing them into a "real editor" like Resolve. Apple may have ASICs for ProRes decoding, and Resolve may be the standard editor that everyone uses.

But this goes beyond what even Apple has, by making it possible to work directly with compressed lossless video on consumer GPUs. You can get hundreds of FPS encoding or decoding 4K 16-bit FFV1 on a 4080, while only reading a few gigabits of video per second, rather than the tens or even hundreds of gigabits that SSDs can't keep up with. No need to have image degradation when passing intermediate copies between CG programs and editing either.
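To put rough numbers on the uncompressed side, here is a back-of-the-envelope sketch. It assumes 3840x2160 frames, three 16-bit channels with no chroma subsampling, and no container overhead; the function name and parameters are made up for illustration and are not from the post.

```cpp
#include <cassert>
#include <cstdint>

// Uncompressed video data rate in gigabits per second (1 Gbit = 1e9 bits).
// All inputs are illustrative assumptions, not measurements.
inline double raw_gigabits_per_second(
    std::uint64_t width, std::uint64_t height,
    std::uint64_t channels, std::uint64_t bits_per_channel, double fps)
{
  assert(fps > 0.0);
  const std::uint64_t bits_per_frame
      = width * height * channels * bits_per_channel;
  return static_cast<double>(bits_per_frame) * fps / 1e9;
}
```

Under those assumptions, `raw_gigabits_per_second(3840, 2160, 3, 16, 60.0)` comes out at roughly 23.9 Gbit/s, and 300 fps at roughly 119 Gbit/s, which is the tens-to-hundreds-of-gigabits range for uncompressed frames.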


Yep! Almost finished implementing support in https://ossia.io which is going to become the first open-source cross-platform real-time visuals software to support live scrubbing for VJ use cases, in 4K+ ProRes files on not that big of a GPU (tested on my laptop 3060) :)

How to feed MilkDrop music visualizations?

(MilkDrop3, projectm-visualizer/presets-cream-of-the-crop, westurner/vizscan for photosensitive epilepsy)

mapmapteam/mapmap does open source multi-projector mapping. How to integrate e.g. mapmap?

BespokeSynth is a C++ and JUCE based patch bay software modular synth with a "node-based UI" and VST3, LV2, AudioUnit audio plugin support. How to feed BespokeSynth audio and possibly someday video? Pipewire and e.g. Helvum?

- MilkDrop: I'd love a PR that adds support for ProjectM :D it would be fairly easy to make a custom plug-in that just blits the texture.

Basic code for this would look like this:

    struct MilkdropIntegration
    {
      halp_meta(name, "ProjectM")
      halp_meta(c_name, "projectm")
      halp_meta(category, "Visuals")
      halp_meta(author, "ProjectM authors")
      halp_meta(description, " :) ")
      halp_meta(uuid, "417534da-3625-404a-b74f-91d003cb64b9")
    
      // By now you know the drill: define inputs, outputs...
      struct
      {    
        struct : halp::lineedit<"Program", "">
        {
          halp_meta(language, "eel2")
        } program;
      } inputs;
    
      struct
      {
        struct
        {
          halp_meta(name, "Out");
          halp::rgba_texture texture;
        } image;
      } outputs;
    
      halp::rgba_texture::uninitialized_bytes bytes;
    
      void operator()()
      {
        if(bytes.empty())    
          bytes = halp::rgba_texture::allocate(800, 600); // or whatever resolution you wanna set
          
        // Fill in bytes with your custom pixel data here
        
        outputs.image.texture.update(bytes.data(), 800, 600);
      }
    };
inside such a template: https://github.com/ossia-templates/score-avnd-simple-templat...
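
For the "fill in bytes" step, a minimal sketch of what that could look like. It assumes the buffer is tightly packed 8-bit RGBA in row-major order and that `bytes.data()` yields an `unsigned char*`; both are assumptions to check against the Avendish docs. `fill_gradient` is a hypothetical helper, not part of halp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: writes a simple gradient (red across, green down).
// Assumes tightly packed 8-bit RGBA, row-major, 4 bytes per pixel.
inline void fill_gradient(unsigned char* data, int width, int height)
{
  assert(data != nullptr && width > 0 && height > 0);
  for(int y = 0; y < height; ++y)
  {
    for(int x = 0; x < width; ++x)
    {
      const std::size_t i
          = (static_cast<std::size_t>(y) * width + x) * 4;
      data[i + 0] = static_cast<unsigned char>(255 * x / width);  // R
      data[i + 1] = static_cast<unsigned char>(255 * y / height); // G
      data[i + 2] = 128;                                          // B
      data[i + 3] = 255;                                          // A (opaque)
    }
  }
}
```

In the plug-in above this would be called as something like `fill_gradient(bytes.data(), 800, 600);` right before the `update` call. A `std::vector<unsigned char>` of size `800 * 600 * 4` works as a stand-in buffer for trying it outside of ossia.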

- multi-projector mapping: ossia actually does it directly! it's in git master, will be released in the next version. It also supports a fair amount of features that MapMap does not have such as:

* soft-edge blending

* blend modes

* custom polygons

* a proper HDR passthrough as well as tonemapping, etc.

* Metal, Vulkan, D3D11/12 support (mapmap is opengl-only)

* Spout, Syphon, NDI, soon pipewire video. Mapmap only supports camera input.

* HAP and DXV, both decoded on GPU.

* Smooth grid distortion. Here's mapmap grid distortion: https://streamable.com/1nhwxg vs ossia with sufficiently high subdivisions: https://streamable.com/hmb1jm

* And of course, as mentioned here, hardware decoding (supported for some years already). The new feature adds zero-copy when using, for instance, Vulkan Video with the Vulkan GPU backend.

* In addition pretty much every YUV pixel format in existence is GPU-decoded (https://github.com/ossia/score/tree/master/src/plugins/score...).

In contrast, MapMap does GStreamer -> Qt; everything, including the YUV -> RGBA conversion, goes through the CPU.
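
For a sense of the per-pixel work a CPU path has to do, here is a rough sketch of a planar 4:4:4 YUV -> RGBA conversion. It assumes 8-bit full-range BT.601 coefficients (the JPEG/JFIF variant), which is only one of many cases; real video is often limited-range BT.709 or higher bit depth. This is not MapMap's or ossia's code, and the names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Round and clamp a value to the 0..255 range.
inline std::uint8_t clamp_to_byte(double v)
{
  return static_cast<std::uint8_t>(
      std::lround(std::min(std::max(v, 0.0), 255.0)));
}

// Converts `pixel_count` pixels from three separate planes to packed RGBA.
// Assumption: 8-bit, full-range, BT.601 (JFIF) coefficients.
inline void yuv444_to_rgba(
    const std::uint8_t* y, const std::uint8_t* u, const std::uint8_t* v,
    std::uint8_t* rgba, std::size_t pixel_count)
{
  assert(y && u && v && rgba);
  for(std::size_t i = 0; i < pixel_count; ++i)
  {
    const double luma = y[i];
    const double cb = u[i] - 128.0;
    const double cr = v[i] - 128.0;
    rgba[4 * i + 0] = clamp_to_byte(luma + 1.402 * cr);
    rgba[4 * i + 1] = clamp_to_byte(luma - 0.344136 * cb - 0.714136 * cr);
    rgba[4 * i + 2] = clamp_to_byte(luma + 1.772 * cb);
    rgba[4 * i + 3] = 255; // opaque alpha
  }
}
```

On a GPU the same arithmetic typically runs per fragment in a shader, so the CPU never has to touch each pixel.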

- How to feed BespokeSynth audio and possibly someday video? Pipewire and e.g. Helvum?

Yes, PipeWire (or JACK or BlackHole on Windows and macOS). That said, ossia also supports VST, VST3, LV2, CLAP, JSFX, and Faust, and comes with many audio effects built in already.

I don’t understand the spread of thoughts in your post.

The reason to create image sequences is not because you need to send it to other apps, it’s because you preserve quality and safeguard from crashes.

A crash mid video write out can corrupt a lengthy render. With image sequences you only lose the current frame.
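
As a sketch of why resuming is cheap with sequences: given the frame numbers that made it to disk, you can compute what is left to render. This assumes you already have the set of completed frame numbers (for example parsed from file names) and that a partially written frame is not counted as done; `missing_ranges` is a hypothetical helper.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Returns inclusive [first, last] ranges of frames in [start, end]
// that are absent from `done`.
inline std::vector<std::pair<int, int>> missing_ranges(
    const std::set<int>& done, int start, int end)
{
  assert(start <= end);
  std::vector<std::pair<int, int>> out;
  bool in_run = false;
  int run_start = start;
  for(int f = start; f <= end; ++f)
  {
    const bool missing = done.count(f) == 0;
    if(missing && !in_run)
    {
      run_start = f; // a gap begins here
      in_run = true;
    }
    else if(!missing && in_run)
    {
      out.emplace_back(run_start, f - 1); // the gap ended on the previous frame
      in_run = false;
    }
  }
  if(in_run)
    out.emplace_back(run_start, end); // gap runs to the end of the sequence
  return out;
}
```

For instance, with frames 1, 2, 3, 5 and 6 on disk out of a 1 to 8 range, this returns the ranges 4 to 4 and 7 to 8.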

People aren’t going to stop using image sequences even if they stayed in the same app.

And I’m not sure why this applies: “this goes beyond” what Apple has, because they do have hardware support for decoding several compressed codecs (also I’ll note that ProRes is also compressed). Other than streaming, when are you going to need that kind of encode performance? Or what other codecs are you expecting will suddenly pop up by not requiring ASICs?

Also how does this remove degradation when going between apps? Are you envisioning this enables Blender to stream to an NLE without first writing a file to disk?

> A crash mid video write out can corrupt a lengthy render. With image sequences you only lose the current frame.

You wouldn't store FFV1 in MP4, the only format incompetent enough for such corruption.

Apple has an interest against people using codecs that they get no fees from. And Apple don't have a lossless codec. So they don't offer lossless compressed video acceleration.

The idea is that when working as a part of a team, and you get handed a CG render, you can avoid sending a huge .tar or .zip file full of TIFF which you then decompress, or ProRes which loses quality, particularly when in a linear colorspace like ACEScg.

I’m curious what kind of teams you’re working in that you’re handing compressed archives of image sequences? And using tiff vs EXR (unless you mean purely after compositing)?

Another reason to use image sequences is that it’s easier to re-render just a portion of the sequence. Granted, this can be done with video too, but it has higher overhead.

But even then, why does the GPU encoding change the fact that you’d send it to another NLE? I just feel like there are a lot of jumps in the thought process here.

I thought an industry standard was to use proxy files. The open source editor Shotcut uses them, for example. Create a low-resolution, intra-frame-only version of the file for very fast scrubbing, make your edits on that, and when done the edit list is applied to the full-resolution rushes to produce the output.

Often but not always. Sometimes you’re just working with proxies directly, audio mixing and the like. VFX workflows and finishing will often be online at full res.

But even so, everybody is making their own proxies all the time. There’s a lot of passing around of ProRes Proxy or another intermediate-quality format, and you still make even lighter proxies locally, so NLEs and workstation apps will still benefit from this.

Proxy files have issues when doing coloring, greenscreens, and effects shots. The bit depth, chroma resolution, and primaries/transfer/colorspace get changed, so they're basically only usable when editing. With this, you don't need proxy files at all.