by ulber 1574 days ago
I think you misunderstood the approach. MuZero is being used to optimize the choices made in the VP9 compression. In modern video encodings there are many ways to encode the same content. As a very simple example, you can vary how often you provide a full encoding of a frame and how often you encode differences between frames. Once this off-line optimization is done, the result is still a valid VP9 encoding, just a smaller one. MuZero is not needed for decompression at all.
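To make the keyframe example concrete, here is a toy sketch in Python. It is not VP9 and not MuZero: the cost model, the overhead constants, and the brute-force search over candidate intervals are all assumptions made up for illustration. The point is only that the encoder picks among many valid encodings of the same frames, and the decoder never sees how the choice was made.

```python
from typing import List, Sequence

Frame = Sequence[int]

# Hypothetical header costs, chosen only so the toy numbers are easy to follow.
KEYFRAME_OVERHEAD = 4  # fixed cost of a full (key) frame
DELTA_OVERHEAD = 1     # fixed cost of a difference frame


def encoded_size(frames: List[Frame], keyframe_interval: int) -> int:
    """Size of a toy encoding that emits a full frame every `keyframe_interval` frames."""
    total = 0
    for i, frame in enumerate(frames):
        if i % keyframe_interval == 0:
            # Full frame: header plus every pixel.
            total += KEYFRAME_OVERHEAD + len(frame)
        else:
            # Delta frame: header plus a (position, value) pair per changed pixel.
            prev = frames[i - 1]
            changed = sum(1 for a, b in zip(frame, prev) if a != b)
            total += DELTA_OVERHEAD + 2 * changed
    return total


def best_keyframe_interval(frames: List[Frame], candidates: Sequence[int]) -> int:
    """Exhaustive search standing in for the learned agent: pick the smallest encoding."""
    return min(candidates, key=lambda k: encoded_size(frames, k))


static_video = [[0] * 8 for _ in range(10)]  # nothing ever changes
busy_video = [[i] * 8 for i in range(10)]    # every pixel changes every frame

print(best_keyframe_interval(static_video, [1, 2, 5, 10]))
print(best_keyframe_interval(busy_video, [1, 2, 5, 10]))
```

In this toy model the static clip is cheapest with rare keyframes and the busy clip with a keyframe on every frame, so the best setting depends on the content. Either way the output is decodable by the same decoder.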
3 comments

The way to think of this is as part of Jeff Dean's "deep-learn all the cloud things!" thesis: https://www.gwern.net/Tool-AI#dean-2017

A cloud stack, from OS kernel settings to TCP/IP to database query optimizers to video codec settings to compiler settings, is made of thousands upon thousands of toggleable options, each of which is usually left at the default because no one on earth understands more than a small fraction of them, much less how to set them all appropriately for each task end-to-end. It's blackboxes on top of blackboxes all the way down. Collectively, inferior options could be giving up an incredible amount of performance. As has been demonstrated by experts in performance tuning, depending on how pessimal the defaults are, you could easily gain orders of magnitude performance by setting them to saner settings, much less truly optimal settings - these sorts of posts turn up routinely on HN, and even in very well-tuned cloud stacks, you have to figure that gains like >10% should be possible.
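As a rough illustration of treating the stack as a black box with toggles, here is a minimal Python sketch. The option names and the `toy_latency` cost function are hypothetical, and greedy one-flag-at-a-time search is a deliberately simple stand-in for a DRL agent, which would also handle interactions between options that this search can miss.

```python
from typing import Callable, Dict

Config = Dict[str, bool]


def greedy_tune(defaults: Config, cost: Callable[[Config], float]) -> Config:
    """Flip one option at a time, keeping any flip that lowers the measured cost."""
    best = dict(defaults)
    best_cost = cost(best)
    improved = True
    while improved:
        improved = False
        for name in list(defaults):
            trial = dict(best)
            trial[name] = not trial[name]
            trial_cost = cost(trial)
            if trial_cost < best_cost:
                best, best_cost = trial, trial_cost
                improved = True
    return best


def toy_latency(cfg: Config) -> float:
    """Hypothetical end-to-end benchmark; a real one would run the actual workload."""
    latency = 100.0
    if cfg["tcp_nodelay"]:
        latency -= 20.0
    if cfg["query_cache"]:
        latency -= 30.0
    if cfg["debug_logging"]:
        latency += 15.0
    return latency


# Hypothetical defaults that nobody ever revisited.
defaults = {"tcp_nodelay": False, "query_cache": False, "debug_logging": True}
tuned = greedy_tune(defaults, toy_latency)
print(toy_latency(defaults), toy_latency(tuned))
```

The search only ever calls `cost`, so it needs no understanding of what each option does, which is the appeal and also why each evaluation being an expensive end-to-end benchmark matters so much.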

MuZero here shows that it can work for one piece of the stack. And MuZero is, by design, an insanely general architecture: it handles two-player games like chess/Go & one-player games like ALE, handles continuous action spaces (Sampled-MuZero), is reasonably sample-efficient (because it learns an environment model, so using that more is MuZero-Reanalyzed), handles hidden information games against adversaries (Player of Games), and now OP shows self-play in a weird setting. (It still requires problem-specific input layers but even that can be lifted if you're willing to pay for Perceiver inputs which do arbitrary input modalities.)

So you can see the potential here for doing much more of cloud operations (beyond current applications like datacenter cooling control) with DRL agents. Plunk down a MuZero on your entire stack and assign it the goal of optimizing end-to-end for each specific task - DRL is expensive, but cloud-scale is even more so. Needless to say, don't expect any released checkpoints on GitHub...

Hmm, I don't think I misunderstood? I get that they're using MuZero to decide the bitrate for equivalent perceptual quality as a function of the content. Sure, once they decide on that using MuZero it's a valid compression and the end-user doesn't have to do anything extra... But it's super expensive to run that on the server's end, no? So it ends up being a bit like an asymmetric (in terms of client/server) compute/compression tradeoff, right? And you need to run this for each file you want to compress, hence it's "online".
> MuZero is being used to optimize the choices made in the VP9 compression.

I think "parameter optimization" is a better expression here. "Optimize" can mean many things, but it's certainly not optimizing the algorithm/encoding itself. It's all about being smart at the very last mile.