Hacker News
by sniklaus 2463 days ago
I unfortunately did not get the approval to release the context-aware frame interpolation, but one of my older works is open source for research purposes: https://github.com/sniklaus/pytorch-sepconv

1. I am no longer sure about the scale of the individual loss functions, my apologies. I determined the combination of the two losses via a simple grid search, keeping what worked best (changing a weight by plus or minus an order of magnitude did not make much of a difference).
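
For illustration, a grid search over a single relative loss weight could be sketched as below. This is only a toy under stated assumptions: the form `loss_a + weight * loss_b`, the candidate range, and `toy_validation_error` are hypothetical stand-ins, not the setup actually used for the paper.

```python
import math

def grid_search(evaluate, candidate_weights):
    """Return the candidate weight with the lowest validation error."""
    best_weight, best_error = None, float("inf")
    for weight in candidate_weights:
        error = evaluate(weight)
        if error < best_error:
            best_weight, best_error = weight, error
    return best_weight, best_error

def toy_validation_error(weight):
    # Hypothetical stand-in for "train with loss_a + weight * loss_b, then
    # measure validation error". This toy curve has its minimum at weight = 10.
    return (math.log10(weight) - 1.0) ** 2

# Candidates spaced one order of magnitude apart (an assumed range).
candidate_weights = [10.0 ** exponent for exponent in range(-3, 4)]
best_weight, best_error = grid_search(toy_validation_error, candidate_weights)
print(best_weight, best_error)
```

In a real run, each call to `evaluate` would be a full training run, which is why a coarse grid in steps of one order of magnitude is a common starting point.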

2. The points are just splatted to an image plane, though more advanced point cloud rendering techniques would be better. There is no video frame interpolation: each individual frame in the output video is a rendering of the point cloud from a different camera perspective.
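
To make the idea concrete, here is a minimal sketch of splatting a point cloud to an image plane and rendering it from several camera positions. It assumes a pinhole projection, one pixel per point, and a z-buffer so that the nearest point wins; the function `splat`, the parameter `cam_x`, and the sample points are made up for illustration and are not taken from the actual implementation.

```python
import math

def splat(points, width, height, focal, cam_x=0.0):
    """Render points as single-pixel splats; the nearest point wins per pixel."""
    image = [[None] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    for x, y, z, color in points:
        x -= cam_x  # shifting the camera right moves points left
        if z <= 0:
            continue  # behind the camera
        # Assumed pinhole projection, rounded to the nearest pixel.
        u = int(math.floor(focal * x / z + (width - 1) / 2 + 0.5))
        v = int(math.floor(focal * y / z + (height - 1) / 2 + 0.5))
        if 0 <= u < width and 0 <= v < height and z < depth[v][u]:
            depth[v][u] = z
            image[v][u] = color
    return image

# Two points on the same line of sight plus one off to the side.
points = [(0.0, 0.0, 1.0, "near"), (0.0, 0.0, 2.0, "far"), (1.0, 0.0, 2.0, "side")]

# Each output frame is the same point cloud seen from a different camera position.
frames = [splat(points, 5, 5, 2.0, cam_x=shift) for shift in (0.0, 0.5, 1.0)]
```

In the first frame the far point is hidden behind the near one. Once the camera has moved, it becomes visible, and pixels that no point lands on stay empty, which is where seams come from.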

3. I am not sure about multi-view stereo; COLMAP still seems like the state of the art for that. But neural networks definitely outperform classic techniques for single-image depth estimation.

4. Common architectures just did not do as well as I had hoped, so I tried about 1500 model architectures. I started with an architecture that intuitively seemed right and then gradually explored and refined alterations of it. It ultimately was a lot of trial and error.


Thanks for the reply.

Was it the university that didn't want to release it? Are they looking at commercializing it, or how does that work? Is it available in any commercial software? It kind of looks like magic and would probably be very useful for a lot of purposes.

2. So basically each point is projected to the image plane without perspective mapping? So in 3D, the further away from the camera they are, the bigger they are, so they all have the same size on the image? And that prevents any seams from occurring in the pixel grid as things move around?

4. Experience, intuition, and elbow grease. Kind of what I thought, but I guess it's reassuring to see an expert in the field having to try 1500 variants.

It's complicated, but the gist of it is that they are trying to commercialize it, yes. For what it is worth, you might be able to find it in commercial software once they are successful with their business endeavors.

2. Yes, and there are two mechanisms for handling seams. First, the inpainting, which extends the point cloud and can provide a higher sample rate. Second, a postprocessing step that heuristically fills in any seams that may still be present despite inpainting.
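
As a rough sketch of what a heuristic seam fill in point 2 could look like: the version below fills an empty pixel with the average of its 4-connected neighbors when enough of them were rendered. The neighbor rule, the `min_neighbors` threshold, and the function name `fill_seams` are assumptions for illustration; the actual heuristic may well differ.

```python
def fill_seams(image, min_neighbors=3):
    """Fill empty pixels (None) that are mostly surrounded by rendered pixels."""
    height, width = len(image), len(image[0])
    filled = [row[:] for row in image]  # read from image, write to the copy
    for v in range(height):
        for u in range(width):
            if image[v][u] is not None:
                continue
            neighbors = [
                image[nv][nu]
                for nv, nu in ((v - 1, u), (v + 1, u), (v, u - 1), (v, u + 1))
                if 0 <= nv < height and 0 <= nu < width and image[nv][nu] is not None
            ]
            # Thin seams have many rendered neighbors; large holes do not.
            if len(neighbors) >= min_neighbors:
                filled[v][u] = sum(neighbors) / len(neighbors)
    return filled

# Grayscale toy image with a one-pixel seam in the middle.
seam_image = [
    [0.2, 0.2, 0.2],
    [0.2, None, 0.4],
    [0.2, 0.4, 0.4],
]
seam_filled = fill_seams(seam_image)

# A large hole is left alone, since that would be a job for inpainting.
hole_image = [[None, None], [None, 1.0]]
hole_filled = fill_seams(hole_image)
```

The threshold keeps the pass from inventing content in large disoccluded regions, which matches the description of it as a cleanup step after inpainting rather than a replacement for it.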

4. The downside of it is that one needs a lot of resources in order to try all of these variants, which not everyone is lucky enough to have access to.