by jayd16 2289 days ago
Very cool. Reminds me of when I played with Google's Seurat.

The paper says it's 5MB, takes 12 hours to train the NN, and then 30 seconds to render novel views of the scene on an NVIDIA V100.

Sadly not something you can use in real time but still very cool.

Edit: 12 hours and a 5MB NN, not 5 minutes.

Huh, what? It needs almost a million views, and takes 1-2 days to train on a GPU. I’m not sure where the “5 minutes” number comes from.

EDIT: I was referring to the last paragraph of section 5.3 (Implementation details), but maybe I’m misunderstanding how they use rays / sampled coordinates.

Very impressive visual quality. But it seems like they need a LOT of data and computation for each scene. So it's still plausible that intelligently done photogrammetry will beat this approach in efficiency, but a bunch of important details need to be figured out to make that happen.

Excuse me, I meant 5MB. It takes 12 hours to train.

>All compared single scene methods take at least 12 hours to train per scene

But it seems to only need sparse images.

>Here, we visualize the set of 100 input views of the synthetic Drums scene randomly captured on a surrounding hemisphere, and we show two novel views rendered from our optimized NeRF representation

> It needs almost a million views

Not sure what you mean by "views". The comparisons in the paper use at most 100 input images per scene.

A pixel is one view for their model if I understand correctly, so one hundred 100x100 images would be a million views.
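
To spell out that arithmetic, here is a rough sketch in Python. It assumes, as the comment above does, that each pixel counts as one "view" (one ray) for the model. `ray_count` is a made-up helper name, and the 100x100 resolution is the commenter's hypothetical, not a figure taken from the paper.

```python
def ray_count(num_images: int, width: int, height: int) -> int:
    """Count per-pixel samples, assuming one ray ("view") per pixel."""
    return num_images * width * height


# One hundred 100x100 images, as in the comment above.
total = ray_count(num_images=100, width=100, height=100)
print(total)  # 1000000
```

So "almost a million views" and "at most 100 input images" can both be true; they are just counting different things.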