ML turns video of a 360° turn into 3D model of a person


	ML turns video of a 360° turn into 3D model of a person (sciencemag.org)
	96 points by mikeyanderson 2988 days ago

10 comments

symisc_devel 2988 days ago

Link to the paper: https://arxiv.org/abs/1803.04758

neonate 2988 days ago

And to the video: https://www.youtube.com/watch?v=nPOawky2eNk

llao 2988 days ago

Oh how I hate marketing speech.

First of all, the title should include "video of a predefined 360° turn".

And then they say something along the lines of "average accuracy of about 5mm" for joining the constructed modeled joints to their model, while you see the body wobbling around happily.

This is an impressive demo, but gah!

dang 2988 days ago

Ok, we'll give it a 360° turn above.

nitrogen 2988 days ago

Structure from motion is an existing technique. What is the contribution of ML in this case (it seems like joint positioning maybe?)?

https://en.m.wikipedia.org/wiki/Structure_from_motion

ansgri 2988 days ago

99%¹ of computer vision problems are 80% solved. The problem is, you need a 95+% solution to be practically useful.

Binocular stereo vision has just approached general applicability, and SfM is mostly used in very constrained environments (traffic analysis) or with large computational resources with manual correction (offline 3D mapping from aerial data).

¹ Numbers are metaphoric only, based on experience in scientific and industrial CV.

fsloth 2988 days ago

SfM does not automatically provide joint locations. Also, a casual 360° video around a subject does not provide enough data to produce a full body mesh.

raghavkhanna 2988 days ago

How is this ML? They use a CNN for foreground segmentation, a minor step in their pipeline. But the major contribution seems to be putting the silhouettes in a common reference frame. I sincerely hope sciencemag isn’t putting ML in the title purely to jump on the bandwagon.

eclee2 2988 days ago

What a farce. The use of ML for background subtraction is almost inconsequential to the contribution of the paper and the result.

utkarshsinha 2988 days ago

It's someone standing in front of a green screen. You don't need ML to find a person's silhouette.
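To illustrate the point, here is a minimal sketch of pulling a silhouette off a green screen with plain per-pixel thresholding. It is not the paper's segmentation step; the function name and the `dominance` and `min_green` thresholds are assumptions picked for this toy example.

```python
import numpy as np

def green_screen_silhouette(frame, dominance=1.3, min_green=80):
    """Return a boolean mask that is True where a pixel is NOT green screen.

    frame: H x W x 3 uint8 RGB image.
    dominance, min_green: hypothetical thresholds chosen for illustration.
    """
    rgb = frame.astype(np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # A pixel counts as background when green is bright and clearly
    # dominates both other channels.
    is_green = (g >= min_green) & (g > dominance * r) & (g > dominance * b)
    return ~is_green

# Synthetic 4x4 frame: green backdrop with a 2x2 grey "person" in the middle.
frame = np.zeros((4, 4, 3), dtype=np.uint8)
frame[...] = (20, 200, 30)
frame[1:3, 1:3] = (128, 128, 128)
mask = green_screen_silhouette(frame)
```

Real footage has spill, shadows and motion blur, so fixed thresholds like these would need tuning per shot.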

seandougall 2988 days ago

To be fair, they do have examples that aren’t chroma keyed; they just lead with one that is.

Which is not to say that ML is necessary for this sort of computer vision task, but I wonder if it yields better or sharper results than other techniques?

extralego 2987 days ago

Same. As someone who has spent an embarrassing amount of time keying and tracking video footage over the years, I’m surprised ML isn’t being used for this more often in studios by now.

egypturnash 2988 days ago

As an artist, my first thought is I wonder what happens if you try giving this a series of drawings.

make3 2988 days ago

You'd probably need a lot of drawings. I wonder what sampling rate the thing uses.

it's a cool idea though :)

seandougall 2988 days ago

They say “standard” video is the source, so it would likely be on the order of 30 or 60 fps. Seems to be around a couple hundred frames, give or take, though I suspect it could get _something_ out of fewer frames, and more would just incrementally improve the model.
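As a rough back-of-the-envelope check, the relation between clip length, frame rate and frame count can be sketched like this. Taking "a couple hundred" to mean 200 frames is my assumption, and the helper names are made up for the example.

```python
def frame_count(duration_s, fps):
    """Number of frames in a clip of the given length, rounded to an integer."""
    return round(duration_s * fps)

def duration_for_frames(n_frames, fps):
    """Clip length in seconds needed to capture n_frames at the given rate."""
    return n_frames / fps

# Assumed figure: 200 frames. That is under 7 seconds of 30 fps video,
# or a little over 3 seconds at 60 fps.
seconds_at_30 = duration_for_frames(200, 30)
seconds_at_60 = duration_for_frames(200, 60)
```

For a drawn source, the same arithmetic says how many drawings would be needed to match a given clip length at a chosen rate.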

I would expect minor textural differences in a hand-drawn or painted source would make it a lot harder to correlate points between frames, but it’s an interesting idea to think about!

mtgx 2988 days ago

This is what should give you pause before using face authentication technology for anything.

haZard_OS 2988 days ago

Can you elaborate?

toomuchtodo 2988 days ago

Makes forging facial biometrics easier.

seandougall 2988 days ago

In the case of Face ID, at least, you’d still have to transfer the measurements into the physical world, in a way that fools a system that has ostensibly been designed not to be fooled by masks.

toomuchtodo 2988 days ago

Like a 3D printed model?

URSpider94 2987 days ago

Doesn’t work for high-quality face recognition systems like the one in the iPhone X. You’d also need to get the IR reflectance, as well as a sign of life from the eyes.

make3 2988 days ago

I wonder if we will soon see a future where a director can fully edit the positions and physical actions of the actors in post-production.

Basically, whole scenes will be transferred to believable 3D models seamlessly, and you can reanimate parts of everything. I feel like that's going to happen for sure, for big Hollywood productions at least (like the Marvel stuff).

leohutson 2987 days ago

This already happens a lot, most VFX heavy productions will have digital doubles of the main cast, and they can be used for as simple a reason as reframing a shot.

extralego 2987 days ago

Your comment could give the impression this is drastically simpler to do than it is in reality. This is considered something like the last frontier of VFX, and a lot of work remains to be done.

While you’re essentially correct, it is currently an overwhelmingly manual process. The amount of work and time necessary is substantial (some would say outrageous), and exponentially higher for certain types of shots. Many shots remain impossible or cost-prohibitive.

interfixus 2988 days ago

It seems determined to put visible toes on everybody, no matter that they're wearing socks.

Is this a bug or a feature?

RodgerTheGreat 2988 days ago

I'm going to guess they start with a generic human model that includes all limbs and extremities and then the "machine learning" process attempts to fit that model to the silhouettes extracted from the video.
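A toy sketch of what that kind of fitting could look like, under heavy assumptions: an ellipse with two parameters stands in for the generic body model, and a brute-force search picks the parameters whose rendered mask best overlaps the observed silhouette. None of the names below come from the paper, and this is not its optimizer.

```python
import numpy as np

def ellipse_mask(shape, a, b):
    """Render a centered ellipse with half-width a and half-height b."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    return ((xx - cx) / a) ** 2 + ((yy - cy) / b) ** 2 <= 1.0

def iou(m1, m2):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(m1, m2).sum()
    return np.logical_and(m1, m2).sum() / union if union else 0.0

def fit_template(silhouette, candidates):
    """Pick the (a, b) pair whose rendered template best matches the silhouette."""
    return max(
        candidates,
        key=lambda p: iou(ellipse_mask(silhouette.shape, *p), silhouette),
    )

# Stand-in for a silhouette extracted from one video frame.
target = ellipse_mask((64, 64), 10, 25)
# Hypothetical parameter grid for the template.
candidates = [(a, b) for a in range(5, 31, 5) for b in range(5, 31, 5)]
best = fit_template(target, candidates)
```

A template like this can only ever produce shapes it already contains, which would be consistent with toes showing up on people wearing socks.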

stochastic_monk 2988 days ago

Which implies that the technique uses domain knowledge of people to make assumptions about their morphology.

codetrotter 2988 days ago

This is awesome. I hope someone implements this as a piece of open source software. Imagine the potential!

raghavkhanna 2988 days ago

Source code seems to be available :)

https://graphics.tu-bs.de/people-snapshot

bahmboo 2988 days ago

From site: "We will provide access to the code and dataset soon."

meric 2987 days ago

Could be used for VR phone calls between long distance couples.
