Stable Diffusion to 3D/WebVR, in the cloud, available now (holovolo.tv)
57 points by fbriggs 1331 days ago
9 comments

"A CYBERPUNK NINJA RIDING AN OSTRICH THROUGH THE STREET OF TOKYO" https://holovolo.tv/v/962583

Today Lifecast unveils text-to-full 3D immersive environments that can be viewed in VR (e.g., Quest 2) or on 2D screens. We are doing this with a combination of Stable Diffusion and several other neural nets to make it 3D, combined with Lifecast's format for 6DOF VR photos and video. It's free to try and we do the processing in the cloud. Check it out and tell us what you think! This is version 1.0 and we are iterating quickly, so expect improvements in the future.

What’s stopping you from offering a mobile stereoscopic view? There are likely more Google Cardboard users out there than active Horizons users at this point.
This looks... pretty terrible. The images being generated are fine, but the conversion from 2D to 3D is awful. It looks like someone poorly lasso-tool'd around the subject, put it on another layer closer to the viewer, and then very poorly interpolated the space that's visible between the two layers when you look at it from an angle.

Am I missing something? I feel like I've seen much better automatic 2D->3D conversions via layering long before this.

haha, sounds like you're describing every "3d" movie that came out during the attempted 3D TV revolution in the 2010s

The site looks cool to me, I think we're being a little uncharitable to it. It runs at a high framerate and pans around smoothly. If someone or a few people made this in their spare time as a cool demo, it's great IMO

If this is the result of $50,000,000's worth of research and development, maybe it's worth a little scorn

Given how 2D looked even a few years ago I’ve got high hopes for this
This is just an off-the-shelf img2depth model run on top of Stable Diffusion - I don’t think there’s a novel model or research behind this. People have been doing the same thing in Colab for a while.
I don't mean this single method in particular, just the ability to run automatic conversions to 3D or generate 3D assets without needing to hire modelers.
What I find strange is that it fills the missing details with completely unrelated images. As an example, this "An astronaut meeting the president" uses a layer of grass and trees to fill the missing scenery on Mars.

https://holovolo.tv/v/874a1a

It's a tech demo. Six months from now a prompt will be generating interactive environments.
Maybe they accidentally trained the AI to fuck up the VR180 camera projection?

The left/right sides of every image contain a different, scaled and rotated image. Some of the discontinuities are visually pretty interesting.

I think this is just created by running a depth prediction model on the output of stable diffusion and then inserting the relevant mesh into a 3D scene. The output of stable diffusion isn’t seamless by default, so those jumps will happen.
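
For anyone curious what "depth map to mesh" might look like in practice, here is a minimal sketch. It is not the site's actual code, just one common way to do it under assumptions of my own: a simple pinhole camera with a made-up `focal` parameter, one vertex per pixel, and two triangles per pixel quad. The helper name `depth_to_mesh` is hypothetical, and in a real pipeline the depth values would come from a monocular depth model rather than a hand-written list.

```python
def depth_to_mesh(depth, focal=1.0):
    """Turn an H x W depth map (list of rows) into a triangle grid mesh.

    Hypothetical helper for illustration: assumes a pinhole camera
    looking down -Z, with image coordinates normalised by the width.
    """
    h, w = len(depth), len(depth[0])

    # One vertex per pixel, unprojected with X = u * Z / f, Y = v * Z / f.
    vertices = []
    for row in range(h):
        for col in range(w):
            z = depth[row][col]
            u = (col - (w - 1) / 2.0) / w
            v = (row - (h - 1) / 2.0) / w
            vertices.append((u * z / focal, -v * z / focal, -z))

    # Two triangles per quad of neighbouring pixels.
    faces = []
    for row in range(h - 1):
        for col in range(w - 1):
            tl = row * w + col   # top-left vertex index
            tr = tl + 1          # top-right
            bl = tl + w          # bottom-left
            br = bl + 1          # bottom-right
            faces.append((tl, bl, tr))
            faces.append((tr, bl, br))
    return vertices, faces


# A flat background at depth 5 with one "subject" pixel at depth 1.
depth = [
    [5.0, 5.0, 5.0],
    [5.0, 1.0, 5.0],
    [5.0, 5.0, 5.0],
]
vertices, faces = depth_to_mesh(depth)
print(len(vertices), "vertices,", len(faces), "triangles")
```

Note that the triangles connecting the near pixel to its far neighbours get stretched across the depth jump. With a naive grid mesh like this, that stretching is a plausible source of the smeared "paper cutout" edges people are describing.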
I guess I'm not able to view the effect on desktop? Is it some kind of depth segmentation of the generated images rather than actual 3D? Maybe I need to view it in a VR headset?
This link from the OP seems to work well on desktop: https://holovolo.tv/v/962583

That said, it looks like the ninja on the ostrich is a paper cutout that just has different parallax scrolling, and you can still see the hole in the background it was cut out of.

It's weird how such an obvious marketing plug, with subpar 3D projection of 2D images, got any traction here. I was baited into clicking by seeing 3D/WebVR, expecting 3D shapes like the recent advancements, and saw... well, that.
Looks great. Given all the progress around this on the open source side I'm hoping that soon we'll be able to run something like this at home.
I didn’t believe John Carmack when he said we could generate entire VR worlds by using footage from TV shows.

I do now.

I’m curious to see which researcher will be the first to have an AI generate realistic 3D assets by understanding 3D natively.
That's already here with NeRF
Can NeRF generate its own output (say, from a text prompt)? I thought it used image/video input.
Is there a working NeRF Google Colab tho?
I see no mention of Stable Diffusion. How spam is trending on HN's first page is surprising to me.
Did you click the create button? It takes you to a page where it specifically mentions stable diffusion and allows you to create your own "vr image" with a prompt.