by saberience 1349 days ago
I think you're wrong here.

My partner works in design and her design teams have jumped all in on using Stable Diffusion in their workflows, something that is effectively in "version 1." For concept art especially it is incredibly useful. They can easily generate hundreds to thousands of images per hour and yes, while SD is not great at hands and faces, if you generate that many images, you get MANY which have perfect hands and faces. Additionally, it's possible to chain Stable Diffusion together with other models like GFPGAN and ESRGAN for fixing faces, up-ressing, etc.

Self-driving cars are completely different: no one was using "version 1" of self-driving cars within weeks of the software existing. Stable Diffusion and similar models are commercially viable right now and are only getting better in combination with other models and improved training sets.

I think you're shifting the goalposts to what success is here to be quite frank. "The model needs me to be able to specify multiple characters in a scene all performing different actions."

The truth is, if I asked art professionals on Fiverr for "beautiful art photography of multiple characters doing different actions", it would be difficult and expensive for them too! And worse, you would get one set of pictures for your money, and if you weren't satisfied, you'd be shit out of luck! On my PC, Stable Diffusion can crank out > 1000 unique pictures per hour until I'm satisfied.

2 comments

> My partner works in design and her design teams have jumped all in on using Stable Diffusion in their workflows, something that is effectively in "version 1." For concept art especially it is incredibly useful.

I do agree that if you are coming from the angle of "I need concept art of a surreal alien techbase for a sci-fi movie[0]", then SD & co. are super useful. I'm not saying they don't have their uses. But those uses are a lot more limited than people seem to appreciate.

> I think you're shifting the goalposts to what success is here to be quite frank. "The model needs me to be able to specify multiple characters in a scene all performing different actions."

Having multiple different characters in a picture/scene interacting in some way is not an uncommon or unrealistic requirement.

[0] high res, 4k, 8k frostbite engine, by greg rutkowski, by artgerm, incredibly detailed, masterpiece.

As far as I can tell, it is possible to draw such a scene by adding in the pieces one at a time and using the tools to paper over the boundaries and integrate those elements. It takes much more work than plain generation, but maybe one fiftieth to one hundredth of the work necessary for classic illustration.
It reminds me of one scene in I, Robot (2004):

https://www.youtube.com/watch?v=KfAHbm7G2R0