| HN Mirror

by orbital-decay 975 days ago

Both DE3 and MJ are essentially toys for single random pictures, unusable in a professional setting. DALL-E in particular has serious quality issues, and while it follows the prompt well, it also rewrites the prompt, so it's barely controllable. Midjourney is RLHF'd to death.

What you want for asset creation is not photorealism, but style and concept transfer, multimodal controllability (text alone is terrible at expressing artistic intent), and tooling. And tooling isn't something that is developed quickly (although there were several rapid breakthroughs in the past, for example ZBrush).

Most of the fancy demos you hear about sound good on paper but don't really go anywhere. Academia is throwing shit at the wall to see what sticks; that is its purpose, especially when practice is running ahead of theory. It's similar to building airplanes before figuring out aerodynamics (which happened long ago): watching a heavier-than-air thing fly is amazing, until you realize it's not very practical in its current form, or might even kill the brave inventor who tried to fly it.

If you look at the field closely, most of the progress in visual generative tooling happens in the open source community; people are trying to figure out what works in real use and what doesn't. Little is being done in big houses, at least publicly and for now, as they're more interested in a DC-3 than a Caproni Ca.60. The change is really incremental and gradual, much like in 3D in its current mature state. The paradigms are different, but both are highly technical and depend on academic progress. Once it matures, it's going to become another skill-demanding field.

kranke155 975 days ago

With respect, I disagree with almost everything you said.

The idea that somehow “AI isn’t art directable” is one I keep hearing, but I remain unconvinced this is somehow an unsolvable problem.

The idea that AI generation is unusable at the moment for professional work doesn’t match my experience, since I now regularly use Photoshop’s generative feature.

orbital-decay 975 days ago

Photoshop combined with Firefly is exactly the rare kind of good tooling I'm talking about. In/outpainting proved to work for creatives in practice, and got added to Photoshop.

>The idea that somehow “AI isn’t art directable” is one I keep hearing, but I remain unconvinced this is somehow an unsolvable problem.

That's not my point. AI can be perfectly directable and usable, just not in the specific form DE3/MJ do it. Text prompts alone don't have enough semantic capacity to guide it for useful purposes, and the tools they have (img2img, basic in/outpainting) aren't enough for production.

In contrast, Stable Diffusion has a myriad of non-textual tools around it right now - style/concept/object transfer of all sorts, live painting, skeleton-based character posing, neural rendering, conceptual sliders that can be created at will, lighting control, video rotoscoping, etc. And plugins for existing digital painting and 3D software leveraging all this witchcraft.

All this is extremely experimental and janky right now. It will be figured out in the coming years, though (if only the community's brains weren't deep-fried by porn...). This is exactly the sort of tooling the industry needs to get shit done.

kranke155 974 days ago

Ah, OK, yes, I agree. How many years is really the million-dollar question. I’ve begun to act as if it’s around 5 years, and sometimes I think I’m being too conservative.

ChatGTP 975 days ago

You can remain unconvinced but it's somewhat true.

I can keep writing prompts for DE3 or similar until it gives me something like what I want, but the problem is that the generated images often contain subtle but important mistakes.

I think it's really good at portraits of people, but for anything requiring complex lighting or the representation of real-world situations or events, I don't think it's ready yet, unless we're ready to just write prompts, click buttons, and accept what we receive in return.

kranke155 974 days ago

It’s absolutely not ready yet, for sure.

Midjourney already has tools that allow you to select parts of the image to regenerate with new prompts, Photoshop-style. The tools are being built, even if a bit slowly, to make these things useful.

I could totally see creating matte paintings through Midjourney for indie filmmaking soon, and for tiny-budget films, using a generative video tool to make, let’s say, zombies in the distance seems within reach now or very soon. Slowly, for some kinds of VFX, I think AI will start being able to replace the human element.
