by famouswaffles 1087 days ago
You're not going to get even close to Midjourney or even Bing quality on SD without finetuning. It's that simple. When you do finetune, it will be restricted to that aesthetic and you won't get the same prompt understanding or adherence.

For all the promise of control and customization SD boasts, Midjourney beats it hands down in sheer quality. There's a reason like 99% of ai art comic creators stick to Midjourney despite the control handicap.

4 comments

Yet you are posting this in a thread where GP provided actual examples of the opposite. Look at another comment above/below: there are MJ-generated samples that are comparable but also less coherent than the result from a much smaller SD model. And in MJ's case, hallucinations cannot be fixed. MJ is good but it isn't magic; it just provides quick results with little experience required. Prompt understanding is still poor, and will stay poor until it's paired with a good LLM.

None of the existing models gives actually passable production-quality results, be it MJ or SD or whatever else. It will be quite some time until they get out of the uncanny valley.

> There's a reason like 99% of ai art comic creators stick to Midjourney

They aren't. MJ is mostly used by people without experience, think a journalist who needs a picture for an article. Which is great and it's what makes them good money.

As a matter of fact (I work with artists), for all the surface-visible hate AI art gets in the artist community, many actual artists are using it more and more to automate certain mundane parts of their job to save time, and what they use is not MJ or Dall-E.

There's a distinction to be made here. Everything that makes SD a powerful tool is the result of being open source. The actual models are significantly worse than Midjourney. If an MJ level model had the tooling SD does it would produce far better results.
> If an MJ level model had the tooling SD does it would produce far better results

And vice versa, which is the exciting part to me - only a matter of time!

Midjourney output all has the same look to it.

If you’re ok with basic aesthetics it’ll work but if you want something a bit less cringe or that will stand out in marketing it won’t cut it.

It only has the same look if it's not given any style keywords. I've been impressed with the output diversity once it's told what to do. It can handle a wide range of art styles.
Then we need to give style keywords to the other networks too, and suddenly the gap shortens.

Default Midjourney is one thing and that’s mid…

>Yet you are posting this in a thread where GP provided actual examples of the opposite.

Opposite of what? OP posts results from a tuned model.

Opposite of this:

>For all the promise of control and customization SD boasts, Midjourney beats it hands down in sheer quality.

The results are comparable, but MJ in this comment https://news.ycombinator.com/item?id=36409043 hallucinates more (look at the roofs in the second picture). And it cannot be fixed, maybe except for an upscale making it a bit more coherent. Until MJ obtains better tooling (which it might in the next iteration), it won't be as powerful. I'm not even starting on complex compositions, which it simply cannot do.

>OP posts results from a tuned model.

Yes, which is the first step you should take with SD, as it's a much smaller and less capable model.

Of course it's a tuned model. Why would anyone use stock SD these days?
I feel like people shouldn't talk in definitives if their message is just going to demonstrate they have no idea what they're talking about.
I know what I'm talking about lol. I tuned a custom SD model that's downloaded thousands of times a month. I'm speaking from experience more than anything. Don't know why some SD users get so defensive.
You load a model and have 6 sliders instead of one… it’s not exactly “fine tuning”.

If you want the power, it’s there. But nearly bone stock SD in auto1111 is going to get to any of these examples easily.

Show me the civitai equivalent for MJ or Dalle2. It doesn’t exist.

>You load a model and have 6 sliders instead of one… it’s not exactly “fine tuning”.

Ok...? Read what I wrote carefully. Your 6 sliders won't produce better images than Midjourney for your prompt on the base SD model.

Midjourney has a ridiculously restrictive keyword filter. You should have mentioned that.

Also I see nothing wrong with using different models for different purposes.