by wruza 550 days ago
I can share what I learned going from 0 to “can do” in the last year.

ideally I'd like something that explains _how_ diffusion models work, their probabilistic nature, how they're trained on images/text

How they work is probably out of scope for an artist. They’ll figure it out given some knobs, and honestly you cannot explain any of it, you have to get the feeling. Because it’s not only probabilistic, but island-y: what worked yesterday may not work today. You have to train yourself on these parameters.

Comfy

Is basically bash-like visual programming. Cool for developers and repeatable workflows, but overkill and overwhelming for a beginner.

Problems.

Most problems come from Python management: Python versions, libraries, plugins breaking the env, etc. You have to prepare some pre-built folders (or cloud images) for them, so that it just works. Then I’d download a few popular checkpoints and show them the inpaint (mask) tab in a1111. I think this tab will create the good first impression required for further interest.
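
On the pre-built folder idea, here is a rough sketch of what I mean, using only Python's standard `venv` module. The minimum version and the folder name are placeholders I made up, not what any particular UI requires, so check the project's own install notes before relying on them.

```python
import sys
import venv
from pathlib import Path

# Assumption: placeholder minimum version, not taken from any specific UI.
MIN_PYTHON = (3, 10)


def python_ok(minimum=MIN_PYTHON):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= minimum


def make_env(target, with_pip=True):
    """Create an isolated environment in `target` and return its path.

    Keeping one environment per tool means a plugin that pulls in a
    conflicting library only breaks that one folder.
    """
    target = Path(target)
    venv.EnvBuilder(with_pip=with_pip).create(target)
    return target


if __name__ == "__main__":
    if not python_ok():
        print("Python is older than the placeholder minimum", MIN_PYTHON)
    else:
        # Hypothetical folder name, pick whatever suits your setup.
        print("Created", make_env("sd-env"))
```

You would then install the UI's requirements into that folder once, and copy or zip the whole thing for the people you are teaching.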

As a sort of poor artist myself, it was amazing to see how SD can take two sprites and inpaint them together across a masked edge. Or replace a segment with completely new content. (Although to them it may look meh.)
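
To make the "masked edge" idea concrete, here is a toy sketch in plain Python. It only shows the mask logic: a band around the seam is marked as repaintable and everything else is kept. The function names are mine, the "generated" pixels are a stand-in for what the model would produce, and real tools do more than this (mask blur, for example).

```python
def seam_mask(width, height, seam_x, band):
    """Return a height x width grid: 1 where repainting is allowed, 0 where pixels are kept."""
    lo = max(0, seam_x - band)
    hi = min(width, seam_x + band)
    return [[1 if lo <= x < hi else 0 for x in range(width)] for _ in range(height)]


def composite(original, generated, mask):
    """Keep original pixels where the mask is 0, take generated pixels where it is 1."""
    return [
        [g if m else o for o, g, m in zip(orow, grow, mrow)]
        for orow, grow, mrow in zip(original, generated, mask)
    ]


if __name__ == "__main__":
    # Two 4-pixel-wide "sprites" placed side by side, so the seam is at x = 4.
    original = [["o"] * 8 for _ in range(2)]
    generated = [["g"] * 8 for _ in range(2)]  # placeholder for model output
    mask = seam_mask(width=8, height=2, seam_x=4, band=1)
    for row in composite(original, generated, mask):
        print("".join(row))
```

Only the band around the seam changes, which is why the two sprites end up looking joined while the rest stays untouched.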

Txt2img is not that wow, cause everyone is already tired of it.

Can’t point to resources, sorry, I learned from the internet by googling all my questions.