
A significant chunk of what you're describing can be solved by a combination of better prompt engineering and repeated inpainting.

SD obviously doesn't understand language in the same way we do, so it can be tricky to describe things in a way that will match your expectations. Once you start to understand the tricks here, it gets easier and easier.

Inpainting will let you fix a lot of the rest. Staircase stops? Select the area where it stopped and get the AI to generate more. People are already doing this to produce very complex artwork, fixing problems with faces, hands, etc. along the way. https://www.reddit.com/r/StableDiffusion/comments/x9u8qh/img... is a great example of how you can quickly iterate over a scene.
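To make the "select the area" step concrete, here is a rough sketch of the mask most inpainting front ends build for you behind the scenes. It assumes the common convention that white (255) marks pixels to regenerate and black (0) marks pixels to keep; your tool may use the opposite. The function name `rect_mask` and the box coordinates are made up for illustration and aren't from any particular library.

```python
def rect_mask(width, height, box):
    """Build a height x width grid: 255 inside box (regenerate), 0 elsewhere (keep).

    box is (left, top, right, bottom) in pixels, with right/bottom exclusive.
    Hypothetical helper for illustration only.
    """
    left, top, right, bottom = box
    # Clamp the selection to the image bounds so an oversized box still works.
    left, top = max(0, left), max(0, top)
    right, bottom = min(width, right), min(height, bottom)
    return [
        [255 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


# Tiny example: an 8x6 "image" where the staircase stops in the middle.
mask = rect_mask(8, 6, (2, 1, 5, 4))
for row in mask:
    print("".join("#" if v else "." for v in row))
```

You'd hand a mask like this, the original image, and a prompt describing the missing part to whatever inpainting tool you use, then repeat on the next problem area.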

One of the other things people struggle with is consistent characters and settings, but people have found ways to improve this with Midjourney - https://docs.google.com/document/u/1/d/e/2PACX-1vRahIr3-h_V3...

There's more of a learning curve to these tools than most people think, but it's still nowhere near as steep as the learning curve required to actually become proficient at the technical aspects of making art.