Chucking images at any model that supports image input and asking it to describe specific areas/things 'in extreme detail' is a decent way to get an idea of what it's expecting vs what you want.
+1 to this flow. I use the exact same phrase "in extreme detail" as well haha. Additionally, I ask the model what prompt it would write itself to produce a given edit.
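For anyone who wants to script this, here is a rough sketch of the two prompts as plain string templates. The function names and exact wording are made up for illustration (only the phrase "in extreme detail" comes from the comments above), and it deliberately leaves out any API call: you would attach the image and send the text with whatever client your model uses.

```python
# Hypothetical helpers: names and wording are illustrative assumptions,
# not part of any library. They only build the text of the prompt.

def describe_prompt(target: str) -> str:
    """Ask the model to describe one area or thing in the attached image."""
    return f"Describe {target} in this image in extreme detail."


def edit_prompt_request(edit: str) -> str:
    """Ask the model what prompt it would write to produce a given edit."""
    return (
        "Given the attached image, what prompt would you write "
        f"to produce this edit: {edit}?"
    )


if __name__ == "__main__":
    print(describe_prompt("the sky in the top-left corner"))
    print(edit_prompt_request("turn the overcast sky into a sunset"))
```

Comparing the model's description with what you had in mind shows where its vocabulary differs from yours, and the second prompt gives you wording to borrow for your own edit requests.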