I actually haven't, but Nova from Amazon was surprisingly good at things like bounding boxes compared to some others. You kind of have to test and measure so many different aspects to find the best model for each specific task. Thanks for the idea!
we're currently in the process of doing this. i think something that could work is to iterate on the initial image composition / structure using cheaper models, and then upscale at the end. this way you save on the iteration cost but still land on a higher-resolution image.
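
to make the cost argument concrete, here's a rough back-of-the-envelope sketch. all the prices are made-up placeholders (not real vendor pricing), and `total_cost` is just a hypothetical helper for illustration; plug in your own numbers.

```python
def total_cost(iterations, cost_per_iteration, final_step_cost):
    """Cost of `iterations` generation rounds plus one final step."""
    return iterations * cost_per_iteration + final_step_cost


# Hypothetical per-image prices in dollars (assumptions, not real pricing).
CHEAP_DRAFT = 0.01       # low-res draft from a cheaper model
EXPENSIVE_RENDER = 0.10  # high-res render from a pricier model
UPSCALE = 0.02           # one upscaling pass at the end

iterations = 8

# Strategy A: every iteration runs on the expensive model, no upscale needed.
all_expensive = total_cost(iterations, EXPENSIVE_RENDER, 0.0)

# Strategy B: iterate on the cheap model, then upscale the final pick once.
cheap_then_upscale = total_cost(iterations, CHEAP_DRAFT, UPSCALE)

print(f"all expensive:      ${all_expensive:.2f}")
print(f"cheap then upscale: ${cheap_then_upscale:.2f}")
```

the gap grows with the number of iterations, since the upscale is paid once while the per-iteration price is paid every round. whether the upscaled result actually matches a native high-res render is something you'd still have to test and measure.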