It seems like the value is that you don't need another tool to composite the text, especially for users who don't know about Figma/Photoshop or how to use them (many, many, many people).
And if you want the text to faithfully follow the surface of the object (e.g. tattoos), I don't think manually editing it in after the AI generation is going to be so straightforward.