We looked at different solutions extensively (https://medium.com/towards-data-science/editing-text-in-imag...) and ended up building a tool to solve the problem: https://www.producthunt.com/posts/textify-2
Eventually models will get to the point where they can do this well natively, but for now the best we can do is a post-processing step.