by ShaShekhar 723 days ago
Right now, inpainting is done on a semantic mask (the output of a segmentation model). For more complex instructions, we would also have to support contextual mask generation, which is an active area of research in vision-language models. As for performing several iterations, you can do that at the semantic level as well, or generate a batch of outputs. The SD v1.5 inpainting model is quite weak, and we haven't seen any large-scale open-source inpainting model for a while.
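
To make "semantic mask" concrete, here is a rough sketch of how a segmentation label map might be turned into the binary mask an inpainting model consumes. This is not the project's actual code: the function names `semantic_mask` and `dilate`, the class id 2, and the 255/0 convention are all assumptions for illustration, and plain Python lists stand in for image arrays.

```python
from typing import List

Grid = List[List[int]]


def semantic_mask(label_map: Grid, target_class: int) -> Grid:
    """Binary mask: 255 where a pixel carries target_class, 0 elsewhere."""
    return [[255 if label == target_class else 0 for label in row]
            for row in label_map]


def dilate(mask: Grid, radius: int = 1) -> Grid:
    """Grow the mask by `radius` pixels (square neighbourhood) so that
    object borders are repainted along with the object itself."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = 255
    return out


# Toy 4x4 label map; the class ids are made up for this example.
labels = [
    [0, 0, 0, 0],
    [0, 2, 2, 0],
    [0, 2, 2, 0],
    [0, 0, 0, 0],
]
mask = semantic_mask(labels, target_class=2)
grown = dilate(mask, radius=1)
```

A mask like `grown` is what would be handed to the inpainting model together with the image and prompt. Iterating "at the semantic level" would then amount to picking a different class id (or the same one again) for each pass.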