Hacker News
by mikaraento 531 days ago
I guess text in images would be similar, and that is indeed where image generation models struggle to get the details right.

E.g., making a greeting card with somebody's name spelled correctly.