| HN Mirror



	by goldfeld 1229 days ago
	It's odd that image AIs still can't reliably render text. If you ask DALL-E or Midjourney to write even a few letters, they produce something like a random nearest neighbor: they don't just scramble the word, they also scribble anything that looks remotely like writing but isn't in any language. Maybe the model is still developing the ability to read, or maybe it's secretly creating a completely new script and language.

1 comment

ebalit 1227 days ago

It's a side effect of how the text input is represented before the model uses it. The model doesn't get the text as a sequence of characters but as a sequence of tokens, so it never directly sees how a word is spelled.

This paper [1] shows that giving the model character-level awareness can improve its "visual spelling".

1: https://arxiv.org/abs/2212.10562
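
To illustrate the token-versus-character point, here is a minimal sketch in Python. The vocabulary `TOY_VOCAB` and the greedy longest-match `tokenize` function are made up for this example and are not the tokenizer of DALL-E, Midjourney, or any other real model; real systems use learned subword vocabularies. The point is only that the token IDs carry no visible spelling, while a character-level view does.

```python
# Hypothetical toy vocabulary, invented for illustration only.
TOY_VOCAB = {"stop": 0, " ": 1, "sign": 2, "st": 3, "op": 4}


def tokenize(text, vocab):
    """Greedy longest-match subword tokenization (a simplified stand-in)."""
    token_ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                token_ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary entry covers {text[i]!r}")
    return token_ids


def characters(text):
    """Character-level view: one entry per character, spelling preserved."""
    return list(text)


prompt = "stop sign"
print(tokenize(prompt, TOY_VOCAB))  # [0, 1, 2]
print(characters(prompt))           # ['s', 't', 'o', 'p', ' ', 's', 'i', 'g', 'n']
```

With only `[0, 1, 2]` as input, a model has to learn from data which letters each ID stands for; with the character view, the letters are given directly.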
