by alanbernstein 1229 days ago
This is great. Those examples are not the best quality, but they're impressive.

That prompted me to try generating ambigrams with Stable Diffusion. The results looked odd, as ambigrams tend to, but the "text" was largely illegible. I wonder when the state of the art will be able to handle that request.

2 comments

It's odd that image AIs still can't render text. If you ask DALL-E or Midjourney to include even a few letters, they produce something like random nearest neighbors: they not only scramble the word itself but also scribble on top anything that looks remotely like writing yet belongs to no language. Maybe they're still developing the ability to read, or maybe they're secretly creating a completely new script and language.
It's a side effect of the way the text input is represented before being used by the model. It doesn't get the text as a sequence of chars but as a sequence of tokens.
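
To make that concrete, here is a rough sketch of the difference between a token view and a character view of the same word. The vocabulary and the greedy longest-match rule are made up for illustration; real models use learned subword vocabularies, so the actual split of any given word will differ.

```python
# Hypothetical toy vocabulary; real models learn theirs from data.
VOCAB = {"amb": 0, "ig": 1, "ram": 2, "a": 3, "m": 4,
         "b": 5, "i": 6, "g": 7, "r": 8}


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split of text into vocabulary pieces."""
    tokens = []
    pos = 0
    while pos < len(text):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in vocab:
                tokens.append(piece)
                pos = end
                break
        else:
            raise ValueError(f"no token covers {text[pos]!r}")
    return tokens


word = "ambigram"
tokens = tokenize(word)                  # what the model is given
token_ids = [VOCAB[t] for t in tokens]   # opaque integer ids
char_view = list(word)                   # what it would need in order to "spell"

print(tokens, token_ids, char_view)
```

The model only ever sees the ids, so nothing in its input says that the first piece is made of the letters a, m, b.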

This paper [1] shows that giving character-level awareness to the model can improve the "visual spelling".

1: https://arxiv.org/abs/2212.10562

Love ambigrams and illusions. Is there any description of how one can create their own? Thanks!
Since the technical nugget is hidden in the code, the fun trick here is to alternate on odd and even steps between moving toward a duck in the latent space image and moving toward a rabbit in the 90-degree rotated version of the latent space image.

(Normally you would feed the output of step n right back in as the input to step n+1. That is the part that works differently here.)
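
A toy sketch of that alternation, not the linked code itself: `step_toward` is a hypothetical stand-in for one prompt-conditioned denoising step, and the "duck" and "rabbit" targets are just random arrays. It only illustrates the odd/even schedule and the rotate, step, rotate-back pattern.

```python
import numpy as np


def step_toward(latent, target, rate=0.2):
    """Hypothetical stand-in for one denoising step toward a prompt."""
    return latent + rate * (target - latent)


def alternate_views(latent, target_a, target_b, steps=50):
    """Alternate between two goals, one of them in a rotated frame."""
    for i in range(steps):
        if i % 2 == 0:
            # Even step: move the upright latent toward target A.
            latent = step_toward(latent, target_a)
        else:
            # Odd step: rotate 90 degrees, move toward target B,
            # then rotate back so the next step sees the upright latent.
            rotated = np.rot90(latent, k=1)
            rotated = step_toward(rotated, target_b)
            latent = np.rot90(rotated, k=-1)
    return latent


rng = np.random.default_rng(0)
duck = rng.normal(size=(8, 8))     # placeholder for "duck" in latent space
rabbit = rng.normal(size=(8, 8))   # placeholder for "rabbit" in latent space
start = rng.normal(size=(8, 8))    # initial noise

result = alternate_views(start, duck, rabbit)
print(result.shape)
```

In this linear toy the result settles into a blend of the duck and the un-rotated rabbit. A real diffusion model is not linear, so the sketch says nothing about how good the actual images would be.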

Ambigrams are cool. You can rotate the term you want to ambigram-ize, write it underneath the word, and gently merge the two together, "column by column". It's nice that you ask, because I recently saw a YouTube Short about the name Klint that nicely depicts the idea (volume warning): https://www.youtube.com/shorts/3I6rkpAQXmI