
by mikewarot 300 days ago

Because the model never sees raw ASCII or Unicode during training.

Everything in its input is tokenized. Asking it to count is like asking a person born blind to paint, then complaining that they didn't get the colors quite right.

You could train a model directly on ASCII or Unicode characters, but it would likely take around 100 times the compute to reach similar performance on everything else. Tokenized input is really efficient.
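
To make the point concrete, here is a rough sketch of what "tokenized" means. The vocabulary and the token IDs are made up for illustration and are not taken from any real tokenizer; real ones are learned from data and far larger. The greedy longest-match rule is also just an assumption to keep the toy short.

```python
# Hypothetical toy vocabulary: the IDs are arbitrary, not from a real model.
TOY_VOCAB = {
    "straw": 101, "berry": 102,
    "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8,
}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match tokenization over a fixed vocabulary."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


word = "strawberry"
token_ids = toy_tokenize(word)
print(token_ids)        # [101, 102]: two opaque IDs, no letters in sight
print(word.count("r"))  # 3: trivial when you can see the characters
```

The model only ever gets the list of IDs, so nothing in its input says how many times "r" appears. The same toy also shows the efficiency side: ten characters become two tokens, so sequences are much shorter than they would be at the character level.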

1 comment

bell-cot 300 days ago

So they're also complete crap with old-fashioned ASCII art?

I wonder if that could be useful for making AI-resistant CAPTCHAs...
