by mikewarot
300 days ago
Because it never sees raw ASCII or Unicode during training. Everything in its input is tokenized. Asking it to count letters is like asking a person born blind to paint and then complaining that they didn't get the colors quite right. You could train an AI on raw ASCII or Unicode, but it would likely take 100 times the compute resources for similar performance on everything else. Tokenized input is really efficient.

I wonder if that could be useful for making AI-resistant CAPTCHAs...
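
To make the tokenization point concrete, here is a rough sketch of what the model "sees". The vocabulary, the token IDs, and the greedy longest-match rule are all made up for illustration; real tokenizers (BPE and friends) use learned vocabularies and different splitting rules, so treat this as a toy, not as how any particular model works.

```python
# Toy vocabulary: hypothetical pieces and IDs, not from any real tokenizer.
VOCAB = {
    "str": 101, "aw": 102, "berry": 103,
    "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8,
}


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split of text into integer token IDs."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


word = "strawberry"
ids = tokenize(word)

print(word.count("r"))  # character view: 3
print(ids)              # model view: [101, 102, 103]
```

The character view can count the letter directly. The model view is three opaque integers, and nothing in `[101, 102, 103]` says how many r's are inside each piece; that would have to be learned indirectly from the training data.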