by ebalit 1227 days ago
It's a side effect of how the text input is represented before the model sees it. The model doesn't get the text as a sequence of characters but as a sequence of tokens, so it has no direct view of the individual letters inside each token.
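
To make the difference concrete, here is a rough sketch of a toy tokenizer. The vocabulary and the greedy longest-match rule are made up for illustration (real models typically use learned subword vocabularies such as BPE, with far more entries), but the effect is the same: the model is handed a few opaque IDs, not the letters.

```python
# Hypothetical toy vocabulary, NOT taken from any real tokenizer.
TOY_VOCAB = {
    "straw": 0, "berry": 1,
    "s": 2, "t": 3, "r": 4, "a": 5, "w": 6, "b": 7, "e": 8, "y": 9,
}


def tokenize(text, vocab):
    """Greedy longest-match tokenization; returns a list of token IDs."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            # No vocabulary entry covers the character at position i.
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


word = "strawberry"
char_view = list(word)                  # what a human reader sees: 10 letters
token_view = tokenize(word, TOY_VOCAB)  # what the model sees: 2 IDs

print(char_view)
print(token_view)
```

With this toy vocabulary, "strawberry" becomes just two IDs, so nothing in the input says which letters are in it or how many; the model would have to learn that about each token separately.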

This paper [1] shows that giving the model character-level awareness can improve its "visual spelling".

1: https://arxiv.org/abs/2212.10562