by marcinzm
1374 days ago
Aren't typos just a question of how you generate your vectors/embeddings? I'd be surprised if a transformer with a character-level tokenizer, trained on a representative source of data (i.e., one that includes typos), couldn't make sense of typos.

However, I'm sure there are still applications where you don't have access to a robust embedding for your domain but can apply other techniques to deal with that domain's noise.
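
To make the intuition concrete, here is a toy sketch. It is not a transformer and not anyone's production setup: it uses bag-of-character-trigram vectors as a rough stand-in for a character-level representation and compares them with whole-word tokens. The function names (`char_ngrams`, `word_tokens`, `cosine`) and the example words are my own assumptions for illustration.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count character n-grams, padding with spaces to mark word edges."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def word_tokens(text):
    """Count whole-word tokens (whitespace split), i.e. no sub-word sharing."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


clean = "embedding"
typo = "embeding"  # one dropped letter

# Whole-word tokens share nothing, so the typo looks like an unrelated word.
word_sim = cosine(word_tokens(clean), word_tokens(typo))

# Character trigrams still overlap heavily, so the typo stays close.
char_sim = cosine(char_ngrams(clean), char_ngrams(typo))

print(f"word-level similarity: {word_sim:.2f}")
print(f"char-level similarity: {char_sim:.2f}")
```

The word-level score comes out as zero while the character-level score stays high, which is the graceful degradation I'd expect a learned character-level model to pick up on, presumably better than this, given training data that actually contains typos.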