Hacker News
by alexchamberlain 236 days ago
Thanks, that's really interesting. Do they correct for spelling mistakes or internationalised spellings? For example, do `colour` and `color` end up in the same token stream?

No, it just looks at exact character sequences. Try it out yourself here: https://platform.openai.com/tokenizer
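If you'd rather see the idea offline, here is a rough sketch of why exact character matching keeps the two spellings apart. Everything in it is an assumption made for illustration: `TOY_VOCAB` and `toy_tokenize` are made-up names, the vocabulary is invented, and the greedy longest-match loop is a simplified stand-in rather than the byte pair encoding merges a real tokenizer applies.

```python
# Toy vocabulary, invented for illustration; NOT the real OpenAI vocabulary.
TOY_VOCAB = {"col": 0, "our": 1, "or": 2, "c": 3, "o": 4, "l": 5, "u": 6, "r": 7}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match tokenization over exact character sequences.

    A simplified stand-in for a real subword tokenizer: no spelling
    correction or normalisation of any kind is applied.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest substring starting at position i that is in the vocabulary.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(vocab[piece])
                i = j
                break
        else:
            # No vocabulary entry covers this character.
            raise ValueError(f"no token for {text[i]!r}")
    return tokens


print(toy_tokenize("colour"))
print(toy_tokenize("color"))
```

With this toy vocabulary the two words share the `col` piece and then diverge (`our` vs `or`), so the ID sequences differ. Nothing in the loop knows they are the same word; any link between them would have to be learned by the model downstream.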