by artifabrian 1274 days ago
Could be as well, given that there are many varieties of tokenizers, each with different pros and cons.

This particular tokenizer is very interesting given that it tries to be the best of both worlds (word-level and character-level tokenizers).