Hacker News
by
sanxiyn
1100 days ago
This is wrong. Byte-level models work fine, even if not quite as well as word-level models. From comparisons of byte-level and word-level models, we know that tokenization accounts for only a minuscule part of performance.
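
For readers unfamiliar with the distinction, here is a minimal sketch of what the two input representations look like. The function names `byte_tokens` and `word_tokens` are made up for illustration, and whitespace splitting is only a stand-in for a real word-level tokenizer; nothing here reproduces any particular model's pipeline.

```python
def byte_tokens(text: str) -> list[int]:
    # Byte-level: every UTF-8 byte is a token, so the vocabulary is at most 256 symbols.
    return list(text.encode("utf-8"))


def word_tokens(text: str) -> list[str]:
    # Word-level (simplified assumption): split on whitespace; the vocabulary is open-ended.
    return text.split()


sample = "byte-level models work fine"
b = byte_tokens(sample)
w = word_tokens(sample)

# The byte sequence is much longer than the word sequence for the same text.
print(len(b), len(w))

# Byte-level encoding is lossless: the original text can be recovered exactly.
print(bytes(b).decode("utf-8") == sample)
```

The trade-off is visible directly: the byte-level sequence is several times longer but never hits an out-of-vocabulary token, while the word-level sequence is short but needs a large vocabulary.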