|
|
|
|
|
by YossarianFrPrez
387 days ago
|
|
This is a cool paper! Basically, prior to feeding text to the tokenizer people have split the text on whitespaces. But whitespaces aren't exactly meaningful separators. By getting rid of this restriction as the tokenizer is 'learning', some of the tokens end up being 'by the way' or 'in the the long run.' The researchers find that this makes the model much more efficient. |
|