by exgrv
3034 days ago
We decided to keep the casing, as it is useful for some applications such as named entity recognition. Regarding the punctuation, as pointed out in another comment, these tokens might also be useful for some applications (and they are easy to filter out if you don't need them).
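
For anyone who does want to drop them, here is a minimal sketch of that filtering step in Python. It assumes the text is already split into a list of tokens; `is_punctuation` and `filter_tokens` are hypothetical helper names for illustration, not part of any released tooling. Lowercasing is left optional, since casing is kept by default.

```python
import unicodedata


def is_punctuation(token: str) -> bool:
    """Return True if every character of the token is Unicode punctuation (category P*)."""
    return bool(token) and all(
        unicodedata.category(ch).startswith("P") for ch in token
    )


def filter_tokens(tokens, lowercase=False):
    """Drop pure-punctuation tokens; optionally lowercase what remains."""
    kept = [t for t in tokens if not is_punctuation(t)]
    return [t.lower() for t in kept] if lowercase else kept


tokens = ["Paris", ",", "the", "capital", "of", "France", "."]
print(filter_tokens(tokens))
# ['Paris', 'the', 'capital', 'of', 'France']
print(filter_tokens(tokens, lowercase=True))
# ['paris', 'the', 'capital', 'of', 'france']
```

Note that this only removes tokens made up entirely of punctuation characters, so a token like "don't" is kept, and symbols such as "$" (a Unicode symbol, not punctuation) are kept as well.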
And yes, I realize this is a really odd question :)