That's " 'd ".strip(), an English contraction suffix. It's one token, but with this echo approach the apostrophe and the following letter first show up in separate steps.
I couldn't reproduce this behavior with Sonnet 4, and Sonnet 3.7 has been deprecated since I was messing with this stuff. You can try tokenizing the string "<hello> </hello>".
I think the correct tokenization of the string will not have any tokens that contain mixed punctuation and letters, but the result of this approach does contain such claimed tokens.
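If you want to check that mechanically, here's a rough sketch of the test I have in mind. The helper names and the example split are made up for illustration; the split is not real output from any model or from the echo approach. Note this heuristic only fits strings like the one above, since a real token such as 'd legitimately mixes punctuation and a letter.

```python
import string


def has_mixed_punct_and_letters(token: str) -> bool:
    """True if the token has at least one letter and one ASCII punctuation char."""
    has_letter = any(ch.isalpha() for ch in token)
    has_punct = any(ch in string.punctuation for ch in token)
    return has_letter and has_punct


def suspicious_tokens(claimed_tokens):
    """Return the claimed tokens that mix letters and punctuation."""
    return [t for t in claimed_tokens if has_mixed_punct_and_letters(t)]


# Hypothetical split of "<hello> </hello>", invented for this example only.
claimed = ["<h", "ello", ">", " </", "hello", ">"]
assert "".join(claimed) == "<hello> </hello>"  # the pieces must rebuild the string

print(suspicious_tokens(claimed))  # ['<h']
```

Anything this prints is a candidate for a boundary the echo approach got wrong, not proof of it.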