| HN Mirror



	by yuvalpinter 360 days ago
	We have a paper under review that's gonna be up on arXiv soon, where we test this for ~10,000 words and find a consistent decline in counting ability based on how many characters are in the tokens where the target character appears. It seems that models know which character a single-character token is, but really don't get much about the inner composition of multi-character tokens.
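For anyone wanting to picture the quantity being measured, here is a rough sketch, not the paper's actual setup. It assumes a made-up vocabulary (`TOY_VOCAB`) and a greedy longest-match splitter rather than any real tokenizer; `tokenize` and `target_token_lengths` are hypothetical helper names. For a given word it reports the true character count and the lengths of the tokens in which the target character sits.

```python
# Hypothetical toy vocabulary; not taken from any real tokenizer.
TOY_VOCAB = ["straw", "berry", "s", "t", "r", "a", "w", "b", "e", "y"]


def tokenize(word, vocab=TOY_VOCAB):
    """Greedy longest-match split of `word` into vocabulary tokens."""
    tokens = []
    i = 0
    while i < len(word):
        # Pick the longest vocabulary entry that matches at position i.
        match = max(
            (t for t in vocab if word.startswith(t, i)),
            key=len,
            default=None,
        )
        if match is None:
            # Character not in the vocabulary: treat it as its own token.
            match = word[i]
        tokens.append(match)
        i += len(match)
    return tokens


def target_token_lengths(word, target, vocab=TOY_VOCAB):
    """Lengths of the tokens that contain the `target` character."""
    return [len(t) for t in tokenize(word, vocab) if target in t]


word = "strawberry"
print(tokenize(word))                   # token split under the toy vocabulary
print(word.count("r"))                  # ground-truth character count
print(target_token_lengths(word, "r"))  # sizes of tokens holding the target
```

With this toy vocabulary, every "r" in "strawberry" is buried inside a five-character token, while in a word like "tray" it would be its own single-character token. A real experiment would swap in the model's own tokenizer and bucket counting accuracy by those lengths.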

1 comment

hnaccount_rng 360 days ago

Isn't that a rather trivial result? Or at least an expected one? Unless you manually encode the "this token consists of those tokens" information, those are completely independent things for the model?
