| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by carom 610 days ago
	The trained embedding vectors for the token equivalents of W4 and W-4 would be mapped to a similar space due to their appearance in the same contexts.

1 comments

dangerlibrary 610 days ago

The point of the GP post is that the "w-4" token had very different results from ["w", "-4"] or similar algorithms where the "w" and "4" wound up in separate tokens.

link