| HN Mirror



	by kristianp 4803 days ago
	So each language is stored with a prefix in the Judy array. That means that to identify the language of a token, you have to loop over all the languages, prefix the token with each one, look it up, and keep a count of matches per language. Does that sound correct? I wouldn't have used the prefix approach. Instead I'd store each token once in the Judy array and use the value stored with it to indicate which languages match the token.
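
A rough sketch of the two layouts being compared, in case it helps. This is not the project's actual code: a plain Python dict stands in for the Judy array, and the language codes, the `lang:token` key format, the sample tokens, and the function names are all made up for illustration.

```python
from collections import Counter

LANGS = ["en", "fr", "de"]  # hypothetical language codes

# Layout 1 (prefix approach): one entry per (language, token) pair.
# The "lang:token" key format is an assumption for this sketch.
prefixed = {"en:the": 1, "en:chat": 1, "fr:chat": 1, "fr:le": 1, "de:der": 1}

def count_prefixed(tokens):
    counts = Counter()
    for tok in tokens:
        for lang in LANGS:  # one lookup per language for every token
            if f"{lang}:{tok}" in prefixed:
                counts[lang] += 1
    return counts

# Layout 2 (token stored once): the stored value is a bitmask of languages.
BIT = {lang: 1 << i for i, lang in enumerate(LANGS)}
by_token = {
    "the": BIT["en"],
    "chat": BIT["en"] | BIT["fr"],
    "le": BIT["fr"],
    "der": BIT["de"],
}

def count_bitmask(tokens):
    counts = Counter()
    for tok in tokens:
        mask = by_token.get(tok, 0)  # a single lookup per token
        for lang in LANGS:
            if mask & BIT[lang]:  # decode the mask without touching the map
                counts[lang] += 1
    return counts
```

Both functions give the same per-language counts. The difference is that the first does one map lookup per language for each token, while the second does one lookup per token and then only tests bits. A bitmask value does cap the number of languages at the width of the stored value.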