| HN Mirror


	by radarsat1 3043 days ago
	I suppose that if you're doing multilanguage, this problem partially sorts itself out. E.g., in Spanish there will be Apple and manzana, in two different places because of their different semantics. Now, if you were trying to place English "apple" in that space, you would want to put it next to both of them. Unfortunately, I see a problem in having to specify one exact position per word. If you think of the position of English "apple" in the Spanish word space as a distribution instead of a specific location, then ideally it should be a bimodal distribution, with one peak next to Apple and one peak next to manzana. If you must use a normal distribution, the variance has to be wide enough to cover both words. That is a huge problem, since (a) it makes a large range of positions look plausible for a single word and (b) the mean (expected value) lies between the two words, not at either semantic location of "apple".
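
	To make points (a) and (b) concrete, here is a rough 1-D sketch. The positions, the peak width, and all names in it are made up for illustration; real word spaces have many dimensions. It moment-matches a single normal distribution to a two-peak mixture and compares the densities.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# Hypothetical 1-D positions in the Spanish word space (made-up numbers).
POS_APPLE_COMPANY = -3.0
POS_MANZANA = 3.0
PEAK_STD = 0.5  # assumed narrow spread around each sense

def mixture_pdf(x):
    """Two-peak density: equal weight on each sense of English 'apple'."""
    return (0.5 * normal_pdf(x, POS_APPLE_COMPANY, PEAK_STD)
            + 0.5 * normal_pdf(x, POS_MANZANA, PEAK_STD))

# Moment-match one normal distribution to the mixture.
single_mean = 0.5 * POS_APPLE_COMPANY + 0.5 * POS_MANZANA
# Mixture variance = within-peak variance + variance of the peak centres.
single_var = (PEAK_STD ** 2
              + 0.5 * (POS_APPLE_COMPANY - single_mean) ** 2
              + 0.5 * (POS_MANZANA - single_mean) ** 2)
single_std = math.sqrt(single_var)

def single_pdf(x):
    """The single normal distribution forced to cover both senses."""
    return normal_pdf(x, single_mean, single_std)

if __name__ == "__main__":
    print("single normal: mean =", single_mean, "std =", single_std)
    for x in (POS_APPLE_COMPANY, single_mean, POS_MANZANA):
        print(x, "mixture:", mixture_pdf(x), "single:", single_pdf(x))
```

	Under these assumptions the single normal ends up several times wider than either peak, and its highest density sits at the midpoint, exactly where the two-peak mixture has almost none.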