by mywittyname 3034 days ago
> An interesting property of word vectors (which are usually 300-600 dimensional vectors) is that most are quasi-orthogonal.

I think it would be more interesting if this weren't the case. There are fewer than 200,000 English words, with probably 10% of those in common use. Given the small vocabulary, the large number of dimensions, and the nature of language, it seems likely that non-random word vectors would tend towards orthogonality.
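
For a rough feel of the dimensionality part of that argument, here is a minimal sketch. It uses purely random unit vectors as a stand-in, which is an assumption: trained word vectors are not random, so this only shows the geometric baseline, not the behavior of real embeddings. The helper names (`random_unit_vector`, `mean_abs_cosine`) and the sample sizes are made up for illustration.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly at random by normalizing a Gaussian sample."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def mean_abs_cosine(num_vectors, dim, seed=0):
    """Average |cosine similarity| over all distinct pairs of random unit vectors."""
    rng = random.Random(seed)
    vectors = [random_unit_vector(dim, rng) for _ in range(num_vectors)]
    total = 0.0
    pairs = 0
    for i in range(num_vectors):
        for j in range(i + 1, num_vectors):
            # Vectors are unit length, so the dot product is the cosine.
            total += abs(sum(a * b for a, b in zip(vectors[i], vectors[j])))
            pairs += 1
    return total / pairs


if __name__ == "__main__":
    # Hypothetical sizes: 100 vectors at a few dimensionalities.
    for dim in (3, 30, 300):
        print(dim, round(mean_abs_cosine(100, dim), 3))
```

The average absolute cosine should shrink as the dimension grows, so at 300 dimensions a typical pair is already close to orthogonal without any training at all. Whether trained, non-random vectors behave the same way is a separate question this sketch does not answer.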

I'm assuming you mean that "I will run with Bob" and "I will jog with Stacy" are not orthogonal, because they convey a very similar message, but that both are orthogonal to "Man, that was a good beer."