

by barisser 4223 days ago

You're right about normalizing the user preference vector: it isn't strictly necessary. Still, if we're normalizing bounty vectors, it feels right to normalize everything :).

Another variation I was considering is to generate 'user clusters'. In Mark Vector Space, divide the user vectors into N groups such that the total within-cluster variance is minimized. Then, when a user for whom there is sparse data needs contextual information from other users, I could simply ask how correlated he is with the different clusters. If each cluster's 'center of mass' is a vector, the dot product between a new user's vector and each cluster vector could be informative in reconstructing suggestions, the idea being to infer from similar users what a particular user might want.
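To make the clustering idea concrete, here is a rough sketch of how it might look. Everything in it is an assumption for illustration: the names (normalize, kmeans, cluster_affinities), the toy 3-dimensional vectors, and the choice of plain k-means (Lloyd's algorithm) as the way to minimize within-cluster variance. It is not the actual implementation.

```python
import math
import random

def normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids ('centers of mass')."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            # Assign each user to the nearest centroid (squared Euclidean distance).
            nearest = min(
                range(k),
                key=lambda i: sum((x - c) ** 2 for x, c in zip(v, centroids[i])),
            )
            groups[nearest].append(v)
        for i, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def cluster_affinities(user, centroids):
    """Dot product of the normalized user vector with each normalized centroid."""
    u = normalize(user)
    return [dot(u, normalize(c)) for c in centroids]

# Toy data (hypothetical): two users who lean toward dimension 0,
# two who lean toward dimensions 1 and 2.
users = [normalize(v) for v in [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.1],
    [0.0, 1.0, 0.9],
    [0.1, 0.8, 1.0],
]]
centroids = kmeans(users, k=2)

# A sparse-data user is compared against each cluster center.
new_user = [0.8, 0.3, 0.0]
affinities = cluster_affinities(new_user, centroids)
best_cluster = max(range(len(affinities)), key=affinities.__getitem__)
```

The cluster with the highest affinity would be the one whose users' preferences get borrowed for suggestions. Since both sides are normalized, the dot product here is just cosine similarity.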

I was also wondering whether adding a stochastic component to each user-vector would be interesting.
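One possible form of that stochastic component, purely as a guess at what it could look like: add small Gaussian noise to each component and renormalize. The function name perturb, the sigma value, and the example vector are all made up for illustration.

```python
import math
import random

def perturb(user_vector, sigma=0.05, rng=random):
    """Add independent Gaussian noise to each component, then renormalize."""
    noisy = [x + rng.gauss(0.0, sigma) for x in user_vector]
    norm = math.sqrt(sum(x * x for x in noisy))
    return [x / norm for x in noisy] if norm else noisy

rng = random.Random(42)   # seeded so runs are repeatable
user = [0.6, 0.8, 0.0]    # hypothetical unit-length preference vector
samples = [perturb(user, sigma=0.05, rng=rng) for _ in range(3)]
```

Each call gives a slightly different unit vector near the original, so repeated queries for the same user would not always rank things identically.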

Thanks for the feedback.