by chuckcode 4226 days ago
Very interesting article with some cool approaches. I'd be really interested to know how often the model needs to be retrained. It seems like a lot of purchases are holiday- or season-driven, and you'd hate to be suggesting Valentine's Day gifts on Feb 20th because everybody was buying them 2 weeks ago. It would also be great to see any insights on how many features you need to get a good guess at a user's tastes and preferences. Are 20 numbers enough to represent most of the dimensionality of Etsy products, or does it take 100, or 1000?
1 comment

In their recent KDD article [1], Etsy used 200 SVD dimensions.

For those interested in trying this out in Python:

* `gensim` contains stochastic SVD for large data (fast online model training) [2]

* I wrote a benchmark of (approximate) nearest neighbour libraries in Python [3]
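
To give a feel for the nearest-neighbour step once every item is a dense vector (200 dimensions in the Etsy case), here is a minimal brute-force sketch in pure Python. It is only an illustration: the item names and 3-dimensional vectors are made up, and `cosine` and `nearest` are hypothetical helpers, not functions from `gensim` or from any of the benchmarked libraries. The approximate libraries trade a little accuracy to avoid this full scan.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def nearest(query, items, k=2):
    """Return the k item ids most similar to `query`, best first.

    Exact search: scores every item, so cost grows linearly with catalogue size.
    """
    scored = ((cosine(query, vec), item_id) for item_id, vec in items.items())
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


# Made-up item vectors; real ones would come out of the SVD.
items = {
    "mug": [0.9, 0.1, 0.0],
    "teapot": [0.8, 0.2, 0.1],
    "scarf": [0.0, 0.9, 0.3],
    "mittens": [0.1, 0.8, 0.4],
}

top = nearest(items["mug"], items, k=2)
print(top)
```

The query item comes back as its own best match, so a real recommender would drop it before showing results.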

[1] https://dl.dropboxusercontent.com/u/2143857/papers/topics.pd...

[2] https://github.com/piskvorky/gensim/

[3] http://radimrehurek.com/2013/12/performance-shootout-of-near...