Hacker News
by Aron 4952 days ago
I worked on the Netflix Prize and haven't learned anything since then. There the RBM (or the modified version per Ruslan's paper) performed very well, but not substantially better than the linear models in an apples-to-apples comparison (that is, ignoring the time dimension and peeking at the contents of the quiz/test set). And as I recall, no one really made any progress with deeper networks on that problem. Has anything been learned since then that would suggest progress there?

I also don't recall anyone successfully incorporating the date of the rating into the RBM. The date was mostly useful in other models because on any particular day people would bias their ratings up or down a bit. But also, as one can imagine, over the course of a year or two their tastes would change. Is it straightforward to include the time dimension in RBMs, and if so, is that a recently discovered technique?
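To make the per-day bias concrete, here is a minimal sketch of how a plain baseline predictor (not an RBM) might absorb it. Everything in it is an assumption for illustration: the names `fit_day_bias` and `predict`, the `(user, item, day, rating)` tuple layout, and the `shrink` parameter are hypothetical, and the item bias is left out to keep it short.

```python
from collections import defaultdict

def fit_day_bias(ratings, shrink=1.0):
    """ratings: list of (user, item, day, rating) tuples (assumed layout).

    Returns (mu, user_bias, day_bias), where day_bias is keyed by (user, day).
    """
    mu = sum(r for _, _, _, r in ratings) / len(ratings)

    # Per-user offset from the global mean.
    sums, counts = defaultdict(float), defaultdict(int)
    for u, _, _, r in ratings:
        sums[u] += r - mu
        counts[u] += 1
    user_bias = {u: sums[u] / counts[u] for u in sums}

    # Per-(user, day) offset: shrunken mean of what the user bias leaves
    # unexplained. shrink > 0 pulls days with few ratings toward zero.
    dsums, dcounts = defaultdict(float), defaultdict(int)
    for u, _, d, r in ratings:
        dsums[(u, d)] += r - mu - user_bias[u]
        dcounts[(u, d)] += 1
    day_bias = {k: dsums[k] / (dcounts[k] + shrink) for k in dsums}
    return mu, user_bias, day_bias

def predict(mu, user_bias, day_bias, user, day):
    # Unknown users or days fall back to a zero offset.
    return mu + user_bias.get(user, 0.0) + day_bias.get((user, day), 0.0)

if __name__ == "__main__":
    toy = [("u1", "a", 0, 5), ("u1", "b", 0, 5),
           ("u1", "c", 1, 3), ("u1", "d", 1, 3)]
    mu, ub, db = fit_day_bias(toy)
    print(predict(mu, ub, db, "u1", 0), predict(mu, ub, db, "u1", 1))
```

The slower drift in taste over a year or two would need a different term (for example, a bias that varies smoothly with time); this sketch only covers the day-level effect.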

1 comment

The Netflix Prize winners had a few RBM models that used the dates.

Regarding the DBM: I also tried to use more than one layer, without success. I tried 3-layer and 4-layer autoencoders (which can be called 1.5-layer and 2-layer DBMs), with and without initialization by stacked RBMs. It probably did not work well because a) the model was inaccurate, and b) the learning method proposed for the DBM was not completely correct. Intuitively, the right DBM-like model with the right learning method should have a chance to improve something on the Netflix task.

I did find some improvement in the standard RBMs, though (in learning time rather than accuracy). Instead of using CD for all the weights, I split the weights into two sets, creating a directed version of the RBM. The "up" weights from the visible nodes to the hidden nodes are learned with CD with T=1. The "down" weights are learned to best fit the visible nodes, using the hidden nodes as predictors. The hidden nodes generated by CD with T=1 are good enough, so no additional iterations with increased T are needed.
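One way to read the split-weight scheme is sketched below. This is my interpretation, not necessarily the exact method: it assumes binary visible units (the Netflix RBMs used softmax units over the five rating values), it assumes the CD-1 reconstruction reuses the "up" weights as in a standard RBM, and it assumes the "down" weights are fit by plain logistic regression from hidden probabilities to visible values. All function and variable names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probs(v, w_up, b_hid):
    # p(h_j = 1 | v) using the "up" weights; w_up[i][j] links visible i to hidden j.
    return [sigmoid(b_hid[j] + sum(v[i] * w_up[i][j] for i in range(len(v))))
            for j in range(len(b_hid))]

def visible_probs(h, w, b_vis):
    # p(v_i = 1 | h) for a weight matrix w[i][j] applied in the "down" direction.
    return [sigmoid(b_vis[i] + sum(w[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b_vis))]

def train(data, n_hid, epochs=300, lr=0.1, seed=0):
    rng = random.Random(seed)
    n_vis = len(data[0])
    w_up = [[rng.gauss(0, 0.1) for _ in range(n_hid)] for _ in range(n_vis)]
    w_down = [[0.0] * n_hid for _ in range(n_vis)]
    b_hid, b_cd, b_down = [0.0] * n_hid, [0.0] * n_vis, [0.0] * n_vis
    for _ in range(epochs):
        for v0 in data:
            # CD-1 step for the "up" weights (assumption: the reconstruction
            # reuses w_up, exactly as a standard symmetric RBM would).
            h0 = hidden_probs(v0, w_up, b_hid)
            h_sample = [1.0 if rng.random() < p else 0.0 for p in h0]
            v1 = visible_probs(h_sample, w_up, b_cd)
            h1 = hidden_probs(v1, w_up, b_hid)
            for i in range(n_vis):
                for j in range(n_hid):
                    w_up[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
                b_cd[i] += lr * (v0[i] - v1[i])
            for j in range(n_hid):
                b_hid[j] += lr * (h0[j] - h1[j])
            # Supervised step for the "down" weights: logistic regression
            # that predicts v0 from the hidden probabilities h0.
            pred = visible_probs(h0, w_down, b_down)
            for i in range(n_vis):
                err = v0[i] - pred[i]
                for j in range(n_hid):
                    w_down[i][j] += lr * err * h0[j]
                b_down[i] += lr * err
    return w_up, b_hid, w_down, b_down

def reconstruct(v, w_up, b_hid, w_down, b_down):
    # Go up with the CD-trained weights, come down with the regression weights.
    return visible_probs(hidden_probs(v, w_up, b_hid), w_down, b_down)

if __name__ == "__main__":
    toy = [[1, 1, 0, 0], [0, 0, 1, 1]]
    params = train(toy, n_hid=2)
    print([round(p, 2) for p in reconstruct(toy[0], *params)])
```

The point of the sketch is that the "down" fit is an ordinary supervised problem once the hidden activations are fixed by the CD-1 pass, so it never needs longer Gibbs chains.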