by gdahl 4952 days ago
I was involved in the speech recognition work mentioned in the article and I led the team that won the Merck contest if anyone has any questions about those things. I also spend some time answering any machine learning question I feel qualified to answer at metaoptimize.com/qa

Congratulations on winning the Merck contest! That was an impressive demonstration.

About 12 years ago, I switched from a Bio major to CS. I hoped to major in AI, but after taking two upper-level classes, one focusing on symbolic AI and the other on Bayesian networks, I was completely turned off.

Our brains are massively parallel, redundant systems that have practically nothing in common with modern von Neumann CPUs. It seemed the only logical approach to AI was to study neurons, then try to discover the basic functional units that they form in simple biological life forms like insects or worms, and keep reverse engineering the brains of higher and higher life forms until we reach human-level AI.

Whenever I tried to relate my course material in AI to what was actually going on in a brain, my profs met my questions with disdain and disinterest. I learned more about neurons in my high school AP Bio class than in either of my AI classes. In their defense, we've come a long way since then, with new tools like MRIs and neural probes.

The answers are all locked up in our heads. It took nature millions of years of natural selection to engineer our brains. If we want to crack this puzzle in our lifetimes, we need to copy nature, not reinvent it from scratch. Purely mathematical theories like Bayesian statistics that have no basis in biological systems might work in specific cases, but are not going to give us strong AI.

Are these new deep learning algorithms for neural networks rooted in biological research? Do we have the necessary tools yet to start reverse engineering the basic functional units of the brain?

We think so (http://vicarious.com/), but we are obviously biased.

I worked on the Netflix Prize and haven't learned anything since then. There the RBM (or the modified version per Ruslan's paper) performed very well, but not substantially better than the linear models in an apples-to-apples comparison (ignoring the time dimension and peeking at the contents of the quiz/test set). And as I recall, no one really made any progress with deeper networks on that problem. Has anything been learned since then that would suggest progress there?

I also don't recall anyone successfully incorporating the date of the rating into the RBM. Mostly this was useful in other models because on any particular day people would just bias their ratings up or down a bit. But also, as one can imagine, over the course of a year or two their tastes would change. Is it straightforward to include that time dimension in RBMs, and if so, is that a recently discovered technique?
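To make the "bias their ratings up or down on a particular day" idea concrete, here is a rough sketch of a per-user, per-day offset on top of a global mean and a user bias. It is only an illustration of the kind of date term a non-RBM baseline can carry, not anything taken from an actual Netflix Prize entry. The function name `fit_user_day_bias`, the `(user, day, rating)` tuple layout, and the `shrink` parameter are all made up for the example.

```python
from collections import defaultdict

def fit_user_day_bias(ratings, shrink=0.0):
    """Fit rating ~ mu + b_user[u] + b_day[(u, d)] by simple averaging.

    ratings: list of (user, day, rating) tuples (hypothetical layout).
    shrink:  added to the per-day count to pull sparse days toward 0.
    """
    # Global mean rating.
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # Per-user offset from the global mean.
    user_sum, user_n = defaultdict(float), defaultdict(int)
    for u, _, r in ratings:
        user_sum[u] += r - mu
        user_n[u] += 1
    b_user = {u: user_sum[u] / user_n[u] for u in user_sum}

    # Per-(user, day) offset from what the user bias already explains.
    day_sum, day_n = defaultdict(float), defaultdict(int)
    for u, d, r in ratings:
        day_sum[(u, d)] += r - mu - b_user[u]
        day_n[(u, d)] += 1
    b_day = {k: day_sum[k] / (day_n[k] + shrink) for k in day_sum}

    return mu, b_user, b_day

def predict(mu, b_user, b_day, user, day):
    # Unseen users or days fall back to a zero offset.
    return mu + b_user.get(user, 0.0) + b_day.get((user, day), 0.0)
```

With `shrink=0` the day offset is just the mean residual for that user on that day; a positive `shrink` damps days with only one or two ratings. Slow taste drift over a year or two would need a separate term and is not modeled here.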

The Netflix Prize winners had a few RBM models that used the dates.

Regarding the DBM: I also tried to use more than one layer, without success. I tried out 3-layer and 4-layer autoencoders (which can be called 1.5-layer and 2-layer DBMs), with and without initialization by stacked RBMs. It did not work well, probably because a) the model was inaccurate, and b) the learning method proposed for the DBM was not completely correct. Intuitively, the right DBM-like model with the right learning method should have a chance to improve something on the Netflix task.

I did find some improvement in the standard RBMs, though (in learning time rather than accuracy). Instead of using plain CD, I split the weights into two sets, creating a directed RBM version. The "up" weights from the visible nodes to the hidden nodes are learned with CD with T=1. The "down" weights are learned to best fit the visible nodes, using the hidden nodes as predictors. The hidden nodes generated by CD T=1 are good enough, and we do not need additional iterations with increased T.
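For anyone who wants to play with the split-weights idea, here is a minimal NumPy sketch of one way to read it. This is a guess at the setup, not the original code. It assumes binary visible units (the Netflix RBMs used softmax units over the five rating values), full-batch updates, and that the CD-1 step reconstructs through the transposed "up" weights. The "down" weights are fit as a plain logistic regression from hidden activations back to the visible units. All names (`train_directed_rbm`, `W_up`, `W_down`, and so on) are invented for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_directed_rbm(V, n_hidden=8, epochs=50, lr=0.1, seed=0):
    """V: (n_samples, n_visible) array of 0/1 values."""
    rng = np.random.default_rng(seed)
    n_samples, n_visible = V.shape

    # "Up" weights (visible -> hidden), trained with CD-1.
    W_up = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_hid = np.zeros(n_hidden)
    b_vis_cd = np.zeros(n_visible)  # visible bias used only inside CD

    # "Down" weights (hidden -> visible), trained to fit the data.
    W_down = 0.01 * rng.standard_normal((n_hidden, n_visible))
    b_vis = np.zeros(n_visible)

    for _ in range(epochs):
        # CD with T=1: one up pass, one reconstruction, one more up pass.
        h0 = sigmoid(V @ W_up + b_hid)
        h0_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(h0_sample @ W_up.T + b_vis_cd)  # assumption: tied weights here
        h1 = sigmoid(v1 @ W_up + b_hid)
        W_up += lr * (V.T @ h0 - v1.T @ h1) / n_samples
        b_hid += lr * (h0 - h1).mean(axis=0)
        b_vis_cd += lr * (V - v1).mean(axis=0)

        # Down weights: logistic regression with hidden units as predictors.
        h = sigmoid(V @ W_up + b_hid)
        err = V - sigmoid(h @ W_down + b_vis)
        W_down += lr * (h.T @ err) / n_samples
        b_vis += lr * err.mean(axis=0)

    return W_up, b_hid, W_down, b_vis

def reconstruct(V, W_up, b_hid, W_down, b_vis):
    # Predict the visible units from hidden activations via the down weights.
    h = sigmoid(V @ W_up + b_hid)
    return sigmoid(h @ W_down + b_vis)
```

The point of the split is that predictions come from `W_down`, which is trained directly against the data, so the CD part never has to run longer chains (larger T) just to get good reconstructions.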