by danger 5072 days ago
As another commenter pointed out, the tuning really needs to be scored on a validation set, not the test set--selecting hyperparameters by their test accuracy amounts to training with the testing data. In the field, we call this "cheating".
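To make the validation/test distinction concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration and not taken from the post: the synthetic 1-D data, the toy k-nearest-neighbour classifier, the candidate values of k, and a simple grid search standing in for the genetic algorithm. The point is only the data discipline: the hyperparameter is chosen on the validation split, and the test split is scored once at the end.

```python
import random


def make_data(n, rng):
    """Synthetic 1-D points: label is 1 when x > 0.5, with 15% label noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.15:
            y = 1 - y  # flip the label to simulate noise
        data.append((x, y))
    return data


def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (k is odd, so no ties)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(y for _, y in nearest)
    return int(2 * votes > k)


def accuracy(train, eval_set, k):
    """Fraction of eval_set labelled correctly by k-NN fitted on train."""
    hits = sum(knn_predict(train, x, k) == y for x, y in eval_set)
    return hits / len(eval_set)


def tune_and_evaluate(seed=0):
    rng = random.Random(seed)
    data = make_data(600, rng)
    # Three disjoint splits: fit on train, tune on valid, report on test.
    train, valid, test = data[:300], data[300:450], data[450:]
    candidates = [1, 3, 5, 9, 15, 25]
    # Pick the hyperparameter using the validation set only.
    best_k = max(candidates, key=lambda k: accuracy(train, valid, k))
    # Touch the test set exactly once, after tuning is finished.
    test_acc = accuracy(train, test, best_k)
    return best_k, test_acc


if __name__ == "__main__":
    k, acc = tune_and_evaluate()
    print("chosen k:", k, "test accuracy:", round(acc, 3))
```

If `best_k` were instead picked by maximising `accuracy(train, test, k)`, the reported test accuracy would be optimistically biased, which is the problem described above.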

The basic idea of automatically tuning hyperparameters (the things this post discusses tuning with genetic algorithms) is cool, though, and is becoming a popular subject in machine learning research. A couple of recent research papers on the topic are pretty readable:

Algorithms for Hyper-Parameter Optimization:

http://books.nips.cc/papers/files/nips24/NIPS2011_1385.pdf

Practical Bayesian Optimization of Machine Learning Algorithms:

http://arxiv.org/abs/1206.2944


Also, for people who are interested in the application of predicting college basketball with machine learning, there's a Google group that is worth joining:

https://groups.google.com/group/machine-march-madness

Thanks for the information! I've updated the article to reflect this.

Here's a question: where does "the field" hang out? Is there a cohesive community of any sort?

I'd say the closest thing to a cohesive community would be the MetaOptimize Q&A forum, but maybe others have other suggestions:

http://metaoptimize.com/qa

The Kaggle forums can be a good resource, and competing there is a good way to polish your skills.

Also, you might check out Random Forest algorithms--they perform well but are still very beginner friendly, as there aren't many parameters to tune. There's a nice implementation in the excellent scikit-learn Python library.

Hmm, so Gaussian distributions are easy to use and ubiquitous and all (they're the basis functions used in SVM), except that I don't see any reason for them to be priors here. But since 2012 > 2008 I feel like I'm obligated (and I'm semi-trolling) to point out the obvious about lazy assumptions based on "flexibility and tractability", which is that they can implode hilariously in your face. Cf. the financial crisis.

[1] http://econometricsense.blogspot.com/2011/03/copula-function...

I think you're confusing the Gaussian "process" used in Bayesian optimization with a standard Gaussian distribution. They are very different things - as are Gaussian copulas and what is referred to as the 'Gaussian kernel' (which is not actually a distribution at all) in the SVM. The Gaussian process is a distribution over functions, the properties of which are governed by the covariance function - so the prior over the function, or the assumption about its complexity and form, is determined by the choice of covariance function. Of course it is very important to choose a prior that corresponds to the functional form you are interested in, which is actually discussed and empirically validated in the literature referred to in that post. It's a bit ironic that you are claiming to point out the dangers of making lazy assumptions by doing exactly that.
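As a rough illustration of "a distribution over functions governed by the covariance function", the sketch below draws sample functions from a zero-mean Gaussian process prior using only the standard library. The squared-exponential kernel, the length scales, the jitter term, and all helper names are assumptions chosen for illustration, not necessarily what the linked papers use. Changing only the kernel's length scale changes how wiggly the sampled functions are, which is the sense in which the covariance function encodes the prior.

```python
import math
import random


def sq_exp_kernel(a, b, length_scale):
    """Squared-exponential covariance between inputs a and b (illustrative choice)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(m):
    """Lower-triangular L with L L^T = m, for a symmetric positive-definite m."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def sample_gp_prior(xs, length_scale, rng, jitter=1e-6):
    """Draw one function f ~ GP(0, k), evaluated at the points xs."""
    n = len(xs)
    # Covariance matrix of the function values; jitter keeps it well conditioned.
    cov = [[sq_exp_kernel(xs[i], xs[j], length_scale) + (jitter if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # f = L z has covariance L L^T = cov.
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]


def mean_roughness(length_scale, draws=20, n_points=25, seed=0):
    """Average squared jump between neighbouring values over several prior draws."""
    rng = random.Random(seed)
    xs = [i / (n_points - 1) for i in range(n_points)]
    total = 0.0
    for _ in range(draws):
        f = sample_gp_prior(xs, length_scale, rng)
        total += sum((b - a) ** 2 for a, b in zip(f, f[1:])) / (n_points - 1)
    return total / draws


if __name__ == "__main__":
    print("short length scale:", mean_roughness(0.05))
    print("long length scale: ", mean_roughness(1.0))
```

A short length scale yields rapidly varying sample functions and a long one yields smooth ones, even though every individual function value has the same standard Gaussian marginal in both cases.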