by bearzoo 3747 days ago
Well, just my two cents: the title feels inaccurate. You are tuning hyperparameters with respect to performance on the classification task, so the Bayesian optimization is really optimizing the unsupervised -> supervised pipeline. I was expecting Bayesian optimization of strictly unsupervised representation learning (e.g., we have an autoencoder and use Bayesian optimization to tune its hyperparameters in order to minimize reconstruction error). This is really just supervised learning with even less supervision (which is quite typical).

Thanks for the note!

We're using Bayesian optimization to tune the hyperparameters of both the unsupervised model and the supervised model, but you are correct that they are tuned jointly, with overall accuracy as the target. The lift you get from adding the unsupervised step (and tuning it) is quite substantial (and statistically significant).

The idea of tuning just the unsupervised part (or doing it independently) is great, though. All the code for the post is available at https://github.com/sigopt/sigopt-examples/tree/master/unsupe.... It would be interesting to see whether doing that yields better overall accuracy.
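
To make the distinction in this thread concrete, here is a minimal sketch of the two tuning targets: reconstruction error of the unsupervised step alone versus validation accuracy of the whole pipeline. Everything in it is an assumption for illustration and is not taken from the post's code: the toy 1-D data, the equal-width binning that stands in for the unsupervised model, the majority-vote classifier, and all function names. An exhaustive search over a small grid stands in for the Bayesian optimizer, and no SigOpt API is used.

```python
import random

def make_data(n, rng):
    """Toy 1-D data: two labelled Gaussian blobs (hypothetical stand-in for real features)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(-2.0 if label == 0 else 2.0, 1.0)
        data.append((x, label))
    return data

def fit_bins(xs, k):
    """'Unsupervised' step: k equal-width bins over the training range; returns bin centers."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / k
    return [lo + (i + 0.5) * width for i in range(k)]

def encode(x, centers):
    """Index of the nearest bin center (the learned 'representation')."""
    return min(range(len(centers)), key=lambda i: abs(x - centers[i]))

def reconstruction_error(xs, centers):
    """Mean squared error when each x is replaced by its bin center. Uses no labels."""
    return sum((x - centers[encode(x, centers)]) ** 2 for x in xs) / len(xs)

def pipeline_accuracy(train, valid, centers):
    """Supervised step: majority label per bin, scored on held-out data."""
    votes = {}
    for x, y in train:
        votes.setdefault(encode(x, centers), [0, 0])[y] += 1
    correct = 0
    for x, y in valid:
        counts = votes.get(encode(x, centers))
        pred = 0 if counts is None or counts[0] >= counts[1] else 1
        correct += int(pred == y)
    return correct / len(valid)

def search(objective, candidates):
    """Exhaustive search that maximizes `objective`; a stand-in for a Bayesian optimizer."""
    return max(candidates, key=objective)

rng = random.Random(0)
train, valid = make_data(200, rng), make_data(200, rng)
train_xs = [x for x, _ in train]
ks = list(range(1, 21))  # the single hyperparameter: number of bins

# Target 1: tune the unsupervised step on its own (minimize reconstruction error).
best_k_unsup = search(lambda k: -reconstruction_error(train_xs, fit_bins(train_xs, k)), ks)

# Target 2: tune the same hyperparameter against end-to-end validation accuracy.
best_k_pipeline = search(lambda k: pipeline_accuracy(train, valid, fit_bins(train_xs, k)), ks)

print("best k by reconstruction error:", best_k_unsup)
print("best k by pipeline accuracy:   ", best_k_pipeline)
```

The two searches share the same hyperparameter and the same search routine; only the objective differs. Whether the reconstruction-tuned setting also gives the best downstream accuracy is exactly the open question raised above, and this toy does not answer it for the real models.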