by jedimastert 2383 days ago
A somewhat different example, but related (at least by name): I recall an article from Chris Wellons' blog, Null Program, where he wanted to "discover" a new hashing algorithm, so he randomly generated candidates, JIT-compiled them to native code, then tested them.

There was, however, no machine learning or optimizing. Instead, he called it "prospecting" and just generated a new one from scratch each time until he found something interesting.

https://nullprogram.com/blog/2018/07/31/
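
For anyone who wants a feel for the "prospecting" loop described above, here is a rough sketch in Python. It skips the JIT step entirely, and the operation set (xorshift and odd multiply), the avalanche-style score, and every function name are my own assumptions for illustration, not taken from the linked article.

```python
import random

MASK = 0xFFFFFFFF  # work on 32-bit unsigned integers


def random_hash(rng, n_ops=4):
    """Build a random candidate as a list of (op, arg) steps (hypothetical op set)."""
    ops = []
    for _ in range(n_ops):
        if rng.random() < 0.5:
            ops.append(("xorshift", rng.randint(1, 31)))
        else:
            ops.append(("mul", rng.getrandbits(32) | 1))  # force an odd constant
    return ops


def apply_hash(ops, x):
    """Run the candidate on a 32-bit input."""
    for op, arg in ops:
        if op == "xorshift":
            x ^= x >> arg
        else:
            x = (x * arg) & MASK
    return x


def avalanche_bias(ops, rng, samples=100):
    """Mean absolute deviation from a 50% flip rate over all input/output bit pairs."""
    flips = [[0] * 32 for _ in range(32)]
    for _ in range(samples):
        x = rng.getrandbits(32)
        h = apply_hash(ops, x)
        for i in range(32):
            d = h ^ apply_hash(ops, x ^ (1 << i))  # output bits changed by flipping input bit i
            for j in range(32):
                flips[i][j] += (d >> j) & 1
    return sum(abs(c / samples - 0.5) for row in flips for c in row) / (32 * 32)


def prospect(n_candidates=10, seed=0):
    """Generate each candidate from scratch and keep the lowest-bias one (no mutation, no learning)."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_candidates):
        ops = random_hash(rng)
        score = avalanche_bias(ops, rng)
        if best is None or score < best[0]:
            best = (score, ops)
    return best


if __name__ == "__main__":
    score, ops = prospect()
    print(score, ops)
```

The point is only that nothing is carried over between candidates: each one is drawn fresh and judged on its own, which is what separates this from genetic programming.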

1 comments

Yeah, that is related to genetic programming. That, and using auto-encoders for things like image compression, are known approaches in "AI".

I'm particularly proud of this meta approach and I am actually thinking this could become huge: the same thing can be done for hyperparameter optimization in machine learning tasks.

Hyperparameter optimization is currently focused on minimizing cross-validation error, but using this concept you could put weights on accuracy, training time and prediction time (very similar to compression, where the 3 dimensions are size, write time and read time), and then, given a new unknown dataset, you could predict which model/hyperparameters to use.
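
To make the weighting idea concrete, here is a minimal sketch. The `Run` record, the `pick` function, the way times are normalized, and all the numbers are made up for illustration; a real setup would need a more careful way of putting accuracy and seconds on the same scale.

```python
from dataclasses import dataclass


@dataclass
class Run:
    name: str         # hypothetical model/hyperparameter label
    accuracy: float   # cross-validation accuracy, higher is better
    train_s: float    # training time in seconds
    predict_s: float  # prediction time in seconds


def pick(runs, w_acc=1.0, w_train=0.0, w_pred=0.0):
    """Return the run with the best weighted trade-off of accuracy against time."""
    # Normalize times by the slowest candidate so each penalty term lies in [0, 1].
    max_train = max(r.train_s for r in runs)
    max_pred = max(r.predict_s for r in runs)

    def score(r):
        return (w_acc * r.accuracy
                - w_train * r.train_s / max_train
                - w_pred * r.predict_s / max_pred)

    return max(runs, key=score)


# Invented measurements, purely for illustration.
runs = [
    Run("big_forest", 0.95, 120.0, 0.8),
    Run("small_tree", 0.90, 2.0, 0.01),
    Run("linear", 0.88, 1.0, 0.005),
]

print(pick(runs).name)                            # big_forest (accuracy only)
print(pick(runs, w_train=0.1, w_pred=0.1).name)   # small_tree (time now matters)
```

With accuracy as the only weight you get the usual answer; add even a small weight on time and the choice moves to a much cheaper model that is nearly as accurate.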

Maybe this should be patented ;)

> I'm particularly proud of this meta approach and I am actually thinking this could become huge: the same thing can be done for hyperparameter optimization in machine learning tasks.

There is already a substantial field of Machine Learning/Meta Learning which focuses on exactly this. For example, this paper [1] from NeurIPS 2015 does exactly what you suggest.

[1]: https://papers.nips.cc/paper/5872-efficient-and-robust-autom...

Yeah, I am aware of the meta hyperparameter approach for ML, except that they focus only on accuracy instead of also including training/prediction times in the equation :) That's what I was referring to! (You can save A LOT of compute and zoom in on things that work if you can weed out slow or badly performing algorithms as part of meta-learning the hyperparameters.)

To make it extra clear: spend a lot of compute on different datasets and record not only the accuracy but also the time it took; including that time as an extra dimension will give even better results.
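
A toy version of what I mean, with everything hypothetical: the `HISTORY` log, the two meta-features (rows, columns), the nearest-neighbor lookup and the log-time penalty are assumptions picked to keep the sketch short, not how any existing AutoML tool works.

```python
import math

# Hypothetical log of past runs: (dataset meta-features, config, accuracy, train seconds).
# Meta-features here are just (number of rows, number of columns); numbers are invented.
HISTORY = [
    ((1_000, 10), "linear", 0.88, 1.0),
    ((1_000, 10), "forest", 0.93, 30.0),
    ((1_000_000, 200), "linear", 0.80, 60.0),
    ((1_000_000, 200), "forest", 0.91, 7200.0),
]


def distance(a, b):
    """Euclidean distance between meta-features, compared in log10 space."""
    return math.dist([math.log10(v) for v in a], [math.log10(v) for v in b])


def recommend(meta, time_weight=0.0):
    """Pick a config for a new dataset from the most similar dataset in the log."""
    nearest = min({m for m, *_ in HISTORY}, key=lambda m: distance(m, meta))
    rows = [r for r in HISTORY if r[0] == nearest]
    # Trade accuracy against log10 of the recorded training time.
    best = max(rows, key=lambda r: r[2] - time_weight * math.log10(r[3]))
    return best[1]


print(recommend((2_000, 12)))                       # forest
print(recommend((2_000, 12), time_weight=0.1))      # linear
print(recommend((500_000, 150), time_weight=0.1))   # linear
```

Because the timings are recorded next to the accuracies, the same log can answer "what is most accurate?" and "what is good enough and fast?" without running anything new.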