HN Mirror

by dimatura 2604 days ago

I'm old enough to have started doing machine learning before the deep learning revolution. When it happened, I predicted we'd see a resurgence in metaheuristics (of which ACO, used here, is an example), which, just like neural networks in the pre-deep learning era, have a poor reputation among researchers. And it pretty much happened, though at first disguised as "Bayesian optimization" [1].

I took an undergrad course in evolutionary optimization, and became briefly excited about it, so I'm fairly familiar with the ideas in that area. I think that similarly to neural networks, they don't really have a great mathematical basis to say what will work and what won't -- it's a lot of empirical experimentation. Do they work? Yeah, kinda. There are really few other options when you have to deal with large, combinatorial spaces such as neural network architectures. I do think a lot of research in the metaheuristics area, at least as of a few years ago (I haven't really kept up with it), is pretty bogus -- I lampooned it in a couple of "papers" (http://oneweirdkerneltrick.com/spectral.pdf and http://oneweirdkerneltrick.com/catbasis.pdf). Yes, all the citations are real.

[1] Bayesian optimization is great, though I find it amusing that people who wouldn't touch a genetic or swarm algorithm are totally fine with BO when it's really not that different.

3 comments

_raoulcousins 2603 days ago

There are some people fighting the good fight for more disciplined metaheuristics research. Metaheuristics - the Metaphor Exposed is a nice read (https://www.cs.ubc.ca/~hutter/EARG.shtml/stack/2013_Sorensen...). The author is spot on that a lot of the nature-inspired algorithms don't even make sense as metaphors and they're just obfuscating the descriptions of algorithms unnecessarily.

dimatura 2603 days ago

Looks like an interesting paper - I agree with your single-sentence summary, at least. I'll check it out.

dimatura 2603 days ago

Update: read the paper. It's spot on, great read.

p1esk 2604 days ago

What do you mean they "kinda" work? NAS is all the rage these days. SOTA on ImageNet [1], SOTA for mobile [2]. Still needs a ton of GPUs, but search algorithms are getting smarter every month.

PS I have to admit, your papers made me laugh :)

[1] https://arxiv.org/abs/1811.06965

[2] http://www.arxiv-sanity.com/1807.11626v2

dimatura 2603 days ago

Oh yeah, can't argue with results. Similar to deep learning. I've used both deep learning and metaheuristics a fair amount -- I don't care too much about mathematical rigor ;). I just mean, it's the sort of thing that usually needs experimentation, domain knowledge and maybe a bit of luck.

hooloovoo_zoo 2604 days ago

I want to know what [1]'s CIFAR transfer results are w/o cutout.

p1esk 2604 days ago

FYI, they compare their cifar results to [1], which is more effective than plain cutout.

[1] https://arxiv.org/abs/1805.09501
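
For readers who haven't met cutout: it is a data augmentation that masks out a random square patch of each training image. Below is a minimal pure-Python sketch, assuming a grayscale image stored as a list of rows. The names `cutout`, `size` and `rng` are illustrative and not taken from either paper, and real pipelines would typically operate on tensors instead.

```python
import random


def cutout(image, size, rng=random):
    """Return a copy of `image` (a list of rows) with one square patch zeroed.

    The patch centre is drawn uniformly over the image and the patch is
    clipped at the borders, so fewer than size * size pixels may be masked.
    """
    height, width = len(image), len(image[0])
    cy, cx = rng.randrange(height), rng.randrange(width)
    top, left = max(0, cy - size // 2), max(0, cx - size // 2)
    bottom, right = min(height, cy + size // 2), min(width, cx + size // 2)
    out = [row[:] for row in image]  # copy so the input is left untouched
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = 0
    return out


# Toy usage: an 8x8 image of ones with a 4x4 patch masked out.
image = [[1] * 8 for _ in range(8)]
masked = cutout(image, size=4, rng=random.Random(0))
```

Whether a baseline was trained with or without a step like this is the kind of detail that makes the comparisons in this subthread hard to read.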

hooloovoo_zoo 2604 days ago

Heh, I'm familiar with this one too. It implies that, for instance, the Shake-Shake and Shake-Drop papers employ cutout, which they don't report. It's hard to make apples to apples comparisons when they're changing lots of things at the same time.

mv4 2604 days ago

"The normal (left) and paranormal (right) distributions."

Your papers are fantastic.

dimatura 2603 days ago

:) I forgot to say "our papers" -- credit also goes to the coauthor.