Hacker News
by darkmighty 4072 days ago
I really dislike the term "Black box optimization". There's no such thing. You have to make assumptions about your function, so in the end this is just rewarding people whose optimizers happen to match the chosen functions; but those functions are not made explicit whatsoever. That doesn't make any sense.

For example, if the inputs and outputs are floating point numbers, then you can assume the domain and range are [-M, M]. Otherwise, with even the most clever optimizer you have no guarantee of ever approaching the optimum, even if the function is continuous. Now even with a limited range there are no guarantees if the function is not well behaved -- so you have to again assume the function is well behaved. And for any assumption you make there is a class of functions for which it is terrible. There is no best assumption, or best algorithm, then. You could, for instance, assume the function is adversarial (trying to make your life difficult), for which the best algorithm is perhaps just sampling the range at random, which is really a terrible algorithm -- but that's of course just another assumption, and a terrible one.

I would much prefer 'Typical function optimization', if you're optimizing unlabeled functions so frequently, or at least not try to hide the inevitable assumptions.

TL;DR: The contest may be useful, but the concept of "Black box optimization" is nonsense.

The domain is the unit cube [0, 1]^d using double precision floating point, see the documentation.

Making assumptions and testing them is very much part of the contest. You are even allowed to do this interactively.
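
To make the setting concrete, here is a minimal sketch of the simplest baseline on that domain: uniform random search over [0, 1]^d under a fixed evaluation budget. The `sphere` objective and every name below are my own stand-ins for illustration, not the contest's API or its test functions.

```python
import random

def sphere(x):
    """Hypothetical stand-in objective; the contest's functions are hidden."""
    return sum((xi - 0.5) ** 2 for xi in x)

def random_search(objective, dim, budget, seed=0):
    """Evaluate `budget` uniform samples from [0, 1]^dim and keep the best."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(budget):
        x = [rng.random() for _ in range(dim)]
        fx = objective(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

best_x, best_f = random_search(sphere, dim=3, budget=200)
print(best_x, best_f)
```

Anything smarter than this has to bet on some structure in the objective, and that bet is exactly the kind of assumption you get to test.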

Yes, there is such a thing.

There exist many more techniques than trivially assuming some "template" function and fitting the function parameters against the data.

Have a look at nonparametric modelling techniques. For example, kernel regression or Gaussian processes. You either don't make any assumptions, or you take an uninformative prior that distributes over all possible results.
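
For anyone who hasn't seen it, a rough sketch of kernel regression in its Nadaraya-Watson form looks like the following. The function names, the Gaussian kernel and the `bandwidth` value are choices I made for the example, not anything prescribed by the contest.

```python
import math

def gaussian_kernel(u):
    """Unnormalised Gaussian kernel; the constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)

def kernel_regression(xs, ys, x, bandwidth):
    """Nadaraya-Watson estimate at x: a kernel-weighted average of the ys."""
    weights = [gaussian_kernel((x - xi) / bandwidth) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed (x is far from every sample): fall back to the mean.
        return sum(ys) / len(ys)
    return sum(w * y for w, y in zip(weights, ys)) / total

# Illustrative data: samples of a made-up objective on [0, 1].
xs = [i / 10 for i in range(11)]
ys = [(x - 0.3) ** 2 for x in xs]
estimate = kernel_regression(xs, ys, 0.3, bandwidth=0.05)
print(estimate)
```

There is no template function being fitted here; the estimate is just a weighted average of the observed values, so it always lies between min(ys) and max(ys). The kernel and the bandwidth are the only things you pick.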

This competition evokes modelling, optimisation and the exploration/exploitation tradeoff. I'm sure there will be very interesting theory behind the winning entries...

The point is, I don't even need to look up your techniques (although I did out of respect) to know there really isn't such a case; what I stated is a simple, almost trivial principle (apparently it has a name [1] as some pointed out).

Mathematics models data, and you can't model without assumptions. It's like developing a theory which can't have axioms. For example, the probabilistic model behind kernel regression is a terrible model (assumption), with very large error for a large class of distributions [2], and so on. We're talking about picking the best technique; this technique is going to pick some assumptions arbitrarily that will or will not work well based on an unclear choice of the organizers. That's why I would prefer if they stated instead "Functions with some real world relevance", or "Typical functions", or maybe "Poorly behaved functions", and so on.

[1] http://en.wikipedia.org/wiki/No_free_lunch_in_search_and_opt...

[2] On the Wikipedia page you can see they do make assumptions on f to minimize the squared error when choosing the kernel. It's inevitable.

You are fighting a mathematically pure interpretation of black boxes that make no assumptions at all. Your observations are correct. But nobody actually interprets the term "black box" the way you deem wrong.

Taken from here [1]:

White-box models: This is the case when a model is perfectly known; it has been possible to construct it entirely from prior knowledge and physical insight.

Grey-box models: This is the case when some physical insight is available, but several parameters remain to be determined from observed data. It is useful to consider two subcases.

1. Physical modeling: A model structure can be built on physical grounds, which has a certain number of parameters to be estimated from data. This could, for example, be a state-space model of given order and structure.

2. Semiphysical modeling. Physical insight is used to suggest certain nonlinear combinations of measured data signal. These new signals are then subjected to model structures of black-box character.

Black-box models: No physical insight is available or used, but the chosen model structure belongs to families that are known to have good flexibility and have been 'successful in the past'.

[1] http://www.sciencedirect.com/science/article/pii/00051098950...

Fair enough. I wasn't familiar with the literature, to be honest; it was just a remark.

I still dislike the term and concept, but it's hard to argue with a conventional definition. I believe assumptions should be made as clear as possible and the term seems like a futile attempt at hiding them.

Look at it this way: Many interesting problems in engineering have expensive to evaluate objectives with generally unknown structure and noisy multi-modal results, but are still piece-wise smooth. It's true that in the space of all possible functions virtually none meet these criteria, but many practically interesting ones do.

If your function really is a random oracle, then, indeed, no optimizer will do well against it. OTOH, none will do (relatively) poorly either.
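
A quick way to see this, as a sketch under my own assumptions: fake a random oracle by hashing the input, then compare pure random search with a simple hill climber over many independent oracles. The names, step size and budgets below are illustrative choices, not anything from the contest.

```python
import hashlib
import random
import struct

def random_oracle(x, salt):
    """Deterministic pseudo-random value in [0, 1) with no usable structure in x."""
    digest = hashlib.sha256(struct.pack("<dq", x, salt)).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def random_search(f, budget, rng):
    """Best (lowest) value over `budget` uniform samples from [0, 1)."""
    return min(f(rng.random()) for _ in range(budget))

def hill_climb(f, budget, rng, step=0.05):
    """Greedy local search: propose a nearby point, move only if it improves."""
    x = rng.random()
    fx = f(x)
    for _ in range(budget - 1):
        cand = (x + rng.gauss(0.0, step)) % 1.0  # wrap around the unit interval
        fc = f(cand)
        if fc < fx:
            x, fx = cand, fc
    return fx

def average_best(search, trials=300, budget=20):
    """Average best value found, using a fresh oracle (salt) for each trial."""
    rng = random.Random(1)
    total = 0.0
    for salt in range(trials):
        total += search(lambda x: random_oracle(x, salt), budget, rng)
    return total / trials

avg_random = average_best(random_search)
avg_climb = average_best(hill_climb)
print(avg_random, avg_climb)
```

Each evaluation is effectively a fresh uniform draw no matter how the point was chosen, so both strategies amount to taking the minimum of 20 independent draws, and the two averages should come out close to each other.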

Effective optimization techniques can explore a function generally and exploit similarities to known models or at least any smoothness they can find. Ineffective techniques will just get caught in local minima or fail to exploit smoothness or "obvious" structure.

Powerful "generic" optimizers are a tool which is important for industry. But the common ways they are benchmarked potentially allow for overfitting in the design phase. This contest is intended to correct that and provide a potentially better assessment of how general these optimizers are.

Maybe a better term would be "blind" rather than black-box. I think the goal is simply to hold optimization to the same level of reproducibility that is expected of most scientific fields today, and if a researcher is allowed to introduce a hundred tunable parameters that make their algorithm converge on all the standard test cases then they haven't created a reproducible optimizer - they have created a benchmark solver.

what is "typical" for one can be "rare" for the other; "black box" suggests that the participants do not know what is inside; the organizers, on their side, should make sure that the content is of some interest for "real-world problems/applications"

what you are describing is related to the "no free lunch theorem", something one can attempt to deal with to get things working "in practice"

The organizers making sure it has some real world relevance is what I would equate with problems being "typical". In practice, you may find ill-characterized "Typical problems" and solve them, but as I said, a truly "Black box optimization" would not make sense; hence I dislike the term (and the general problem statement).