| Constraint programming can be viewed as looking for feasible solutions to optimization problems. So, what is optimization? For positive integers m and n, the real numbers R, and functions f: R^n --> R g: R^n --> R^m find x in R^n to (problem MP1) maximize f(x) subject to g(x) <= 0 where g(x) is regarded as an m x 1 matrix
and the 0 is also m x 1 but with all its components 0. The constraints are the g(x) <= 0 We say that we have m constraints. The subject is essentially the same
using g(x) >= 0 or g(x) = 0 instead or minimizing instead of
maximizing. Point x in R^n is feasible provided
it satisfies the constraints, that is, g(x) <= 0 The feasible region is the set
F of all feasible x. A point x is optimal if it is
feasible and makes f(x) as large
as possible. That is, x is
optimal provided for any feasible
y we have f(x) >= f(y). If one or more of the functions
f or g is non-linear, then we
have non-linear programming.
Here we may use the Kuhn-Tucker
conditions to seek a point
that satisfies necessary conditions
for optimality. For an algorithm,
we may take gradients of the
constraints to set up a linear
program, solve that, do a line search,
and try again. If functions f and g are linear, easily written using matrix algebra, then the optimization problem is linear programming (here programming is used in the British sense of operational planning). Function f is convex provided for
any x, y in R^n and any t in [0, 1]
we have t f(x) + (1 - t) f(y) >= f( t x + (1 - t) y ) That is, in a graph of f, a straight line
between (x, f(x) and (y, f(y) ) is on or
over the graph and never under the
graph. So, the straight line over
approximates the function. If f is convex, then it is continuous (proof is not very difficult but is in Fleming, Functions of Several Variables). IIRC convex implies differentiable almost everywhere (maybe proof in R. T. Rockafellar, Convex Analysis). The epigraph is the graph of f and everything above it, and it is a convex set so has supporting hyperplanes and is the intersection of all the closed half spaces from its supporting (tangent) hyperplanes (IIRC, proof in Fleming). Such a
supporting hyperplane is called a
subgradient and can be quite useful
in optimization, e.g., Lagrangian
relaxation. There is a closely related,
really nice result in Hilbert space: Every
non-empty, closed convex subset has a unique
element of minimum norm. That can
be very nice to know! To be more clear, a convex function, while
continuous, need not be differentiable.
Why not? Intuitively, at a point, the
function makes a sharp turn. Similarly, a cube is a convex set but has 10 sharp
edges and 8 sharp points that have supporting (tangent) planes, but these planes are not unique. So, a subgradient at a point would be a gradient and a
tangent plane if at that point it was
the only tangent plane. So, intuitively,
a subgradient is nearly a tangent
plane. Function f is concave provided
that -f is convex. If f is convex
and -f is convex, then f is linear --
so, intuitively, a function that
is convex (or concave)
is half of being linear. We have some astoundingly
powerful methods for solving
linear programming problems. A significant fraction of the
Nobel prizes in economics
have been from economic
models based on linear
programming. Some special cases, e.g.,
least cost network flows,
maximum matchings,
are special cases with
even better results for the
algorithms. If in MP1 function f is concave and
funtion g is
linear, then again we have some
good methods. If in MP1 function f is quadratic
and concave and function g is
linear, then we have the problem
of Markowitz in investment
portfolio management and his
Nobel prize in economics. If we are maximizing a
convex function subject to
linear constraints, then we
have, IIRC, a problem in NP-complete. If in linear programming
we require in addition that some or all of the components
of x are required to be integers,
then we have some integer
constraints and a problem in
integer linear programming (ILP).
ILP was one of the earliest
problems shown to be in
NP-complete. So, that's a nutshell version of optimization. Can also extend to accommodating randomness and
doing much the same over time. The over time part is control theory,
e.g., how to fly an airplane from one point
to another in least time -- there are books, e.g., by Athans and Falb. Mike Athans was long at MIT and Falb, at the Brown Division of Applied Mathematics. Control theory
had a flurry of interest in the space
program. There is an extension, differential game theory, which
is like how best for a missile to
chase an airplane where the airplane
is best trying to avoid the missile.
Typically in control theory and
differential game theory we look
for necessary conditions for
optimality which usually turn out
to be in terms of Lagrange
multipliers.
Optimization over
time with randomness is essentially
stochastic optimal control, and the
main technique is
a version of dynamic programming.
An early proponent was Richard
Bellman. The Black-Scholes Nobel prize work in
option pricing is a relatively
simple case of stochastic optimal
control. So, in terms of MP1, constraint programming is, say,
find x in R^n so that g(x) <= 0 that is, just find a point in the feasible
region R. An example might be, yesterday
killed 5000 hogs and hung and chilled
the results. Today cut the hogs
and put the pieces in boxes
in 18 wheel trucks, about 40,000
pounds of load per truck, for
delivery to customers about 900
miles away. Have only three bays in the loading
dock. So, want to know the order
in which to load the trucks to
get them all loaded in time. So, have a constraint satisfacation
problem and not really an optimization
problem. Fundamentally, i.e., in the sense
of computational complexity,
finding a feasible
point is as difficult as finding
an optimal point.
So, really, fundamentally,
constraint programming
is no easier than optimization. IIRC at one time SAP and ILOG
were interested in CPLEX as
means of solving constraint
satisfaction problems. So,
to solve a constraint satisfaction
problem, start with some
optimization software. |