Hacker News new | ask | show | jobs
by whatidonteven 3218 days ago
How can a non-linear function even be convex in shape? I assume you mean the whole volume below or above the function and not just the function's surface itself?

Also, what about the case where the function isn't continuous or where it's not defined everywhere (the surface has holes)?

2 comments

> How can a non-linear function even be convex in shape?

I'm not sure what you mean. Apart from the linear case (which is weakly convex), most convex functions are non-linear. So yes, it is not only possible, it is the norm (in a colloquial sense). Refer to this for a mathematical definition of convexity: https://en.wikipedia.org/wiki/Convex_function

> Also, what about the case where the function isn't continuous or where it's not defined everywhere (the surface has holes)?

There are two different cases:

1) Discontinuous functions: these are by definition nonconvex e.g. step functions. Gradient-descent methods cannot handle these directly; typically they are modeled as mixed-integer problems.

2) Non-smooth functions: are convex but do not have derivatives defined everywhere. e.g. abs(x). Gradient-descent methods don't work well on these types of functions. These typically require subgradient/bundle methods, or can be modeled as discontinuous functions.

yes, it's not smooth by definition

   d/dx √(x²) = x/√(x²)
because for x=0 the derivative is 0/0, but in this case it might be interpreted as the interval [-1,1]. That's just the way it is often plotted. I come up short with an algebraic explanation, it might as well be ]-∞,∞[ (deriving it from z(x,y)=1/y for example), but I imagine this as a bundle of tangents on the origin, parameterized by the interval. That's an infinitesimal curve, not just an infinitesimal slope if you will. So it's not linear algebra (I guess). The integral of the derivative is obviously defined everywhere, explain that. It's zero at zero, because the sum of the interval is zero. And if we are only interested in the boundaries of the interval at 0, than that's the tuple (-1,1). Just like complex numbers or vectors are tuples (but this isn't a complex number I guess).
Ah, gotcha, so it's not the graph of the function that convexity refers to but the volume above the graph of the function.
Well, no, in this scenario, it is actually the function (in your words, the graph of the function) that is convex. A 2D example would be y = x^2 (a parabola), which is a convex function. A 3D example would be a paraboloid function, which is also a convex function.

The "volume" (or "area" in the 2D case) above the graph is called an epigraph.

One property of convex functions is that their epigraphs are convex sets (note the word "sets" this time). https://en.wikipedia.org/wiki/Epigraph_(mathematics)

Convex sets are more abstract in meaning, but in general in means can draw a straight-line between any two points in the region without going outside of the region.

Perhaps your notion of convexity comes from a mental idea of the shapes of convex and concave lenses? Those are good visualizations but in mathematics, convexity has a subtler, more rigorous meaning. With this rigorous meaning comes many nice mathematical properties that make optimizing them easier than nonconvex functions.

one definition of a convex function is that its epigraph

    { (x, y) : f(x) <= y }
is a convex set. Another is that the line segment between two points on the graph lies above the graph, i.e.

    (1 - t) f(x) + t f(y) >= f((1 - t) x + t y) for all 0 <= t <= 1
known as Jensen's inequality.

Convex function must be defined on a convex domain (no holes) and continuous at everywhere except the boundary of the domain.