Convex sets, convex functions, and Jensen's inequality
Convexity is the single most important structural property in optimization. A set is convex if the line segment between any two points in the set stays entirely inside the set. A function is convex if the chord between any two points on its graph lies on or above the graph. These simple definitions have profound consequences: in a convex optimization problem, every local minimum is a global minimum, so there are no local-minima traps.
Test whether various 2D shapes are convex by sampling random line segments. A circle and a square are convex. A star and an L-shape are not -- some line segments escape.
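The sampling idea behind this demo can be sketched in a few lines: pick random pairs of points inside a shape and check that every point along the joining segment stays inside. The shapes and parameters below are illustrative choices, not part of the demo itself.

```python
import random

def is_convex(inside, sampler, trials=2000, steps=20):
    """Probabilistic convexity check: sample pairs of points in the set
    and verify the segment between them never leaves the set."""
    for _ in range(trials):
        # rejection-sample two points that lie inside the set
        a = sampler()
        while not inside(a):
            a = sampler()
        b = sampler()
        while not inside(b):
            b = sampler()
        for i in range(1, steps):
            t = i / steps
            p = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
            if not inside(p):
                return False  # a segment escaped: not convex
    return True

random.seed(0)
unit = lambda: (random.uniform(-1, 1), random.uniform(-1, 1))
# A disk is convex; an L-shape (unit square minus its top-right corner) is not.
circle = lambda p: p[0] ** 2 + p[1] ** 2 <= 1
l_shape = lambda p: (0 <= p[0] <= 1 and 0 <= p[1] <= 1
                     and not (p[0] > 0.5 and p[1] > 0.5))

print(is_convex(circle, unit))   # True
print(is_convex(l_shape, unit))  # False: some segment cuts the notch
```

Being a randomized test, this can only ever certify non-convexity; for the L-shape, segments between the two arms of the L reliably cross the missing corner.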
Key insight: The intersection of any collection of convex sets is convex. This is why feasible regions of linear programs (intersections of half-planes) are always convex.
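A quick numerical illustration of this insight, using a toy feasible region (the half-plane coefficients here are made up for the example): any convex combination of two feasible points remains feasible.

```python
import random

# Feasible region of a toy linear program: the intersection of
# half-planes a·x <= b. Coefficients are illustrative only.
halfplanes = [((1.0, 1.0), 1.5),    # x + y <= 1.5
              ((-1.0, 0.0), 0.0),   # x >= 0
              ((0.0, -1.0), 0.0)]   # y >= 0

def feasible(p):
    return all(a[0] * p[0] + a[1] * p[1] <= b for a, b in halfplanes)

random.seed(1)
def sample_feasible():
    while True:
        p = (random.uniform(0, 1.5), random.uniform(0, 1.5))
        if feasible(p):
            return p

# Every point on a segment between feasible points is feasible,
# because each half-plane constraint is preserved by averaging.
ok = True
for _ in range(1000):
    a, b = sample_feasible(), sample_feasible()
    for i in range(1, 10):
        t = i / 10
        m = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        ok = ok and feasible(m)
print(ok)  # True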
A function is convex if the chord between any two points on its graph lies on or above the graph. Equivalently, for a twice-differentiable function, f''(x) ≥ 0 everywhere. Drag the sample points to verify convexity visually.
The dashed chord connects f(a) and f(b). If the chord lies above the graph, the function is convex on that interval.
Key insight: For differentiable functions, convexity means the tangent line always lies below the graph. For twice-differentiable functions, convexity is equivalent to a non-negative second derivative.
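The tangent-line characterization, f(y) ≥ f(x₀) + f'(x₀)(y − x₀) for all x₀ and y, is easy to check numerically. Here is a minimal sketch using f(x) = x², whose derivative we hard-code:

```python
# First-order convexity check for f(x) = x**2:
# the tangent line at any point lies below the graph.
def f(x):
    return x * x

def fprime(x):
    return 2 * x

xs = [i / 10 - 5 for i in range(101)]  # grid on [-5, 5]
# Small tolerance absorbs floating-point rounding.
ok = all(f(y) >= f(x0) + fprime(x0) * (y - x0) - 1e-12
         for x0 in xs for y in xs)
print(ok)  # True
```

For f(x) = x² the gap works out exactly: f(y) − f(x₀) − 2x₀(y − x₀) = (y − x₀)² ≥ 0.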
Jensen's inequality says that for a convex function f, f(E[X]) ≤ E[f(X)]. The function of the average is at most the average of the function. Drag the points and weights to see the gap.
For convex f: f(E[X]) ≤ E[f(X)]. For concave f the inequality reverses. The gap between the two points shows the strength of the inequality.
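The discrete version of Jensen's inequality, with points and weights you could drag in the demo, can be verified directly. The distribution below (points, weights, and the choice f(x) = eˣ) is an arbitrary illustration:

```python
import math

# Jensen's inequality for convex f(x) = exp(x) on a discrete distribution.
xs = [-1.0, 0.5, 2.0]
ws = [0.2, 0.5, 0.3]          # probabilities, summing to 1

mean = sum(w * x for w, x in zip(ws, xs))                  # E[X]
f_of_mean = math.exp(mean)                                 # f(E[X])
mean_of_f = sum(w * math.exp(x) for w, x in zip(ws, xs))   # E[f(X)]

print(f_of_mean <= mean_of_f)   # True: f(E[X]) <= E[f(X)]
print(mean_of_f - f_of_mean)    # the Jensen gap
```

The gap closes only when the distribution collapses to a single point or f is linear over the support.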
Key insight: Jensen's inequality underlies many results in information theory, statistics, and finance. It explains why diversification reduces risk and why the AM-GM inequality holds.
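The AM-GM connection follows from applying Jensen to the concave function log: log(E[X]) ≥ E[log X], and exponentiating gives arithmetic mean ≥ geometric mean. A one-line check on sample data:

```python
import math

# AM-GM from Jensen: log is concave, so log(E[X]) >= E[log X],
# i.e. the arithmetic mean dominates the geometric mean.
xs = [1.0, 4.0, 9.0, 16.0]   # any positive numbers work
am = sum(xs) / len(xs)
gm = math.exp(sum(math.log(x) for x in xs) / len(xs))
print(am >= gm)  # True
```

Equality holds exactly when all the numbers are equal, matching the case where Jensen's gap vanishes.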