# 46 Convex sets and convex functions

## 46.1 Convexity

**Definition 46.1 (Convex sets and convex functions)** A subset \(C\) of \(\reals^n\) is **convex** if for every \(x_0, x_1 \in C\), the line segment \([x_0, x_1] \in C\), i.e., \[ (1-λ) x_0 + λ x_1 \in C \quad \text{for all } λ \in (0,1). \]

A function \(f\) on a convex set \(C\) is **convex** relative to \(C\) if for every \(x_0, x_1 \in C\), one has \[
f( (1-λ)x_0 + λ x_1 ) \le (1-λ) f(x_0) + λ f(x_1)
\quad \text{for all } λ \in (0,1).
\]

\(f\) is **strictly convex** if this inequality is strict for points \(x_0 \neq x_1\) with \(f(x_0)\) and \(f(x_1)\) finite.

A function is said to be **concave** when \(-f\) is convex.

## 46.2 Convex combinations

A **convex combination** of elements \(x_0, x_1, \dots, x_p \in \reals^n\) is a linear combination \(\sum_{i=0}^p λ_i x_i\) where the coefficients \(λ_i\) are non-negative and add to one. In case of two elements, convex combinations can be equivalently expressed as \((1-λ)x_0 + λx_1\) with \(λ \in [0,1]\), which we have already seen. The next result shows that the definition of convexity in terms of two elements given earlier generalizes to multiple elements.

**Proposition 46.1** A set \(C \subset \reals^n\) is convex if and only if \(C\) contains all convex combinations of its elements.

A function \(f\) is convex relative to a convex set \(C\) if and only if for every choice of points \(x_0, \dots, x_p \in C\), we have: \[\begin{equation}\label{eq:Jensen-inequality} f\biggl( \sum_{i=0}^p λ_i x_i \biggr) \le \sum_{i=0}^p λ_i f(x_i) \quad \text{when } λ_i \ge 0, \sum_{i=0}^p λ_i = 1. \end{equation}\]

Inequality \eqref{eq:Jensen-inequality} is also call **Jensen’s inequality**.

In many applications, it is convenient to consider function \(f\) that are allowed to be *extended-real-valued*, i.e., take values in \(\bar \reals = [-∞, ∞]\) instead of just \(\reals = (-∞, ∞)\). The extended real line has all the properties of a compact interval: every subset \(A \subset \bar \reals\) has a supremum and an infimum, either of which could be infinite.

When extending arithmetic operations to the extended real line, most rules extend in a natural manner but we have to be careful with two operations: \(0 ⋅ ∞\) and \(∞ - ∞\). We will define \[ 0 ⋅ ∞ = 0 = 0 ⋅ (-∞). \]

For convexity, one follows the *inf-addition* convention: \[
∞ + (-∞) = (-∞) + ∞ = ∞.
\]

Extended arithematic then obeys the associative, commutative, and distributive laws of ordinary arithematic with one crucial exception: \[ λ ⋅ (∞ - ∞) \neq (λ ⋅ ∞ - λ ⋅ ∞) \quad \text{when } λ < 0. \]

The definition of convex functions can be generalized to function defined over the extended real line as long as we invoke the *inf-addition* convention.

Any convex function \(f\) on a convex set \(C \in \reals^n\) can be identified with a convex function on all of \(\reals^n\) by defining \(f(x) = ∞\) for all \(x \not\in C\). Such functions are called **proper** convex functions (in contrast to **improper** convex functions which take the value \(f(x) = -∞\) for all \(x \in \text{int}(\text{dom} f)\).

## 46.3 Properties of convex function

**Proposition 46.2** Let \(I\) be an index set. Then:

**Intersection of convex functions is convex**:

\(\bigcap_{i \in I} C_i\) is convex if each set \(C_i\) is convex.**Supremum of convex functions is convex:**

\(\sup_{i \in I} f_i\) is convex if each function \(f_i\) is convex.**Supremum of finite number of strictly convex functions is strictly convex:**

\(\sup_{i \in I} f_i\) is strictly convex if each function \(f_i\) is strictly convex and \(I\) is finite.**Pointwise supremum of a collection of convex functions is convex:**

\(f\) is convex if \(f(x) = \lim\sup_{i \in I} f_i(x)\) for all \(x\) and each \(f_i\) is convex.

The next result presents equivalent characterizations of convexity.

**Proposition 46.3** For a differentiable function \(f\) on an open convex set \(O \subset \reals^n\), each of the following is both necessary and sufficient for \(f\) to be convex on \(O\):

\(\langle x_1 - x_0, \GRAD f(x_1) - \GRAD f(x_0) \rangle \ge 0\) for all \(x_0, x_1 \in O\).

\(f(y) \ge f(x) + \langle \GRAD f(x), y - x \rangle\) for all \(x, y \in O\).

\(\GRAD^2 f(x)\) is pointwise-semidefinite for all \(x \in O\) (\(f\) twice differentiable).

## Notes

The material of these notes is adapted from Rockafellar and Wets (2009).