46 Positive definite matrices
46.1 Definition and basic properties
Definition 46.1 An \(n \times n\) symmetric matrix \(M\) is called
positive definite (written as \(M \succ 0\)) if for all \(x \in \reals^n\), \(x \neq 0\), we have \[x^\TRANS M x > 0.\]
positive semi-definite (written as \(M \succeq 0\)) if for all \(x \in \reals^n\), we have \[x^\TRANS M x \ge 0.\]
Examples
\(\MATRIX{ 3 & 0 \\ 0 & 2 } \succ 0\) because \(\MATRIX{ x_1 & x_2 } \MATRIX{ 3 & 0 \\ 0 & 2 } \MATRIX{ x_1 \\ x_2 } = 3 x_1^2 + 2 x_2^2 > 0\) for all \((x_1, x_2) \neq (0, 0)\).
\(\MATRIX{ 0 & 0 \\ 0 & 2 } \succeq 0\) because \(\MATRIX{ x_1 & x_2 } \MATRIX{ 0 & 0 \\ 0 & 2 } \MATRIX{ x_1 \\ x_2 } = 2 x_2^2 \ge 0\).
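These checks can also be done numerically via the eigenvalues (a NumPy sketch; `is_positive_definite` is an illustrative helper, not a library routine):

```python
import numpy as np

def is_positive_definite(M, tol=1e-12):
    """Check symmetry and strict positivity of all eigenvalues."""
    if not np.allclose(M, M.T):
        return False
    # eigvalsh exploits symmetry and returns real eigenvalues
    return bool(np.all(np.linalg.eigvalsh(M) > tol))

A = np.array([[3.0, 0.0], [0.0, 2.0]])
B = np.array([[0.0, 0.0], [0.0, 2.0]])
print(is_positive_definite(A))  # True
print(is_positive_definite(B))  # False: smallest eigenvalue is 0
```

The second matrix is only positive semi-definite, so the strict test fails on it.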
46.2 Remarks on positive definite matrices
By making particular choices of \(x\) in the definition of positive definite matrix, we have that for a positive definite matrix \(M\),
- \(M_{ii} > 0\) for all \(i\)
- \(|M_{ij}| < \sqrt{M_{ii} M_{jj}}\) for all \(i \neq j\).
However, satisfying these inequalities is not sufficient for positive definiteness.
A symmetric matrix is positive definite (resp. positive semi-definite) if and only if all of its eigenvalues are positive (resp. non-negative).
Therefore, a sufficient condition for a symmetric matrix to be positive definite is that all diagonal elements are positive and the matrix is diagonally dominant, i.e., \(M_{ii} > \sum_{j \neq i} | M_{ij}|\) for all \(i\).
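A small NumPy sketch of this sufficient condition (the helper name `is_diagonally_dominant` is mine):

```python
import numpy as np

def is_diagonally_dominant(M):
    """Strict diagonal dominance: M[i,i] > sum of |off-diagonal| in row i."""
    off = np.sum(np.abs(M), axis=1) - np.abs(np.diag(M))
    return bool(np.all(np.diag(M) > off))

# Symmetric, positive diagonal, diagonally dominant ...
M = np.array([[4.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 5.0]])
print(is_diagonally_dominant(M))          # True
print(np.all(np.linalg.eigvalsh(M) > 0))  # True: hence positive definite
```

The condition is sufficient but not necessary: \(\MATRIX{2 & 1.5 \\ 1.5 & 2}\) is positive definite without being diagonally dominant.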
If \(M\) is symmetric positive definite, then so is \(M^{-1}\).
If \(M\) is symmetric positive definite, then \(M\) has a unique symmetric positive definite square root \(R\) (i.e., \(RR = M\)).
If \(M\) is symmetric positive definite, then \(M\) has a unique Cholesky factorization \(M = T^\TRANS T\), where \(T\) is upper triangular with positive diagonal elements.
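Both factorizations can be computed with standard routines; a NumPy sketch (note that `np.linalg.cholesky` returns the lower-triangular factor, so the upper-triangular \(T\) of the text is its transpose):

```python
import numpy as np

M = np.array([[4.0, 2.0], [2.0, 3.0]])

# Unique symmetric positive definite square root via eigendecomposition
w, P = np.linalg.eigh(M)
R = P @ np.diag(np.sqrt(w)) @ P.T
print(np.allclose(R @ R, M))     # True

# np.linalg.cholesky returns lower-triangular L with M = L @ L.T,
# so T = L.T is the upper-triangular factor in M = T^T T.
T = np.linalg.cholesky(M).T
print(np.allclose(T.T @ T, M))   # True
print(np.all(np.diag(T) > 0))    # True: positive diagonal entries
```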
The set of positive semi-definite matrices forms a convex cone.
Positive semi-definiteness induces a partial order on the convex cone of positive semi-definite matrices. In particular, we say that for two positive semi-definite matrices \(M\) and \(N\) of the same dimension, \(M \succeq N\) if \(M - N\) is positive semi-definite. For this reason, \(M \succ 0\) and \(M \succeq 0\) are often used as short-hand to denote that \(M\) is positive definite and positive semi-definite, respectively.
Let \(M\) be a symmetric square matrix. Let \[ λ_1(M) \ge λ_2(M) \ge \dots \ge λ_n(M) \] denote the ordered (real) eigenvalues of \(M\). Then \[ λ_1(M)I \succeq M \succeq λ_n(M)I. \]
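A numerical spot-check of these bounds (NumPy sketch; `eigvalsh` returns eigenvalues in ascending order, the reverse of the convention above):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
M = A + A.T                       # a generic symmetric matrix
w = np.linalg.eigvalsh(M)         # ascending order
lam_min, lam_max = w[0], w[-1]

# lam_max * I - M and M - lam_min * I should both be positive semi-definite
print(np.all(np.linalg.eigvalsh(lam_max * np.eye(4) - M) >= -1e-10))  # True
print(np.all(np.linalg.eigvalsh(M - lam_min * np.eye(4)) >= -1e-10))  # True
```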
If \(M \succeq N\), then \[ λ_k(M) \ge λ_k(N), \quad k \in \{1, \dots, n\}. \]
If \(M \succeq N \succ 0\), then \[ N^{-1} \succeq M^{-1} \succ 0. \]
If \(M \succeq N\) are \(n × n\) matrices and \(T\) is an \(n × m\) matrix, then \[ T^\TRANS M T \succeq T^\TRANS N T. \]
If \(M, N\) are \(n × n\) positive semi-definite matrices, then \[ \sum_{i=1}^k λ_i(M) λ_{n-i+1}(N) \le \sum_{i=1}^k λ_i(MN) \le \sum_{i=1}^k λ_i(M)λ_i(N), \quad k \in \{1, \dots, n\}. \] Note that this property does not require \(M - N\) to be positive or negative semi-definite.
If \(M \succ 0\) and \(T\) are symmetric matrices of the same size, then \[ TMT + M^{-1} \succeq 2T. \]
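Some of these ordering properties can be spot-checked numerically; a NumPy sketch for the congruence and inversion rules:

```python
import numpy as np

rng = np.random.default_rng(3)
G = rng.standard_normal((3, 3))
N = G @ G.T + np.eye(3)           # positive definite
M = N + np.eye(3)                 # M - N = I, so M ⪰ N
T = rng.standard_normal((3, 2))

def psd(S, tol=-1e-10):
    """Test positive semi-definiteness up to numerical tolerance."""
    return bool(np.all(np.linalg.eigvalsh(S) >= tol))

# Congruence preserves the ordering: T^T M T ⪰ T^T N T
print(psd(T.T @ M @ T - T.T @ N @ T))            # True
# Inversion reverses it: N^{-1} ⪰ M^{-1}
print(psd(np.linalg.inv(N) - np.linalg.inv(M)))  # True
```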
46.3 Schur Complement
Symmetric block matrices of the form
\[ C = \MATRIX{ A & X \\ X^\TRANS & B } \]
often appear in applications. If \(A\) is non-singular, we can write
\[ \MATRIX{A & X \\ X^\TRANS & B } = \MATRIX{I & 0 \\ X^\TRANS A^{-1} & I} \MATRIX{A & 0 \\ 0 & B - X^\TRANS A^{-1} X } \MATRIX{I & A^{-1} X \\ 0 & I } \] which shows that \(C\) is congruent to a block diagonal matrix, which is positive definite if and only if its diagonal blocks are positive definite. Therefore, \(C\) is positive definite if and only if both \(A\) and \(B - X^\TRANS A^{-1} X\) are positive definite. The matrix \(B - X^\TRANS A^{-1} X\) is called the Schur complement of \(A\) in \(C\).
An immediate implication of the above is that \[ \det(C) = \det(A) \det(B - X^\TRANS A^{-1} X). \]
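A NumPy sketch verifying both the positive-definiteness criterion and the determinant identity on an example:

```python
import numpy as np

A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[3.0, 1.0], [1.0, 2.0]])
X = np.array([[0.2, 0.1], [0.0, 0.3]])
C = np.block([[A, X], [X.T, B]])

S = B - X.T @ np.linalg.solve(A, X)   # Schur complement of A in C

def pd(M):
    """Positive definiteness via the eigenvalue test."""
    return bool(np.all(np.linalg.eigvalsh(M) > 0))

# C is positive definite iff both A and S are
print(pd(C), pd(A) and pd(S))         # True True
print(np.isclose(np.linalg.det(C),
                 np.linalg.det(A) * np.linalg.det(S)))  # True
```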
46.4 Determinant bounds
Proposition 46.1 (Fischer’s inequality) Suppose \[ M = \MATRIX{A & B \\ B^\TRANS & C} \] is positive semi-definite, where \(A\) and \(C\) are square. Then \[ \det(M) \le \det(A) \det(C). \]
Recursive application of Fischer’s inequality gives Hadamard’s inequality for a symmetric positive definite matrix \(A\): \[ \det(A) \le A_{11} A_{22} \cdots A_{nn}, \] with equality if and only if \(A\) is diagonal.
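A quick numerical illustration of Hadamard’s inequality (NumPy sketch with a randomly generated positive definite matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.standard_normal((4, 4))
A = G @ G.T + np.eye(4)           # symmetric positive definite

# Hadamard: det(A) <= product of diagonal entries
print(np.linalg.det(A) <= np.prod(np.diag(A)))  # True

# Equality holds for a diagonal matrix
D = np.diag(np.diag(A))
print(np.isclose(np.linalg.det(D), np.prod(np.diag(D))))  # True
```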
Proposition 46.2 If \(M \succ N \succ 0\) are \(n × n\) matrices and \(T\) is an \(n × m\) matrix, then \[ \sup_{ T \neq 0} \frac{ \| T^\TRANS M T \| }{ \| T^\TRANS N T \|} \le \frac{ \det(M) }{ \det(N) }, \] where for any matrix \(S\), \[ \| S \| = \sup_{x \neq 0} \frac{ \| S x \|_2 }{ \|x\|_2 } \] is the \(2\)-norm of the matrix.
Proposition 46.2 is taken from Abbasi-Yadkori (2011).
46.5 Matrix functions and semidefinite ordering
There are multiple methods for lifting a function on the reals to a function on matrices. The simplest method, which works for real symmetric (or complex Hermitian) matrices, proceeds via the eigenvalue decomposition, as follows.
Given a function \(f \colon \reals \to \reals\), we can extend it to diagonal matrices by applying the function to each diagonal entry: \[ \bigl[ f(Λ) \bigr]_{ii} = f(Λ_{ii}) \quad \text{for each index $i$}. \]
We can then extend \(f\) to all Hermitian matrices using the eigenvalue decomposition. Given \(A = P Λ P^*\), define \[ f(A) = f(P Λ P^*) = P f(Λ) P^*. \] Thus, each eigenvalue of \(f(A)\) has the form \(f(λ)\), where \(λ\) is an eigenvalue of \(A\).
With the above definition, inequalities for real functions extend to semidefinite relations for matrix functions. In particular, suppose there are functions \(f, g \colon \reals \to \reals\) and a set \(I\) such that \[ f(a) \le g(a), \quad \text{for } a \in I. \] Suppose \(A = P Λ P^*\) is a Hermitian matrix such that all eigenvalues of \(A\) lie in \(I\). Then, \(f(Λ) \preceq g(Λ)\), and therefore \(P f(Λ) P^* \preceq P g(Λ) P^*\). Thus, \[ f(A) \preceq g(A). \]
For example, since \(1 + a \le e^a\) for real \(a\), we have \[ I + A \preceq e^A, \quad \text{for each Hermitian $A$}. \]
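The eigendecomposition-based definition and this inequality are easy to check numerically; a NumPy sketch for real symmetric matrices (`matrix_function` is an illustrative helper):

```python
import numpy as np

def matrix_function(f, A):
    """Apply a scalar function to a symmetric matrix via eigendecomposition."""
    w, P = np.linalg.eigh(A)
    return P @ np.diag(f(w)) @ P.T

rng = np.random.default_rng(2)
G = rng.standard_normal((3, 3))
A = (G + G.T) / 2                 # a generic symmetric matrix

expA = matrix_function(np.exp, A)
# e^A - (I + A) should be positive semi-definite
print(np.all(np.linalg.eigvalsh(expA - np.eye(3) - A) >= -1e-10))  # True
```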
A function \(f\) is called operator monotone if for any Hermitian matrices \(A\) and \(B\), we have \[ A \preceq B \implies f(A) \preceq f(B). \]
A function \(f\) is called operator concave if for all positive definite matrices \(A\) and \(B\) and constants \(α \in [0,1]\), we have \[ αf(A) + (1-α)f(B) \preceq f(αA + (1-α)B). \]
The matrix logarithm is both operator monotone and operator concave. The matrix exponential is neither operator monotone nor operator concave.
References
The properties of positive definite matrices are stated in any book on the theory of matrices. See for example Marshall et al. (2011) and Bhatia (2015).
Historically, a matrix used as a test matrix for testing positive definiteness was the Wilson matrix \[ W = \MATRIX{5 & 7 & 6 & 5 \\ 7 & 10 & 8 & 7 \\ 6 & 8 & 10 & 9 \\ 5 & 7 & 9 & 10}. \] For a nice overview of the Wilson matrix, see this blog post.