47 Matrix trace
The trace of a square \(n × n\) matrix \(A\) is defined as the sum of the diagonal components of the matrix, i.e., \[ \TR(A) = \sum_{i=1}^n A_{ii}. \]
47.1 Properties of trace
For any \(A \in \reals^{n × m}\) and \(B \in \reals^{m × n}\), \[ \TR(AB) = \TR(BA). \]
An immediate consequence of the above is that for any \(A \in \reals^{n × m}\), \(B \in \reals^{m × \ell}\), and \(C \in \reals^{\ell × n}\), \[ \TR(ABC) = \TR(BCA) = \TR(CAB). \]
For \(A, B \in \reals^{n × n}\) and any scalars \(α\) and \(β\), \[ \TR(αA + βB) = α \TR(A) + β \TR(B). \]
In fact, it can be shown that trace is the only function \(f\) that satisfies the following three properties: for any \(A,B \in \reals^{n × n}\) and scalars \(α\), \(β\):
- \(f(αA + βB) = αf(A) + βf(B)\).
- \(f(AB) = f(BA)\).
- \(f(I) = n\).
For a square matrix, trace is equal to the sum of the eigenvalues, i.e., \[ \TR(A) = \sum_{i = 1}^n λ_i(A). \]
Hence, \[ \ABS{\TR(A)} \le \sum_{i=1}^n \ABS{λ_i(A)} \le n ρ(A) \] where \(ρ(A)\) denotes the spectral radius of \(A\).
Recall that a matrix \(A\) is nilpotent if there exists a postive integer \(k\) such that \(A^k = 0\). If \(λ\) is an eigenvalue of \(A\), then \(λ^k\) must be \(0\). Thus, all eigenvalues of a nilpotent matrix are \(0\). Consequently, the trace of a nilpotent matrix is \(0\).
For positive semidefinite matrices (of the same size), trace is sub-multiplicative, i.e., \[ 0 \le \TR(AB) \le \TR(A)\TR(B). \]
Thus, \(\TR(A^2) \le (\TR(A))^2\).
The above can be generalized to arbitrary powers. For positive semidefinite matrices (of the same size), and for any positive integer \(m\): \[ 0 \le \TR((AB)^m) \le (\TR(A^{2m}))^{1/2} (\TR(B^{2m}))^{1/2} \] and \[ \TR((AB)^m) = (\TR(AB))^m. \]
Another generalization is the following. Let \(A_1,\dots,A_m\) be positive semidefinite matrices (of the same size) and \(p_1,\dots, p_m\) are positive numbers such that \[ \frac{1}{p_1} + \cdots + \frac{1}{p_m} = 1. \] Let \(\ABS{A} \coloneqq (A^\TRANS A)^{1/2}\) Then, \[ \TR(\ABS{A_1A_2 \cdots A_m}) \le \prod_{i=1}^m (\TR(A^{p_i}_i))^{1/p_i} \] and \[ \TR(\ABS{A_1A_2 \cdots A_m}) \le \TR\biggl(\sum_{i=1}^m \frac{1}{p_i} A_i^{p_i}\biggr). \]
The next result removes the restriction to positive semidefinite matrices (at the cost of working with absolute values). Let \(A_1,\dots,A_m\) be arbitrary \(n × n\) matrices and \(p_1,\dots, p_m\) are positive numbers such that \[ \frac{1}{p_1} + \cdots + \frac{1}{p_m} = 1. \] Let \(\ABS{A} \coloneqq (A^\TRANS A)^{1/2}\) Then, for any integer \(r \ge 1\), \[\begin{equation} \TR(\ABS{A_1A_2 \cdots A_m}^r) \le \prod_{i=1}^m (\TR(\ABS{A_i}^{r p_i}))^{1/p_i} \tag{H\"older inequality} \end{equation}\] and \[\begin{equation} \TR(\ABS{A_1A_2 \cdots A_m}^r) \le \TR\biggl(\sum_{i=1}^m \frac{1}{p_i} \ABS{A_i}^{rp_i}\biggr). \tag{Young inequality} \end{equation}\]
Let \(f \colon \reals \to \reals\) be a non-increasing function and \(A\) and \(B\) be Hermitian matrices such that \(A \preceq B\). Then, \[ \TR(f(A)) \preceq \TR(f(B)). \] See notes on positive semidefinte matrix to see how to define \(f(A)\), etc.
If \(f \colon [a,b] \to \reals\) is convex, then \(\TR(f(A))\) is convex on the set of all Hermitian matrices whose eigenvalues lie in \([a,b]\).
Let \(H\) be a fixed Hermitian matrix. Then the function \[ A \mapsto \TR(\exp(H + \log(A))) \] is concave on set of positive definite matrices. See (Lieb 1973, Theorem 6) for proof.
Notes
See Yang and Feng (2002) and Shebrawai and Albadawani (2012) for generalizations of the above inequalities.