Section 2.6 Symmetric matrices
For \(C^2\) functions \(f\text{,}\) the equality of mixed partials \(\partial_i \partial_j f = \partial_j \partial_i f\) means that the Hessian matrix \(D^2 f\) introduced in Notation 2.10 is symmetric. In this section we recall some relevant results from Linear Algebra (Algebra 2A) and prove a corollary which we will need later.
Definition 2.27. Orthogonality.
- We denote the standard inner product between vectors in \(\R^N\) by\begin{equation*} \langle x,y\rangle = x \cdot y = x^\top y = x_i y_i\text{.} \end{equation*}
- An orthonormal basis of \(\R^N\) is a basis \(\{q_1,\ldots,q_N\}\) with \(\langle q_i,q_j\rangle = \delta_{ij}\text{.}\)
- An orthogonal matrix is a matrix \(U \in \R^{N \times N}\) whose columns form an orthonormal basis. Equivalently, \(U^{-1} = U^\top\text{.}\)
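For instance, for any angle \(\theta\) the rotation matrix
\begin{equation*}
U = \begin{pmatrix} \cos\theta \amp -\sin\theta \\ \sin\theta \amp \cos\theta \end{pmatrix}
\end{equation*}
is orthogonal: its columns \((\cos\theta,\sin\theta)\) and \((-\sin\theta,\cos\theta)\) are unit vectors whose inner product vanishes, and one can check directly that \(U^\top U = I\text{,}\) i.e. \(U^{-1} = U^\top\text{.}\)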
Theorem 2.28. Spectral theorem for symmetric matrices.
Let \(A \in \R^{N \times N}\) be a symmetric matrix. Then \(A\) is diagonalizable with real eigenvalues \(\lambda_1,\ldots,\lambda_N\) (repeated according to multiplicity). Moreover, the corresponding eigenvectors can be chosen to form an orthonormal basis of \(\R^N\text{.}\) Equivalently, there exists an orthogonal matrix \(U\) and a diagonal matrix \(\Lambda=\operatorname{diag}(\lambda_1,\ldots,\lambda_N)\) such that \(A=U\Lambda U^\top\text{.}\)
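As a concrete example, the symmetric matrix
\begin{equation*}
A = \begin{pmatrix} 2 \amp 1 \\ 1 \amp 2 \end{pmatrix}
\end{equation*}
has eigenvalues \(\lambda_1 = 3\) and \(\lambda_2 = 1\) with orthonormal eigenvectors \(q_1 = \tfrac 1{\sqrt 2}(1,1)\) and \(q_2 = \tfrac 1{\sqrt 2}(1,-1)\text{,}\) so that \(A = U\Lambda U^\top\) with
\begin{equation*}
U = \frac 1{\sqrt 2}\begin{pmatrix} 1 \amp 1 \\ 1 \amp -1 \end{pmatrix},
\qquad
\Lambda = \begin{pmatrix} 3 \amp 0 \\ 0 \amp 1 \end{pmatrix}\text{.}
\end{equation*}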
Corollary 2.29. Bounds on quadratic forms.
Let \(A \in \R^{N \times N}\) be a symmetric matrix, and let \(\lambda_{\min},\lambda_{\max}\) be its smallest and largest eigenvalues. Then, for any \(\xi \in \R^N\text{,}\) we have
\begin{equation}
\lambda_{\min} \abs \xi^2
\le
\langle A\xi,\xi\rangle = A_{ij} \xi_i \xi_j
\le \lambda_{\max} \abs \xi^2\text{.}\tag{2.4}
\end{equation}
Moreover, \(\lambda_{\min}\) is the largest real number such that the first inequality holds for all \(\xi\text{,}\) while \(\lambda_{\max}\) is the smallest real number such that the second inequality holds.
Proof.
Applying Theorem 2.28, let \(q_1,\ldots,q_N\) be an orthonormal basis of eigenvectors associated to the eigenvalues \(\lambda_1,\ldots,\lambda_N\) of \(A\text{.}\) Then any \(\xi \in \R^N\) can be written as \(\alpha_i q_i\) for some \(\alpha \in \R^N\text{,}\) and moreover \(\abs \xi = \abs \alpha\text{.}\) We calculate
\begin{align*}
\langle A\xi,\xi\rangle
\amp =
\langle A(\alpha_i q_i),\alpha_j q_j\rangle \\
\amp =
\langle\alpha_i \lambda_i q_i,\alpha_j q_j\rangle \\
\amp =
\lambda_i \alpha_i \alpha_j \langle q_i,q_j\rangle \\
\amp =
\lambda_i \alpha_i \alpha_j \delta_{ij}\\
\amp =
\lambda_i \alpha_i \alpha_i\text{.}
\end{align*}
Applying the inequalities \(\lambda_{\min} \le \lambda_i \le \lambda_{\max}\text{,}\) we immediately obtain (2.4). The final statement of the corollary then follows by letting \(\xi\) be an eigenvector associated to \(\lambda_{\min}\) or \(\lambda_{\max}\text{,}\) in which case the corresponding inequality in (2.4) holds with equality.
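To illustrate Corollary 2.29, consider again the matrix \(A\) from the example following Theorem 2.28, which has \(\lambda_{\min}=1\) and \(\lambda_{\max}=3\text{.}\) For \(\xi \in \R^2\) we have
\begin{equation*}
\langle A\xi,\xi\rangle = 2\xi_1^2 + 2\xi_1\xi_2 + 2\xi_2^2\text{,}
\end{equation*}
and indeed
\begin{equation*}
\langle A\xi,\xi\rangle - \abs \xi^2 = (\xi_1+\xi_2)^2 \ge 0,
\qquad
3\abs \xi^2 - \langle A\xi,\xi\rangle = (\xi_1-\xi_2)^2 \ge 0\text{,}
\end{equation*}
with equality precisely when \(\xi\) is parallel to the corresponding eigenvector.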
Definition 2.30. Definite and semi-definite.
Let \(A \in \R^{N \times N}\) be symmetric. We call \(A\)
- positive definite if all its eigenvalues are \(\gt0\text{,}\)
- negative definite if all its eigenvalues are \(\lt0\text{,}\)
- positive semi-definite if all its eigenvalues are \(\ge0\text{,}\)
- negative semi-definite if all its eigenvalues are \(\le0\text{,}\)
- and indefinite if it has both positive and negative eigenvalues.
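For example, since the eigenvalues of a diagonal matrix are simply its diagonal entries, \(\operatorname{diag}(1,2)\) is positive definite, \(\operatorname{diag}(1,0)\) is positive semi-definite but not positive definite, and \(\operatorname{diag}(1,-1)\) is indefinite.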
Lemma 2.31. Inner product between semi-definite matrices.
Let \(A,B \in \R^{N \times N}\) be positive semi-definite symmetric matrices. Then
\begin{equation*}
A_{ij}B_{ij} = \trace(A^\top B) \ge 0\text{.}
\end{equation*}
Proof.
Applying Theorem 2.28 to \(A\) and \(B\text{,}\) we can write \(A=U\Lambda U^\top\) and \(B=VMV^\top\text{,}\) where \(U,V\) are orthogonal and \(\Lambda = \operatorname{diag}(\lambda_1,\ldots,\lambda_N)\) and \(M =
\operatorname{diag}(\mu_1,\ldots,\mu_N)\) are diagonal matrices whose diagonal entries are the eigenvalues of \(A\) and \(B\text{.}\) In index notation, we calculate
\begin{equation*}
A_{ij}
= (U\Lambda U^\top)_{ij}
= U_{ik} \Lambda_{k\ell} U^\top_{\ell j}
= U_{ik} \lambda_k \delta_{k\ell} U_{ j\ell}
= \lambda_k U_{jk} U_{ik}
\end{equation*}
and similarly \(B_{ij} = \mu_\ell V_{j\ell} V_{i\ell}\text{.}\) Multiplying these formulas and grouping terms, we find
\begin{align*}
A_{ij} B_{ij}
\amp =
\lambda_k \mu_\ell U_{jk} V_{j\ell} U_{ik} V_{i\ell}\\
\amp =
\lambda_k \mu_\ell (U_{kj}^\top V_{j\ell}) (U_{ki}^\top V_{i\ell})\\
\amp =
\lambda_k \mu_\ell (U^\top V)_{k\ell} (U^\top V)_{k\ell}\\
\amp =
\lambda_k \mu_\ell \big((U^\top V)_{k\ell}\big)^2
\ge 0,
\end{align*}
where in the last step we have used the non-negativity of the eigenvalues \(\lambda_k\) and \(\mu_\ell\text{.}\)
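Note that the first equality in the statement of the lemma is just the index computation \(\trace(A^\top B) = (A^\top B)_{jj} = A^\top_{ji} B_{ij} = A_{ij} B_{ij}\text{.}\) As a quick example, taking
\begin{equation*}
A = \begin{pmatrix} 1 \amp 0 \\ 0 \amp 0 \end{pmatrix},
\qquad
B = \begin{pmatrix} 1 \amp 1 \\ 1 \amp 1 \end{pmatrix},
\end{equation*}
both of which are positive semi-definite, we find \(A_{ij}B_{ij} = 1 \ge 0\text{,}\) even though the product \(AB\) itself is not symmetric.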
Exercises
1. Lemma 2.31 for positive definite matrices.
(a)
Find two nonzero positive semi-definite matrices \(A,B \in \R^{2\times2}\) for which \(A_{ij}B_{ij} = 0\text{.}\)
(b)
If \(A,B \in \R^{N\times N}\) are positive definite symmetric matrices, show the strict inequality \(A_{ij}B_{ij} \gt 0\text{.}\)
Hint.
Following the proof of Lemma 2.31, we can write \(A_{ij}
B_{ij} = \lambda_k \mu_\ell (U^\top V)_{k\ell}^2 \ge 0\text{.}\) What would have to happen in order for the double sum on the right hand side to be exactly zero?
2. (PS3) Invariance of the Laplacian.
Let \(A \in \R^{N \times N}\) be an orthogonal matrix and \(u \in C^2(\R^N)\text{,}\) and define \(v \in C^2(\R^N)\) by
\begin{equation*}
v(x) = u(Ax).
\end{equation*}
By repeatedly using the chain rule, show that
\begin{equation*}
\Delta v(x) = \Delta u(Ax).
\end{equation*}
In other words, \(\Delta (u \circ A) = \Delta u \circ A\text{.}\) We say that the Laplacian \(\Delta\) is invariant under orthogonal transformations.
Hint 1.
Since \(A\) is orthogonal, \(AA^\top\) is the identity matrix, i.e. \(A_{ik} A_{jk} = \delta_{ij}\text{.}\)
Hint 2.
Recall Notation 2.14.
Solution.
As always, we use the convention in Notation 2.14. Repeatedly using both the chain rule and the second part of Exercise 2.2.1, we calculate
\begin{align*}
\Delta v
\amp =
\partial_i \partial_i [u(Ax)]\\
\amp =
\partial_i[ \partial_j u(Ax)\partial_i (Ax)_j ]\\
\amp =
\partial_i[ \partial_j u(Ax) A_{ji} ]\\
\amp =
\partial_{kj} u(Ax) A_{ji} \partial_i(Ax)_k\\
\amp =
\partial_{kj} u(Ax) A_{ji} A_{ki}\\
\amp =
\partial_{kj}u(Ax) \delta_{jk}\\
\amp =
\partial_{jj}u(Ax)\\
\amp =
\Delta u(Ax)\text{,}
\end{align*}
where towards the end we have used the identity
\begin{equation*}
A_{ji}A_{ki}
= (AA^\top)_{jk}
= \delta_{jk}\text{,}
\end{equation*}
which holds since \(A\) is orthogonal.
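As a concrete check, take \(N=2\text{,}\) \(u(x) = x_1^2\) and \(A\) the rotation matrix with \((Ax)_1 = x_1\cos\theta - x_2\sin\theta\text{.}\) Then \(v(x) = u(Ax) = (x_1\cos\theta - x_2\sin\theta)^2\) and
\begin{equation*}
\Delta v(x) = 2\cos^2\theta + 2\sin^2\theta = 2 = \Delta u(Ax)\text{,}
\end{equation*}
as predicted.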
3. The Laplacian and positive definiteness.
Let \(A,B \in \R^{N \times N}\) with \(A\) positive definite and \(B\) invertible, and set \(y=Bx\text{.}\)
(a)
By using the chain rule, show that \(\partial_{x_i} = B_{ki} \partial_{y_k}\) and \(\partial_{x_i} \partial_{x_j} = B_{ki} B_{\ell j} \partial_{y_k} \partial_{y_\ell}\text{.}\)
(b)
Use Theorem 2.28 to find an invertible matrix \(B\) such that \(BAB^\top = I\) is the identity matrix.
(c)
Deduce that, with this choice of \(B\text{,}\)
\begin{equation*}
A_{ij} \partial_{x_i} \partial_{x_j} = \partial_{y_k} \partial_{y_k} = \Delta_y\text{.}
\end{equation*}
In other words, in the \(y\) variables the operator \(A_{ij}
\partial_{ij}\) becomes the Laplacian.
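For instance, in the diagonal case \(A = \operatorname{diag}(4,1)\) we may take \(B = \operatorname{diag}(\tfrac12,1)\text{,}\) so that \(BAB^\top = I\) and, in the variables \(y = Bx = (\tfrac12 x_1, x_2)\text{,}\)
\begin{equation*}
A_{ij} \partial_{x_i} \partial_{x_j}
= 4\partial_{x_1}\partial_{x_1} + \partial_{x_2}\partial_{x_2}
= \partial_{y_1}\partial_{y_1} + \partial_{y_2}\partial_{y_2}
= \Delta_y\text{.}
\end{equation*}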