
4.7. Exercises

4.7.1. Warm-up worksheets

(with help from Claude, Gemini, and ChatGPT)

Section 4.2

E4.2.1 Compute the rank of the matrix \(A = \begin{pmatrix} 1 & 2 & 3\\ 0 & 1 & 1\\ 1 & 3 & 4 \end{pmatrix}\).

E4.2.2 Let \(A = \begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix}\) and \(B = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}\). Compute \(\mathrm{rk}(A)\), \(\mathrm{rk}(B)\), and \(\mathrm{rk}(A+B)\).

E4.2.3 Find the eigenvalues and corresponding eigenvectors of the matrix \(A = \begin{pmatrix} 3 & 1\\ 1 & 3 \end{pmatrix}\).

E4.2.4 Verify that the eigenvectors from E4.2.3 are orthogonal.

E4.2.5 Write the spectral decomposition of the matrix \(A\) from E4.2.3.

E4.2.6 Determine if the matrix \(A = \begin{pmatrix} 2 & -1\\ -1 & 2 \end{pmatrix}\) is positive definite, positive semidefinite, or neither.

E4.2.7 Compute the outer product \(\mathbf{u}\mathbf{v}^T\) for \(\mathbf{u} = (1, 2, 3)^T\) and \(\mathbf{v} = (4, 5)^T\).

E4.2.8 Write the matrix \(A = \begin{pmatrix} 1 & 2 & 3\\ 2 & 4 & 6\\ 3 & 6 & 9 \end{pmatrix}\) as a sum of rank-one matrices.

E4.2.9 Let \(A = \begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}\). What is the rank of \(A\)?

E4.2.10 Let \(u = \begin{pmatrix} 1 \\ 2 \end{pmatrix}\) and \(v = \begin{pmatrix} 3 \\ 4 \end{pmatrix}\). Compute the outer product \(uv^T\).

E4.2.11 Let \(A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}\). Find the eigenvalues and eigenvectors of \(A\).

E4.2.12 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}\). Verify that \(v = \begin{pmatrix} 1 \\ 1 \end{pmatrix}\) is an eigenvector of \(A\) and find the corresponding eigenvalue.

E4.2.13 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\). Find a basis for the column space of \(A\).

E4.2.14 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\). Find a basis for the null space of \(A\).

E4.2.15 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}\). Is \(A\) positive semidefinite?
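
If you want to sanity-check your hand computations for E4.2.1–E4.2.15, here is a minimal NumPy sketch (the matrices are the ones from the exercises; all function calls are standard NumPy):

```python
import numpy as np

# E4.2.1: rank of a 3x3 matrix
A = np.array([[1, 2, 3], [0, 1, 1], [1, 3, 4]])
print(np.linalg.matrix_rank(A))

# E4.2.3-E4.2.5: eigenvalues/eigenvectors and spectral decomposition
B = np.array([[3.0, 1.0], [1.0, 3.0]])
evals, evecs = np.linalg.eigh(B)            # eigh is for symmetric matrices
print(evals)                                # eigenvalues in increasing order
print(evecs)                                # columns are orthonormal eigenvectors
print(np.allclose(B, evecs @ np.diag(evals) @ evecs.T))   # B = Q diag(lambda) Q^T

# E4.2.6, E4.2.15: check definiteness from the signs of the eigenvalues
C = np.array([[2.0, -1.0], [-1.0, 2.0]])
print(np.linalg.eigvalsh(C))                # all > 0 => positive definite
```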

E4.2.16 Determine if the function \(f(x, y) = x^2 + y^2 + xy\) is convex.

E4.2.17 Is the function \(f(x, y) = x^2 - y^2\) convex?

E4.2.18 Check the convexity of the function \(f(x, y) = e^{x+y}\).

E4.2.19 Determine the convexity of the function \(f(x, y) = \log(x^2 + y^2 + 1)\).

E4.2.20 Is the function \(f(x, y) = xy\) convex?
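
For the convexity questions E4.2.16–E4.2.20, recall that a twice-differentiable function is convex if and only if its Hessian is positive semidefinite everywhere. The sketch below checks the eigenvalue signs of hand-computed Hessians for a few of these functions; for a non-quadratic function, a check at a single point is only an illustration, not a proof:

```python
import numpy as np

# Hessians computed by hand for some of the functions in E4.2.16-E4.2.20
hessians = {
    "x^2 + y^2 + xy (E4.2.16)": np.array([[2.0, 1.0], [1.0, 2.0]]),
    "x^2 - y^2 (E4.2.17)": np.array([[2.0, 0.0], [0.0, -2.0]]),
    "xy (E4.2.20)": np.array([[0.0, 1.0], [1.0, 0.0]]),
    # for e^{x+y} (E4.2.18), the Hessian at (x, y) is e^{x+y} [[1, 1], [1, 1]];
    # its eigenvalue signs are the same at every point
    "e^{x+y} at (0,0) (E4.2.18)": np.array([[1.0, 1.0], [1.0, 1.0]]),
}

for name, H in hessians.items():
    evals = np.linalg.eigvalsh(H)
    print(f"{name}: eigenvalues {evals}, PSD: {bool(np.all(evals >= -1e-12))}")
```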

Section 4.3

E4.3.1 Let \(\alpha_1 = (1, 2)\) and \(\alpha_2 = (-2, 1)\). Find a unit vector \(w_1\) in \(\mathbb{R}^2\) that maximizes \(\|Aw_1\|^2\), where \(A\) is the matrix with rows \(\alpha_1^T\) and \(\alpha_2^T\).

E4.3.2 Let \(u = (1/\sqrt{2}, 1/\sqrt{2})\) and \(v = (1/\sqrt{2}, -1/\sqrt{2})\). Compute the outer product \(uv^T\).

E4.3.3 Let \(A = \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}\). Find an SVD of \(A\).

E4.3.4 Let \(A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}\). Find a compact SVD of \(A\).

E4.3.5 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\). Find the eigenvalues and eigenvectors of \(A^TA\).

E4.3.6 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\) as in E4.3.5. Find the eigenvalues and eigenvectors of \(AA^T\).

E4.3.7 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\) as in E4.3.5. Find a compact SVD of \(A\).

E4.3.8 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\) as in E4.3.5. Find a full SVD of \(A\).

E4.3.9 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\) as in E4.3.5. Find an orthonormal basis for each of the four fundamental subspaces of \(A\).
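
A short NumPy check for the running example in E4.3.5–E4.3.9 (the exercises ask for hand derivations; this only verifies the numbers):

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 4.0]])

# E4.3.5-E4.3.6: eigen-decompositions of A^T A and A A^T
print(np.linalg.eigh(A.T @ A))
print(np.linalg.eigh(A @ A.T))

# E4.3.7-E4.3.8: SVD (full by default); keeping only the nonzero
# singular values gives a compact SVD
U, S, Vt = np.linalg.svd(A)
print(U, S, Vt, sep="\n")

r = int(np.sum(S > 1e-12))                  # rank (here r = 1)
A_compact = U[:, :r] @ np.diag(S[:r]) @ Vt[:r, :]
print(np.allclose(A, A_compact))
```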

E4.3.10 Given the data points \(\alpha_1 = (1,2)\), \(\alpha_2 = (2,1)\), \(\alpha_3 = (-1,1)\), and \(\alpha_4 = (3,-1)\), form the data matrix \(A\) whose rows are \(\alpha_1^T, \ldots, \alpha_4^T\), and compute the matrices \(A^TA\) and \(AA^T\).

E4.3.11 Find the eigenvalues and eigenvectors of the matrix \(A^TA\) from E4.3.10.

E4.3.12 Compute a compact SVD of the matrix \(A\) from E4.3.10.

E4.3.13 Compute the matrix \(U_1 \Sigma_1 V_1^T\) from E4.3.12 and verify that it is equal to \(A\) from E4.3.10.

E4.3.14 Based on E4.3.10 and E4.3.12, verify the relations \(A^T u_i = \sigma_i v_i\) for \(i=1,2\).

E4.3.15 Based on E4.3.10 and E4.3.12, verify the relations \(A^TAv_i = \sigma_i^2 v_i\) and \(AA^Tu_i = \sigma_i^2 u_i\) for \(i=1,2\).

E4.3.16 Based on E4.3.10 and E4.3.12, find the best approximating subspace of dimension \(k=2\) and compute the sum of squared distances to this subspace.

E4.3.17 Based on E4.3.10 and E4.3.12, find the best approximating subspace of dimension \(k=1\) and compute the sum of squared distances to this subspace.

E4.3.18 Based on E4.3.17, compute the matrix \(Z\) obtained from the truncated SVD with \(k=1\).
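
The computations in E4.3.10–E4.3.18 can be checked with a few lines of NumPy (here, as in E4.3.10, the data matrix has the points \(\alpha_i\) as its rows):

```python
import numpy as np

# E4.3.10: data matrix with the points alpha_i as rows
A = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, 1.0], [3.0, -1.0]])
print(A.T @ A)
print(A @ A.T)

# E4.3.12: compact SVD (A has rank 2 here)
U, S, Vt = np.linalg.svd(A, full_matrices=False)
print(S)

# E4.3.17-E4.3.18: rank-1 truncated SVD Z and the sum of squared
# distances to the best 1-dimensional subspace, which equals sigma_2^2
Z = S[0] * np.outer(U[:, 0], Vt[0, :])
print(Z)
print(np.sum((A - Z) ** 2), S[1] ** 2)      # these two agree
```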

Section 4.4

E4.4.1 Let \(A = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}\). Compute \(A^2\), \(A^3\), and \(A^4\). What pattern do you observe?

E4.4.2 Let \(B = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}\). Compute \(B^2\) and \(B^3\). If you were to continue computing higher powers of \(B\), what would you expect to happen to the entries?

E4.4.3 Given the symmetric matrix \(A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}\) and the vector \(x = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\), compute \(A^k x\) for \(k = 1, 2, 3\) and \(\frac{A^k x}{\|A^k x\|}\).

E4.4.4 Given the symmetric matrix \(A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}\) and the vector \(x = \begin{pmatrix} 0 \\ 1 \end{pmatrix}\), compute \(A^k x\) for \(k = 1, 2, 3\) and \(\frac{A^k x}{\|A^k x\|}\).

E4.4.5 Given the matrix \(A = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}\) and the vector \(x = \begin{pmatrix} 1 \\ 1 \end{pmatrix}\), compute \(A^k x\) for \(k = 1, 2, 3\).

E4.4.6 Let \(A = \begin{pmatrix} 4 & 1 \\ 1 & 4 \end{pmatrix}\). Find the eigenvalues and eigenvectors of \(A\).

E4.4.7 Let \(A\) be as in E4.4.6. Compute \(A^2\) and \(A^3\) using the eigendecomposition of \(A\).

E4.4.8 Let \(A\) be as in E4.4.6. Let \(x = \begin{pmatrix} 2 \\ 1 \end{pmatrix}\). Compute \(A^2 x\) and \(A^3 x\). What do you notice about the direction of these vectors as the power increases?
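
A tiny power-iteration sketch to check the direction computations in E4.4.3–E4.4.8 (swap in whichever matrix and starting vector the exercise uses):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])       # the matrix from E4.4.3
x = np.array([1.0, 0.0])

for k in range(1, 6):
    x = A @ x                                # x is now A^k times the start vector
    print(k, x / np.linalg.norm(x))          # approaches the top eigenvector (1,1)/sqrt(2)
```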

E4.4.9 Let \(A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}\). Compute \(A^TA\). Determine whether \(A^TA\) is positive semidefinite by computing its eigenvalues.

Section 4.5

E4.5.1 Given a loading vector \(\phi_1 = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)\), verify that it satisfies the unit norm constraint.

E4.5.2 Given two loading vectors \(\phi_1 = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)\) and \(\phi_2 = \left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)\), verify that they are orthogonal.

E4.5.3 Given a centered data matrix \(\tilde{X} = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}\) and a loading vector \(\phi_1 = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)\), compute the first principal component scores \(t_{i1}\).

E4.5.4 Given a centered data matrix \(\tilde{X} = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}\) and two loading vectors \(\phi_1 = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)\) and \(\phi_2 = \left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)\), compute the first and second principal component scores \(t_{i1}\) and \(t_{i2}\).

E4.5.5 Given the first and second principal component scores from E4.5.4, verify that they are uncorrelated.

E4.5.6 Given a data matrix \(X = \begin{pmatrix} 1 & 3 \\ -1 & -3\end{pmatrix}\), compute the centered data matrix \(\tilde{X}\).

E4.5.7 Given the data matrix

\[\begin{split} X = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}, \end{split}\]

compute the mean-centered data matrix.

E4.5.8 Given the SVD of the centered data matrix from E4.5.7, extract the first principal component loading vector \(\phi_1\).

E4.5.9 Given the centered data matrix \(\tilde{X}\) and the first principal component loading vector \(\phi_1\) from E4.5.7 and E4.5.8, compute the first principal component scores \(t_{i1}\).

E4.5.10 Given a data matrix \(X\) with centered columns, show that the sample covariance matrix of \(X\) can be expressed as \(\frac{1}{n-1} X^T X\).

E4.5.11 Given the loading vector \(\phi_1 = (0.8, 0.6)\) for the first principal component, find a loading vector \(\phi_2\) for the second principal component.

E4.5.12 Given the data points \(\mathbf{x}_1 = (1, 0)\), \(\mathbf{x}_2 = (0, 1)\), and \(\mathbf{x}_3 = (-1, 0)\), compute the first principal component vector.
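
A minimal PCA sketch covering the steps asked for in E4.5.3–E4.5.12: center the data, take an SVD, and read off loadings and scores (the signs of singular vectors are arbitrary, so a hand answer may differ by a sign):

```python
import numpy as np

# E4.5.7-E4.5.9: center, SVD, first loading vector, first scores
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
Xc = X - X.mean(axis=0)                      # mean-centered data matrix

U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
phi1 = Vt[0, :]                              # first principal component loading
t1 = Xc @ phi1                               # first principal component scores
print(phi1)
print(t1)

# E4.5.5-style check: scores along distinct loadings have zero dot product
# (here the second singular value is 0, so t2 is identically zero)
t2 = Xc @ Vt[1, :]
print(np.isclose(np.dot(t1, t2), 0.0))
```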

Section 4.6

E4.6.1 Compute the Frobenius norm of the matrix \(A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}\).

E4.6.2 Compute the SVD of \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\).

E4.6.3 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\). Compute the Frobenius norm \(\|A\|_F\).

E4.6.4 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\). Compute the induced 2-norm \(\|A\|_2\).

E4.6.5 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\) and \(B = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}\). Compute \(\|A - B\|_F\).

E4.6.6 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\) and let \(A_1 = \sigma_1 \mathbf{u}_1 \mathbf{v}_1^T\) be the rank-1 truncated SVD of \(A\). Compute \(\|A - A_1\|_F\).

E4.6.7 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\) and let \(A_1 = \sigma_1 \mathbf{u}_1 \mathbf{v}_1^T\) be the rank-1 truncated SVD of \(A\). Compute \(\|A - A_1\|_2\).
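
For E4.6.1–E4.6.7, the norms and the rank-1 truncation can be verified numerically; a sketch for the running \(2 \times 2\) example:

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 4.0]])
B = np.array([[1.0, 1.0], [1.0, 1.0]])

print(np.linalg.norm(A, "fro"))              # Frobenius norm (E4.6.3)
print(np.linalg.norm(A, 2))                  # induced 2-norm = largest singular value (E4.6.4)
print(np.linalg.norm(A - B, "fro"))          # E4.6.5

# E4.6.6-E4.6.7: rank-1 truncated SVD of A (here A already has rank 1)
U, S, Vt = np.linalg.svd(A)
A1 = S[0] * np.outer(U[:, 0], Vt[0, :])
print(np.linalg.norm(A - A1, "fro"), np.linalg.norm(A - A1, 2))
```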

E4.6.8 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\), \(\mathbf{b} = \begin{pmatrix} 5 \\ 10 \end{pmatrix}\), and \(\lambda = 1\). Compute the ridge regression solution \(\mathbf{x}^{**}\) to \(\min_{\mathbf{x} \in \mathbb{R}^2} \|A \mathbf{x} - \mathbf{b}\|_2^2 + \lambda \|\mathbf{x}\|_2^2\).

E4.6.9 Let \(A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}\). Explain why \(A^{-1}\) does not exist, and compute \(\|A^+\|_2\) instead, where \(A^+ = V \Sigma^{-1} U^T\) is the pseudoinverse obtained from a compact SVD \(A = U \Sigma V^T\).

E4.6.10 Compute the ridge regression solution for \(A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}\), \(b = \begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix}\), and \(\lambda = 1\).
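
For the ridge regression warm-ups E4.6.8 and E4.6.10, one way to check a hand computation is the closed-form solution \(\mathbf{x} = (A^T A + \lambda I)^{-1} A^T \mathbf{b}\), obtained by setting the gradient of the objective to zero; a sketch:

```python
import numpy as np

def ridge(A, b, lam):
    """Closed-form ridge solution (A^T A + lam I)^{-1} A^T b."""
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ b)

# E4.6.8
A = np.array([[1.0, 2.0], [2.0, 4.0]])
b = np.array([5.0, 10.0])
print(ridge(A, b, 1.0))

# E4.6.10
A2 = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b2 = np.array([1.0, 2.0, 2.0])
print(ridge(A2, b2, 1.0))
```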

4.7.2. Problems

4.1 Let \(Q \in \mathbb{R}^{n \times n}\) be an orthogonal matrix. Use the SVD via Spectral Decomposition Theorem to compute an SVD of \(Q\). Is there a difference between the compact and full SVD in this case? \(\lhd\)

4.2 Let \(W \subseteq \mathbb{R}^m\) be a hyperplane. Show that there exists a unit vector \(\mathbf{z} \in \mathbb{R}^m\) such that

\[ \mathbf{w} \in W \iff \langle \mathbf{z}, \mathbf{w} \rangle = 0. \]

\(\lhd\)

4.3 Construct a matrix \(A \in \mathbb{R}^{n \times n}\) for which there exist multiple solutions to the maximization problem

\[ \mathbf{v}_1 \in \arg\max\{\|A \mathbf{v}\|:\|\mathbf{v}\| = 1\}. \]

\(\lhd\)

4.4 Prove an analogue of the Characterization of Positive Semidefiniteness for positive definite matrices. \(\lhd\)

4.5 (Adapted from [Sol]) Show that \(\|A\|_2 = \|\Sigma\|_2\), where \(A = U \Sigma V^T\) is a singular value decomposition of \(A\). \(\lhd\)

4.6 Let \(A\), \(U \Sigma V^T\), \(B\) be as in the Power Iteration Lemma. Assume further that \(\sigma_2 > \sigma_3\) and that \(\mathbf{y} \in \mathbb{R}^m\) satisfies both \(\langle \mathbf{v}_1, \mathbf{y} \rangle = 0\) and \(\langle \mathbf{v}_2, \mathbf{y} \rangle > 0\). Show that \(\frac{B^{k} \mathbf{y}}{\|B^{k} \mathbf{y}\|} \to \mathbf{v}_2\) as \(k \to +\infty\). How would you find such a \(\mathbf{y}\)? \(\lhd\)
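
A numerical illustration of 4.6 (not a proof): below, \(B\) is taken to be \(A^T A\), as assumed in the Power Iteration Lemma, and a suitable \(\mathbf{y}\) is built by projecting a random vector away from \(\mathbf{v}_1\); the re-projection inside the loop guards against floating-point error re-introducing a \(\mathbf{v}_1\) component.

```python
import numpy as np

rng = np.random.default_rng(0)

# a 3x3 matrix with prescribed singular values 3 > 2 > 1
U0, _ = np.linalg.qr(rng.standard_normal((3, 3)))
V0, _ = np.linalg.qr(rng.standard_normal((3, 3)))
A = U0 @ np.diag([3.0, 2.0, 1.0]) @ V0.T

_, _, Vt = np.linalg.svd(A)
v1, v2 = Vt[0, :], Vt[1, :]
B = A.T @ A                                  # assumed form of B in the lemma

# one way to find y: project a random vector away from v1
y = rng.standard_normal(3)
y -= np.dot(y, v1) * v1                      # now <v1, y> = 0
if np.dot(y, v2) < 0:
    y = -y                                   # now <v2, y> > 0

for _ in range(50):
    y = B @ y
    y -= np.dot(y, v1) * v1                  # keep y orthogonal to v1 numerically
    y /= np.linalg.norm(y)

print(np.allclose(y, v2, atol=1e-8))         # the iterates converge to v2
```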

4.7 Let \(A \in \mathbb{R}^{n \times m}\). Use the Cauchy-Schwarz Inequality to show that

\[ \|A\|_2 = \max \left\{ \mathbf{x}^T A \mathbf{y} \,:\, \|\mathbf{x}\| = \|\mathbf{y}\| = 1 \right\}. \]

\(\lhd\)

4.8 Let \(A \in \mathbb{R}^{n \times m}\).

a) Show that \(\|A\|_F^2 = \sum_{j=1}^m \|A \mathbf{e}_j\|^2\).

b) Use (a) and the Cauchy-Schwarz Inequality to show that \(\|A\|_2 \leq \|A\|_F\).

c) Give an example such that \(\|A\|_F = \sqrt{n} \|A\|_2\). \(\lhd\)

4.9 Use the Cauchy-Schwarz Inequality to show that for any \(A, B\) for which \(AB\) is well-defined it holds that

\[ \|A B \|_F \leq \|A\|_F \|B\|_F. \]

\(\lhd\)

4.10 Let \(A \in \mathbb{R}^{n \times n}\) be nonsingular with SVD \(A = U \Sigma V^T\) where the singular values satisfy \(\sigma_1 \geq \cdots \geq \sigma_n > 0\). Show that

\[ \min_{\mathbf{x} \neq \mathbf{0}} \frac{\|A \mathbf{x}\|}{\|\mathbf{x}\|} = \min_{\mathbf{y} \neq \mathbf{0}} \frac{\|\mathbf{y}\|}{\|A^{-1}\mathbf{y}\|} = \sigma_n = 1/\|A^+\|_2. \]

\(\lhd\)

4.11 Let \(X \in \mathbb{R}^{n \times d}\) be a matrix with rows \(\mathbf{x}_1^T,\ldots,\mathbf{x}_n^T\). Write the following sum in matrix form in terms of \(X\):

\[ \frac{1}{n} \sum_{i=1}^n \mathbf{x}_i \mathbf{x}_i^T. \]

Justify your answer. \(\lhd\)

4.12 Let \(A \in \mathbb{R}^{d \times d}\) be a symmetric matrix. Show that \(A \preceq M I_{d \times d}\) if and only if the eigenvalues of \(A\) are at most \(M\). Similarly, \(m I_{d \times d} \preceq A\) if and only if the eigenvalues of \(A\) are at least \(m\). [Hint: Observe that the eigenvectors of \(A\) are also eigenvectors of the identity matrix \(I_{d \times d}\).] \(\lhd\)

4.13 Prove the Power Iteration in the Positive Semidefinite Case Lemma. \(\lhd\)

4.14 Recall that the trace \(\mathrm{tr}(A)\) of a matrix \(A = (a_{i,j})_{i,j} \in \mathbb{R}^{n \times n}\) is the sum of its diagonal entries, i.e., \(\mathrm{tr}(A) = \sum_{i=1}^n a_{i,i}\).

a) Show that, for any \(A \in \mathbb{R}^{n \times m}\) and \(B \in \mathbb{R}^{m \times n}\), it holds that \(\mathrm{tr}(AB) = \mathrm{tr}(BA)\).

b) Use a) to show that, if a symmetric matrix \(A\) has spectral decomposition \(\sum_{i=1}^n \lambda_i \mathbf{q}_i \mathbf{q}_i^T\), then

\[ \mathrm{tr}(A) = \sum_{i=1}^n \lambda_i. \]

\(\lhd\)

4.15 (Adapted from [Sol]) Suppose \(A \in \mathbb{R}^{m \times n}\) and \(B \in \mathbb{R}^{n \times m}\). Show that \(\|A\|^2_F = \mathrm{tr}(A^T A)\) and \(\mathrm{tr}(A B) = \mathrm{tr}(B A)\), where the trace of a square matrix \(C\), denoted \(\mathrm{tr}(C)\), is the sum of the diagonal elements of \(C\). \(\lhd\)

4.16 (Adapted from [Str]) Let \(A \in \mathbb{R}^{n \times k}\) and \(B \in \mathbb{R}^{k \times m}\). Show that the null space of \(AB\) contains the null space of \(B\). \(\lhd\)

4.17 (Adapted from [Str]) Let \(A \in \mathbb{R}^{n \times m}\). Show that \(A^T A\) has the same null space as \(A\). \(\lhd\)

4.18 (Adapted from [Str]) Let \(A \in \mathbb{R}^{m \times r}\) and \(B \in \mathbb{R}^{r \times n}\), both of rank \(r\).

a) Show that \(A^T A\), \(B B^T\) and \(A^T A B B^T\) are all invertible.

b) Show that \(r = \mathrm{rk}(A^T A B B^T) \leq \mathrm{rk}(AB)\).

c) Conclude that \(\mathrm{rk}(AB) = r\).

\(\lhd\)

4.19 For any \(n \geq 1\), give an example of a matrix \(A \in \mathbb{R}^{n \times n}\) with \(A \neq I_{n \times n}\) such that \(A^2 = I_{n \times n}\). \(\lhd\)

4.20 Let \(A \in \mathbb{R}^{n \times k}\) and \(B \in \mathbb{R}^{r \times m}\), and let \(D \in \mathbb{R}^{k \times r}\) be a diagonal matrix. Assume \(k < r\). Compute the columns of \(AD\) and the rows of \(DB\). \(\lhd\)

4.21 Compute an SVD of \(A = [-1]\). \(\lhd\)

4.22 We show in this exercise that a matrix can have many distinct SVDs. Let \(Q, W \in \mathbb{R}^{n \times n}\) be orthogonal matrices. Give four different SVDs for \(Q\), one where \(V = I\), one where \(V = Q\), one where \(V = Q^T\) and one where \(V = W\). Make sure to check all requirements of the SVD. Do the singular values change? \(\lhd\)

4.23 (Adapted from [Sol]) Show that adding a row to a matrix cannot decrease its largest singular value. \(\lhd\)

4.24 Let \(\mathbf{v}_1,\ldots,\mathbf{v}_n \in \mathbb{R}^n\) be an orthonormal basis. Fix \(1 \leq k < n\). Let \(Q_1\) be the matrix with columns \(\mathbf{v}_1,\ldots,\mathbf{v}_k\) and let \(Q_2\) be the matrix with columns \(\mathbf{v}_{k+1},\ldots,\mathbf{v}_n\). Show that

\[ Q_1 Q_1^T + Q_2 Q_2^T = I_{n \times n}. \]

[Hint: Multiply both sides by \(\mathbf{e}_i\).] \(\lhd\)

4.25

4.26 Let \(Q \in \mathbb{R}^{n \times n}\) be an orthogonal matrix. Compute its condition number

\[ \kappa_2(Q) = \|Q\|_2 \|Q^{-1}\|_2. \]

\(\lhd\)

4.27 Prove the SVD Relations Lemma. \(\lhd\)

4.28 Let \(A = \sum_{j=1}^r \sigma_j \mathbf{u}_j \mathbf{v}_j^T\) be an SVD of \(A \in \mathbb{R}^{n \times m}\) with \(\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0\).

Define

\[ B = A - \sigma_1 \mathbf{u}_1 \mathbf{v}_1^T. \]

Show that

\[ \mathbf{v}_2 \in \arg\max\{\|B \mathbf{v}\|:\|\mathbf{v}\| = 1\}. \]

\(\lhd\)

4.29 (Adapted from [Sol]) The stable rank of \(A \in \mathbb{R}^{n \times n}\) is defined as

\[ \mathrm{StableRk}(A) = \frac{\|A\|_F^2}{\|A\|_2^2}. \]

a) Show that, if all columns of \(A\) are the same nonzero vector \(\mathbf{v} \in \mathbb{R}^n\), then \(\mathrm{StableRk}(A) = 1\).

b) Show that, when the columns of \(A\) are orthonormal, then \(\mathrm{StableRk}(A) = n\).

c) Is \(\mathrm{StableRk}(A)\) always an integer? Justify your answer.

d) More generally, show that

\[ 1 \leq \mathrm{StableRk}(A) \leq n. \]

e) Show that, in general, it holds that

\[ \mathrm{StableRk}(A) \leq \mathrm{rk}(A). \]

[Hint: Use the Matrix Norms and Singular Values Lemma.] \(\lhd\)
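
A quick numerical sanity check of parts (a), (b), and (e) of 4.29 (an illustration, not a proof):

```python
import numpy as np

def stable_rank(A):
    """StableRk(A) = ||A||_F^2 / ||A||_2^2."""
    return np.linalg.norm(A, "fro") ** 2 / np.linalg.norm(A, 2) ** 2

# (a) all columns equal to the same nonzero vector
v = np.array([1.0, 2.0, 3.0])
A = np.column_stack([v, v, v])
print(stable_rank(A))                        # 1.0

# (b) orthonormal columns (an orthogonal matrix)
Q, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((3, 3)))
print(stable_rank(Q))                        # 3.0

# (e) the stable rank is at most the rank
B = np.array([[1.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]])
print(stable_rank(B), np.linalg.matrix_rank(B))
```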

4.30 Let \(A \in \mathbb{R}^{n \times n}\) be a square matrix with full SVD \(A = U \Sigma V^T\).

a) Justify the following formula

\[ A = (U V^T) (V \Sigma V^T). \]

b) Let

\[ Q = U V^T, \qquad S = V \Sigma V^T. \]

Show that \(Q\) is orthogonal and that \(S\) is positive semidefinite. A factorization of the form \(A = Q S\) is called a polar decomposition. \(\lhd\)
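
A quick check of the polar decomposition construction in 4.30 on a random square matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

U, S, Vt = np.linalg.svd(A)                  # full SVD of a square matrix
Q = U @ Vt                                   # orthogonal factor
Ssym = Vt.T @ np.diag(S) @ Vt                # positive semidefinite factor

print(np.allclose(A, Q @ Ssym))              # A = Q S
print(np.allclose(Q.T @ Q, np.eye(4)))       # Q is orthogonal
print(np.all(np.linalg.eigvalsh(Ssym) >= -1e-12))   # S is PSD
```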

4.31 Let \(A \in \mathbb{R}^{n \times m}\) be a matrix with full SVD \(A = U \Sigma V^T\) where the singular values satisfy \(\sigma_1 \geq \cdots \geq \sigma_m \geq 0\). Define the unit sphere in \(m\) dimensions \(\mathbb{S}^{m-1} = \{\mathbf{x} \in \mathbb{R}^m\,:\,\|\mathbf{x}\| = 1\}\). Show that

\[ \sigma_m^2 = \min_{\mathbf{x} \in \mathbb{S}^{m-1}} \|A \mathbf{x}\|^2. \]

\(\lhd\)

4.32 Let \(A \in \mathbb{R}^{n \times m}\) be a matrix with columns \(\mathbf{a}_1,\ldots,\mathbf{a}_m\). Show that for all \(i\)

\[ \|\mathbf{a}_i\|_2 \leq \|A\|_2. \]

\(\lhd\)

4.33 Let \((U_i, V_i)\), \(i = 1, \ldots, n\), be i.i.d. random vectors taking values in \(\mathbb{R}^2\). Assume that \(\mathbb{E}[U_1^2], \mathbb{E}[V_1^2] < +\infty\).

a) Show that

\[ \mathbb{E}\left[\frac{1}{n} \sum_{i=1}^n U_i \right] = \mathbb{E}[U_1]. \]

b) Show that

\[ \mathbb{E}\left[\frac{1}{n-1} \sum_{i=1}^n \left(U_i - \frac{1}{n} \sum_{j=1}^n U_j\right)^2\right] = \mathrm{Var}[U_1]. \]

c) Show that

\[ \mathbb{E}\left[\frac{1}{n-1} \sum_{i=1}^n \left(U_i - \frac{1}{n} \sum_{j=1}^n U_j\right) \left(V_i - \frac{1}{n} \sum_{k=1}^n V_k\right) \right] = \mathrm{Cov}[U_1, V_1]. \]

In words, the left-hand side of each equation above is a so-called unbiased estimator of the right-hand side. Those estimators are called respectively the sample mean, the sample variance and the sample covariance. \(\lhd\)

4.34 Give an equivalent matrix representation of the best \(k\)-dimensional approximating subspace in terms of the data matrix \(A \in \mathbb{R}^{n \times m}\) and a matrix \(W\) whose columns are orthonormal. [Hint: Use the Frobenius norm.] \(\lhd\)

4.35 Let \(\mathbf{q}_1,\ldots,\mathbf{q}_m\) be an orthonormal list in \(\mathbb{R}^n\) and consider the matrix

\[ M = \sum_{i=1}^m \mathbf{q}_i \mathbf{q}_i^T. \]

a) Suppose \(m = n\). Show that \(M = I_{n \times n}\) by computing \(\mathbf{e}_i^T M \mathbf{e}_j\) for all \(i,j\).

b) Suppose \(m \leq n\). What is the geometric interpretation of \(M\)? Based on your answer, give a second proof of (a).

\(\lhd\)

4.36 Let \(A \in \mathbb{R}^{n \times m}\) be a matrix with compact SVD \(A = U \Sigma V^T\). Let \(A^+ = V \Sigma^{-1} U^T\) (known as a pseudoinverse). Show that \(A A^+ A = A\) and \(A^+ A A^+ = A^+\). \(\lhd\)
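
A numerical check of the identities in 4.36 using `np.linalg.pinv`, which should agree with the \(V \Sigma^{-1} U^T\) construction from the compact SVD:

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])   # a rank-1 example
A_plus = np.linalg.pinv(A)                           # Moore-Penrose pseudoinverse

print(np.allclose(A @ A_plus @ A, A))                # A A^+ A = A
print(np.allclose(A_plus @ A @ A_plus, A_plus))      # A^+ A A^+ = A^+

# the same matrix built directly from the compact SVD
U, S, Vt = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(S > 1e-12))
A_plus2 = Vt[:r, :].T @ np.diag(1.0 / S[:r]) @ U[:, :r].T
print(np.allclose(A_plus, A_plus2))
```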

4.37 Let \(A \in \mathbb{R}^{n \times n}\) be a symmetric matrix with orthonormal eigenvectors \(\mathbf{q}_1,\ldots,\mathbf{q}_n\) and corresponding eigenvalues \(\lambda_1 \geq \cdots \geq \lambda_n\).

a) Compute an eigenvector decomposition of \(A^T A\).

b) Use (a) to compute \(\|A\|_2\) in terms of two of the eigenvalues of \(A\). \(\lhd\)