Problem
Note: I was brushing up on my SVD using the brilliant “Mathematics for Machine Learning” book, but the exercises listed in the book for SVD were a bit basic, so I decided to try using ChatGPT to generate a question. The problem below is what came out. It is quite impressive, although there were quite a few errors in the question that I had to fix (e.g. the dimensionalities of the matrices).
Let \( A \) be an \( m \times n \) matrix with \( m \geq n \) and rank \( r \), and let its singular value decomposition (SVD) be given by \( A = U \Sigma V^T \), where \( U \) is an \( m \times m \) orthogonal matrix, \( \Sigma \) is an \( m \times n \) diagonal matrix with non-negative entries \( \sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_n \geq 0 \) (the first \( r \) of which are positive), and \( V \) is an \( n \times n \) orthogonal matrix. Show that the Frobenius norm of \( A \) is equal to the square root of the sum of the squares of the singular values.
Source: ChatGPT, Mathematics for Machine Learning by Deisenroth, Faisal and Ong
Solution
The squared Frobenius norm is given by
$$ ||A||^2_F = \sum_{i,j}A^2_{i,j}, $$which we can equivalently write (it is not so hard to show this, since \( [A^\top A]_{jj} = \sum_i A^2_{i,j} \)) as
$$ ||A||^2_F = \operatorname{tr}(A^\top A). $$Subbing in the SVD representation of \( A \), we get \begin{align} \operatorname{tr}(A^\top A) &= \operatorname{tr}\left( (U\Sigma V^\top)^\top(U\Sigma V^\top)\right)\\ &= \operatorname{tr}\left(V\Sigma^\top U^\top U\Sigma V^\top\right) \\ &= \operatorname{tr}\left(V\Sigma^\top \Sigma V^\top\right), \end{align} where the last step uses \( U^\top U = I \). We can see by inspection that \( \Sigma^\top\Sigma = \operatorname{diag}(\sigma^2_1, \dots, \sigma^2_n)=\Lambda \).
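Before continuing, here is a quick numerical sanity check of the two observations above (a minimal sketch using NumPy; the \( 5 \times 3 \) random matrix is an arbitrary choice): it verifies that \( \Sigma^\top\Sigma = \operatorname{diag}(\sigma^2_1, \dots, \sigma^2_n) \) and that the \( U \) factors drop out of the trace.

```python
import numpy as np

# Numerical check of the steps above on an arbitrary 5x3 example.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

U, s, Vt = np.linalg.svd(A, full_matrices=True)  # U: 5x5, s: (sigma_1, ..., sigma_3), Vt: 3x3
Sigma = np.zeros((5, 3))
Sigma[:3, :3] = np.diag(s)                       # Sigma is m x n with sigma_i on its diagonal

Lam = Sigma.T @ Sigma                            # Sigma^T Sigma = diag(sigma_1^2, ..., sigma_n^2)
print(np.allclose(Lam, np.diag(s**2)))           # True

# U^T U = I, so tr(A^T A) = tr(V Lambda V^T)
print(np.isclose(np.trace(A.T @ A), np.trace(Vt.T @ Lam @ Vt)))  # True
```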
Since \( V\Lambda V^\top = \sum_i \sigma_i^2\,\mathbf{v}_i\mathbf{v}^\top_i \), where \( \mathbf{v}_i \) is column \( i \) of \( V \), linearity and the cyclic property of the trace give
$$ \operatorname{tr}\left(V\Lambda V^\top\right) = \sum_i \sigma_i^2\operatorname{tr}\left(\mathbf{v}_i\mathbf{v}^\top_i\right) = \sum_i \sigma_i^2\,\mathbf{v}^\top_i\mathbf{v}_i = \sum_i \sigma_i^2, $$where the last equality follows from the orthonormality of \( V \). Putting it all together we get that
$$ ||A||^2_F = \operatorname{tr}(A^\top A) = \sum_i \sigma_i^2, $$and taking the square root gives \( ||A||_F = \sqrt{\sum_i \sigma_i^2} \), as required.
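As a final sanity check, the short NumPy sketch below (the matrix is an arbitrary random example) compares the Frobenius norm computed directly from the entries of \( A \) with the square root of the sum of the squared singular values.

```python
import numpy as np

# Check ||A||_F = sqrt(sum_i sigma_i^2) on an arbitrary random matrix.
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))

s = np.linalg.svd(A, compute_uv=False)          # singular values sigma_1 >= ... >= sigma_n >= 0
frob = np.linalg.norm(A, 'fro')                 # Frobenius norm, computed from the entries of A

print(frob, np.sqrt(np.sum(s**2)))              # the two values agree to machine precision
print(np.isclose(frob, np.sqrt(np.sum(s**2))))  # True
```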