Image Compression

Singular Value Decomposition

The Singular Value Decomposition (SVD) states that for every matrix $A$ of size $m\times n$ there exist two sets of orthonormal vectors, $\{v_1, v_2,\cdots, v_n\}$ in $\mathbb{R}^{n}$ and $\{u_1, u_2, \cdots, u_m\}$ in $\mathbb{R}^m$, and real numbers $\sigma_1\geq\sigma_2\geq\cdots\geq \sigma_r>0$ such that


$$Av_i = \sigma_i u_i,\quad \forall i=1, \cdots, r,$$

where $r$ is the rank of the matrix $A$, and


$$Av_j = 0\quad \forall j=r+1,\cdots, n.$$

In matrix notation:


$$A \left[\begin{array}{ccccc} &&&&\\ &&&&\\ \boldsymbol{v_1} & \hspace{-0.2cm}\cdots &\hspace{-0.2cm}\boldsymbol{v_r} & \hspace{-0.2cm}\cdots & \hspace{-0.2cm}\boldsymbol{v_n}\\ &&&&\\ &&&& \end{array}\right]=\left[\begin{array}{ccccc} &&&&\\ &&&&\\ \boldsymbol{u_1} & \hspace{-0.2cm}\cdots &\hspace{-0.2cm}\boldsymbol{u_r} & \hspace{-0.2cm}\cdots & \hspace{-0.2cm}\boldsymbol{u_m}\\ &&&&\\ &&&& \end{array}\right]\left[\begin{array}{ccc|c} \sigma_1 & & & \\ & \ddots & & 0\\ & & \sigma_r & \\ \hline & 0 & & 0 \end{array}\right]$$

That is,


$$AV = U\Sigma$$

As the vectors in the set $\{v_1, v_2,\cdots, v_n\}$ are orthonormal, the matrix $V$ with columns $v_i$ is orthogonal, that is, $VV^{T} = I_n$. Using this and multiplying on the right by $V^T$ on both sides of the expression, we get

$$A = U\Sigma V^T$$

This is the SVD of the matrix $A$.
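
As a quick numerical check, here is a minimal sketch using NumPy (the matrix $A$ below is just an arbitrary example) that computes the factorization and verifies $A = U\Sigma V^T$:

```python
import numpy as np

# An arbitrary 4x3 example matrix.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0],
              [10.0, 11.0, 12.0]])

# Full SVD: U is 4x4, Vt is V^T (3x3), s holds the singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the m x n matrix Sigma with the singular values on its diagonal.
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

print(np.allclose(A, U @ Sigma @ Vt))  # True: A = U Sigma V^T
```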

Image compression and the Eckart-Young-Mirsky Theorem

Given a matrix $A$ of size $m\times n$ and rank $r$, we know that $Av_j=0$ for every $j>r$, so there are parts of $AV = U\Sigma$ that don't contribute to the product. The important part lies in the first $r$ singular vectors and singular values, so keeping only those we get


$$A \left[\begin{array}{ccc} &&\\ \boldsymbol{v_1} & \hspace{-0.2cm}\cdots & \hspace{-0.2cm}\boldsymbol{v_r}\\ && \end{array}\right]=\left[\begin{array}{ccc} &&\\ \boldsymbol{u_1} & \hspace{-0.2cm}\cdots & \hspace{-0.2cm}\boldsymbol{u_r}\\ && \end{array}\right]\left[\begin{array}{ccc} \sigma_1 & & \\ & \ddots & \\ & & \sigma_r \end{array}\right]$$

Although $V_r$ is no longer square, we still have $AV_rV_r^T = AVV^T = A$, since $Av_j = 0$ for $j>r$. So multiplying on the right by $V_r^T$ we get the reduced SVD decomposition of $\boldsymbol{A}$

$$A = U_r\Sigma_r V_r^T$$
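
To see the reduced decomposition at work, here is a small sketch (assuming NumPy; the rank-2 matrix is built on purpose as a sum of two outer products):

```python
import numpy as np

# Build a 6x4 matrix of rank 2 as a sum of two outer products.
rng = np.random.default_rng(1)
A = (rng.standard_normal((6, 1)) @ rng.standard_normal((1, 4))
     + rng.standard_normal((6, 1)) @ rng.standard_normal((1, 4)))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(s > 1e-10))  # numerical rank: non-negligible singular values
print(r)  # 2

# Keeping only the first r singular triples reconstructs A exactly.
A_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]
print(np.allclose(A, A_r))  # True
```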

The Eckart-Young-Mirsky theorem states that, for every unitarily invariant norm $||\cdot||$, the best rank-$k$ approximation of a matrix $A$ is the truncation $A_k = U_k\Sigma_k V_k^T$, obtained by keeping only the $k$ largest singular values and their corresponding singular vectors; that is,

$$\min_{\operatorname{rank}(B)=k}||A-B|| = ||A-A_k||.$$
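
In the spectral norm, for example, the theorem gives $\min_{\operatorname{rank}(B)=k}||A-B||_2 = \sigma_{k+1}$. A small sketch verifying this numerically (random matrix and an arbitrary $k$, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 30))
k = 5  # arbitrary target rank

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k truncation: keep the k largest singular values and vectors.
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Spectral-norm error of the best rank-k approximation is sigma_{k+1}.
print(np.linalg.norm(A - A_k, 2), s[k])  # the two values agree
```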

This is particularly useful in data science, and we shall show how good these approximations are by compressing an image. An image is a 2-dimensional array of pixels. Each pixel, represented as a square, is itself an array of four numerical values, RGBA; these values represent "how much" Red, Green, Blue and Alpha (alpha is the transparency of the image) there is in that particular pixel, and they range from $0$ to $255$.

An image can be converted into greyscale by setting the three RGB values to their average in the original image, that is, setting for each pixel the array $\left[\frac{R+G+B}{3}, \frac{R+G+B}{3},\frac{R+G+B}{3}\right]$. Doing so allows us to transform an image into a big matrix in which every entry is the average of the colours in the corresponding pixel.
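
A minimal sketch of this conversion (assuming NumPy and Pillow; `'photo.png'` is a placeholder path):

```python
import numpy as np
from PIL import Image

# 'photo.png' is a placeholder path; any RGB(A) image works.
img = np.asarray(Image.open('photo.png').convert('RGB'), dtype=float)

# Average the R, G and B channels: an m x n matrix with values in [0, 255].
grey = img.mean(axis=2)
```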


Using what we've just learned about the SVD, we know from the Eckart-Young-Mirsky theorem that the image matrix can be very well approximated by one of lower rank, making the image lighter to store. This is what we do in the Image Compression section.
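
As a preview, here is a minimal end-to-end sketch of the compression step (assuming NumPy and Pillow; the target rank $k=50$ and the path `'photo.png'` are arbitrary choices):

```python
import numpy as np
from PIL import Image

# Placeholder path; average the RGB channels to get the greyscale matrix.
grey = np.asarray(Image.open('photo.png').convert('RGB'),
                  dtype=float).mean(axis=2)

# Reduced SVD of the image matrix, then truncate to rank k.
U, s, Vt = np.linalg.svd(grey, full_matrices=False)
k = 50  # arbitrary target rank
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Clip to valid pixel values and save the compressed image.
Image.fromarray(np.clip(A_k, 0, 255).astype(np.uint8)).save('compressed.png')

# Storing the factors takes m*k + k + k*n numbers instead of m*n.
m, n = grey.shape
print(m * k + k + k * n, 'numbers vs', m * n)
```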