$$A=\begin{bmatrix} a & b & c & d & e \\ f & a & b & c & d \\ g & f & a & b & c \\ h & g & f & a & b \\ i & h & g & f & a\end{bmatrix}$$
Generally, the elements are denoted as $a_{i-j}$: $A_{ij}=A_{i+1, j+1}=a_{i-j}$. A circulant matrix is a special kind of Toeplitz matrix where each row vector is rotated one element:
$$C=\begin{bmatrix}c_{0}&c_{n-1}&\dots &c_{2}&c_{1}\\c_{1}&c_{0}&c_{n-1}&&c_{2}\\\vdots &c_{1}&c_{0}&\ddots &\vdots \\c_{n-2}&&\ddots &\ddots &c_{n-1}\\c_{n-1}&c_{n-2}&\dots &c_{1}&c_{0}\\\end{bmatrix}$$
Circulant matrices are closely connected to discrete convolution:
$$y=h\ast x=\begin{bmatrix}h_{1}&0&\cdots &0&0\\h_{2}&h_{1}&&\vdots &\vdots \\h_{3}&h_{2}&\cdots &0&0\\\vdots &h_{3}&\cdots &h_{1}&0\\h_{m-1}&\vdots &\ddots &h_{2}&h_{1}\\h_{m}&h_{m-1}&&\vdots &h_{2}\\0&h_{m}&\ddots &h_{m-2}&\vdots \\0&0&\cdots &h_{m-1}&h_{m-2}\\\vdots &\vdots &&h_{m}&h_{m-1}\\0&0&0&\cdots &h_{m}\end{bmatrix}
\begin{bmatrix}x_{1}\\x_{2}\\x_{3}\\\vdots \\x_{n}\end{bmatrix}$$
The above matrix is $(m,n)$, the size of full version of the matrix of RHS should be $(m+n-1,m+n-1)$ and the circulant matrix is generated by an $m+n-1$ row vector $(h_1, h_2,...,h_m,0,0,...,0)$, the full version of vector of $x$ of RHS should also be an $m+n-1$ vector $(x_1, x_2,...,x_n,0,...,0)^T$. For $(a \ast v)[n] = \sum_{m = -\infty}^{\infty} a[m] v[n - m]$(from
help(np.convolve)
). If the original arrays are of size $m, n$, then the output array is $m+n-1$. Here I present a simple proof of discrete convolution theorem using DFT:$y=h\ast x=\mathrm{IDFT}\{\mathrm{DFT}\{h\}\mathrm{DFT}\{x\}\}$
Proof:
the DFT matrix is:
$$W=\frac {1}{\sqrt {N}}\begin{bmatrix}1&1&1&1&\cdots &1\\1&\omega &\omega ^{2}&\omega ^{3}&\cdots &\omega ^{N-1}\\1&\omega ^{2}&\omega ^{4}&\omega ^{6}&\cdots &\omega ^{2(N-1)}\\1&\omega ^{3}&\omega ^{6}&\omega ^{9}&\cdots &\omega ^{3(N-1)}\\\vdots &\vdots &\vdots &\vdots &&\vdots \\1&\omega ^{N-1}&\omega ^{2(N-1)}&\omega ^{3(N-1)}&\cdots &\omega ^{(N-1)(N-1)}\\\end{bmatrix}$$
With $W_{ij}=\omega^{i\times j}$ and $\omega=\exp(\frac{2\pi \mathbb{i}}{N})$. Assume $\hat{h}=\mathrm{DFT}\{h\}=W\cdot h$, and assume $\hat{c}_j=\mathrm{DFT}\{h\}_j\mathrm{DFT}\{x\}_j$, we have:
$$\hat{c}_j=\sum_{k_1=0,k_2=0}^{N-1}\omega^{j\times(k_1+k_2)}h_{k_1}x_{k_2}$$
The IDFT matrix is the complex conjugate of $W$, therefore, $W_{ij}^\ast=\omega^{-i\times j}$; therefore $y_i=W_{ij}^\ast \hat{c}_{j}=\sum_{k_0=0,k_1=0,j=0}^{N-1}\omega^{j\times(k_1+k_2-i)}h_{k_1}x_{k_2}$. With $\omega$ is the $N$-th roots of unity. Summing over $j$ first, we thus obtain
$$\begin{equation}\sum_{j=0}^{N-1}\omega^{(k_1+k_2-i)\times j}=\begin{cases}N\quad &k_1+k_2=i\\ \frac{1-\omega^{(k_1+k_2-i)\times N}}{1-\omega^{k_1+k_2-i}}=0 &k_1+k_2\ne i\end{cases}\end{equation}$$
Which gives $y[i]=\sum_{k_1+k_2=i}a_{k_1}b_{k_2}=\sum_j a[j]b[i-j]$ being the discrete convolution. $Q.E.D.$
Therefore, the matrix multiplication can also be accelerated by FFT method if circulant matrices are involved. The extra $1/\sqrt{N}$ comes from IDFT.
In[1]: a = np.random.random((5,))
In[2]: b = np.random.random((5,))
In[3]: np.allclose(np.fft.ifft(np.fft.fft(a,n=9, norm='ortho')*np.fft.fft(b, n=9,norm='ortho'),
....: n=9,norm='ortho').real*9**0.5,np.convolve(a,b,'full'))
Out[3]: True
The $W^\ast \cdot W=\sum_{k=0}^{N-1}\omega^{k(i-j)}=\delta_{ij}=\mathbb{I}$ where $W^\ast$ represents the inverse DFT operator, is the complex conjugation transpose of $W$. Appling twice of FT on a function $f(x)$ yields $f(-x)$: $W^2_{ij}=\sum_{k=0}^{N-1}\omega^{k(i+j)}=\delta_{i+j\equiv0 \bmod N},i,j\in[0,N-1]$, which means $\mathrm{FT}\{\mathrm{FT}\{f(x)\}\}=f(-x)$, e.g., for $5\times5$ sized DFT matrix:$$W^2=\begin{bmatrix}
1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 1\\
0 & 0 & 0 & 1 & 0\\
0 & 0 & 1 & 0 & 0\\
0 & 1 & 0 & 0 & 0\\
\end{bmatrix}$$
Clearly, $W^2\cdot (x_0, x_1, x_2, x_3, x_4)^T=(x_0,x_4,x_3,x_2,x_1)^T$. In fact, it is easily to verify that the eigenvectors of a circulant matrix are Fourier modes: $v_j=1/\sqrt{N}(1, \omega_j, \omega_j^2,...,\omega_j^{N-1})^T$ for $\omega_j = \exp(2 \pi j \mathbb{i}/n)$, and corresponding eigenvalue $\lambda_j=c_0+c_{N-1}\omega_j+c_{N-2}\omega_j^2+...+c_1\omega_j^{N-1}$, with $C\cdot v_j=\lambda_j v_j$, the 'circulant' property comes from that $\omega$ is generator of a cyclic group.