Renormalization Group in Machine Learning

I worked on this project in fall 2023.

Abstract

In the theory of phase transitions, the correlation length diverges at the critical point, which gives rise to many physical phenomena. Mean-field theory and renormalization group theory can be used to analyze these phenomena. Similar phenomena, however, occur not only at phase transitions but also in many natural complex systems, and these systems are often modeled statistically. A neural network is an algorithm that can be used to decide which category an individual data point belongs to. This process is similar to using renormalization group theory to determine which phase a system belongs to. Therefore, although a precise mathematical structure has not yet been built, we believe that a neural network can be viewed as a form of the renormalization group.

Neural Network

Framework of Model

\(a^{[1]}_1=g(x_1 W^{[1]}_{11}+x_2 W^{[1]}_{21}+x_3 W^{[1]}_{31}+x_4 W^{[1]}_{41}+b^{[1]}_1)\\ a^{[1]}_2=g(x_1 W^{[1]}_{12}+x_2 W^{[1]}_{22}+x_3 W^{[1]}_{32}+x_4 W^{[1]}_{42}+b^{[1]}_2)\\ a^{[1]}_3=g(x_1 W^{[1]}_{13}+x_2 W^{[1]}_{23}+x_3 W^{[1]}_{33}+x_4 W^{[1]}_{43}+b^{[1]}_3)\\ a^{[1]}_4=g(x_1 W^{[1]}_{14}+x_2 W^{[1]}_{24}+x_3 W^{[1]}_{34}+x_4 W^{[1]}_{44}+b^{[1]}_4)\\ a^{[1]}_5=g(x_1 W^{[1]}_{15}+x_2 W^{[1]}_{25}+x_3 W^{[1]}_{35}+x_4 W^{[1]}_{45}+b^{[1]}_5)\)

\(\bf{W}^{[1]}=\begin{pmatrix} {W}^{[1]}_{11}&{W}^{[1]}_{12}&{W}^{[1]}_{13}&{W}^{[1]}_{14}&{W}^{[1]}_{15}\\ {W}^{[1]}_{21}&{W}^{[1]}_{22}&{W}^{[1]}_{23}&{W}^{[1]}_{24}&{W}^{[1]}_{25}\\ {W}^{[1]}_{31}&{W}^{[1]}_{32}&{W}^{[1]}_{33}&{W}^{[1]}_{34}&{W}^{[1]}_{35}\\ {W}^{[1]}_{41}&{W}^{[1]}_{42}&{W}^{[1]}_{43}&{W}^{[1]}_{44}&{W}^{[1]}_{45} \end{pmatrix},\) \(\bm{x}=\begin{pmatrix} x_1 & x_2 & x_3 & x_4 \end{pmatrix},\;\bm{a}^{[1]}=\begin{pmatrix} a_1^{[1]} & a_2^{[1]} & a_3^{[1]} & a_4^{[1]} & a_5^{[1]} \end{pmatrix},\;\bm{b}^{[1]}=\begin{pmatrix} b_1^{[1]} & b_2^{[1]} & b_3^{[1]} & b_4^{[1]} & b_5^{[1]} \end{pmatrix}\)

\(\bm{a}^{[1]}=g(\bm{x}\bf{W}^{[1]}+\bm{b}^{[1]})\)

\(\bm{a}^{[2]}=g(\bm{a}^{[1]}\bf{W}^{[2]}+\bm{b}^{[2]})\)

\(\bm{a}^{[3]}=g(\bm{a}^{[2]}\bf{W}^{[3]}+\bm{b}^{[3]})\)

\(\bm{z}=g(\bm{a}^{[3]}\bf{W}^{[4]}+\bm{b}^{[4]})\)

\(\bm{a}^{[i+1]}=g(\bm{a}^{[i]}\bf{W}^{[i+1]}+\bm{b}^{[i+1]})\)

\(g(\bm{z})=\frac{1}{\sum_je^{z_j}}\begin{pmatrix} e^{z_1}\\ e^{z_2}\\ \vdots\\ e^{z_n} \end{pmatrix}\)
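The layer equations above can be sketched in a few lines of code. The following is a minimal illustration, not the model used in this project: it is written in Python with the standard library only (the project itself uses R), the weights are random, and every layer width except the 4 inputs, the 5 units of the first layer, and the 3 output classes is an assumption. As in the equations, the same function \(g\) is applied at every layer, and the input is treated as a row vector so that \(\bm{x}\bf{W}^{[1]}\) is well defined.

```python
import math
import random

def softmax(z):
    """g(z): exponentiate each entry and normalise so the outputs sum to 1."""
    m = max(z)  # subtracting the maximum avoids overflow and leaves g(z) unchanged
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def layer(a, W, b):
    """One step a_next = g(a W + b), with a treated as a row vector."""
    pre = [sum(a[i] * W[i][j] for i in range(len(a))) + b[j]
           for j in range(len(b))]
    return softmax(pre)

def forward(x, params):
    """Apply the layers in order: x -> a1 -> a2 -> a3 -> z."""
    a = x
    for W, b in params:
        a = layer(a, W, b)
    return a

def random_params(sizes, seed=0):
    """Random weights and biases (illustrative only, not trained values)."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = [[rng.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_in)]
        b = [rng.uniform(-1, 1) for _ in range(n_out)]
        params.append((W, b))
    return params

# Assumed widths: 4 inputs, 5 units in layer 1 (the shape of W^[1]),
# two further hidden layers of 5 units (an assumption), 3 output classes.
sizes = [4, 5, 5, 5, 3]
params = random_params(sizes)
x = [5.1, 3.5, 1.4, 0.2]  # one illustrative 4-feature measurement
z = forward(x, params)
print(z)  # three non-negative numbers that sum to 1
```

Because the final layer is a softmax, the output can be read as one probability per class, whatever the weights are.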

Training Model

Renormalization Group

I believe that the pruning of a neural network can be understood through the physical interpretation of the renormalization group, and I use some numerical experiments to support this statement. I build the neural network model in R, because it makes it convenient to inspect the data and compute statistical quantities.
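To make the idea of pruning concrete, here is a minimal sketch of one common scheme, magnitude pruning, in which weights with small absolute value are set to zero. This page does not state which pruning scheme was used, so the scheme, the threshold, and the weight values below are all assumptions made for illustration; the sketch is in Python rather than R.

```python
def prune_by_magnitude(W, threshold):
    """Return a copy of W with entries of absolute value below threshold set to 0."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in W]

def sparsity(W):
    """Fraction of entries that are exactly zero."""
    flat = [w for row in W for w in row]
    return sum(1 for w in flat if w == 0.0) / len(flat)

# Hypothetical 4x5 weight matrix with the same shape as W^[1]; values are made up.
W1 = [
    [0.90, -0.02, 0.40, 0.01, -0.70],
    [0.03, 0.05, -0.01, 0.02, 0.04],
    [-1.20, 0.80, 0.03, -0.60, 1.10],
    [1.50, -0.90, 0.02, 0.70, -1.30],
]
W1_pruned = prune_by_magnitude(W1, threshold=0.1)
print(sparsity(W1), sparsity(W1_pruned))
```

In this made-up matrix every weight leaving the second input falls below the threshold, so that input is disconnected from the network entirely. An outcome of this kind is what one might read, loosely, as an irrelevant variable being discarded under coarse-graining.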

Samples at the Critical Point

Correlation matrix of the *iris* data set
	Sepal.Length	Sepal.Width	Petal.Length	Petal.Width
Sepal.Length	1.0000000	-0.1175698	0.8717538	0.8179411
Sepal.Width	-0.1175698	1.0000000	-0.4284401	-0.3661259
Petal.Length	0.8717538	-0.4284401	1.0000000	0.9628654
Petal.Width	0.8179411	-0.3661259	0.9628654	1.0000000

Point-Biserial Correlation for *iris*
Sepal.Length	Sepal.Width	Petal.Length	Petal.Width
0.7825612	-0.4266576	0.9490347	0.9565473
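The two statistics tabulated above can be computed as follows. This is a sketch in Python rather than R, and it makes assumptions: the sample values are a tiny made-up two-class example, not the *iris* data, and the way the three-class species label was turned into a dichotomous variable for the point-biserial values above is not stated here. The point-biserial coefficient is the Pearson coefficient between a continuous variable and a 0/1 label, which the last line checks numerically.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def point_biserial(values, labels):
    """Point-biserial correlation between a continuous variable and 0/1 labels."""
    n = len(values)
    g1 = [v for v, l in zip(values, labels) if l == 1]
    g0 = [v for v, l in zip(values, labels) if l == 0]
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # population std
    m1 = sum(g1) / len(g1)
    m0 = sum(g0) / len(g0)
    return (m1 - m0) / s * math.sqrt(len(g1) * len(g0) / n ** 2)

# Made-up sample: a petal-length-like feature for two groups (labels 0 and 1).
values = [1.4, 1.3, 1.5, 4.7, 4.5, 4.9]
labels = [0, 0, 0, 1, 1, 1]

r_pb = point_biserial(values, labels)
print(r_pb)
print(abs(r_pb - pearson(values, labels)) < 1e-12)  # the two definitions agree
```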

Neural Network as the Renormalization Group

Self-Similarity in the Neural Network

Confusion matrix for test set
	Setosa	Versicolor	Virginica
Setosa	6	0	0
Versicolor	0	10	3
Virginica	0	0	11

\(Accuracy=\frac{27}{30}=90\%\)

Confusion matrix for test set
	Setosa	Versicolor	Virginica
Setosa	5	1	0
Versicolor	1	7	3
Virginica	4	2	7

\(Accuracy=\frac{19}{30}=63.3\%\)
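The accuracy of each confusion matrix is the sum of its diagonal (correct predictions) divided by the total number of test samples. A short check in Python, using the two matrices exactly as tabulated above (variable names are my own):

```python
def accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

first = [[6, 0, 0], [0, 10, 3], [0, 0, 11]]
second = [[5, 1, 0], [1, 7, 3], [4, 2, 7]]

print(f"{accuracy(first):.1%}")   # 90.0%  (27 of 30)
print(f"{accuracy(second):.1%}")  # 63.3%  (19 of 30)
```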

Interpretation of the Neural Network

\(\bm{a}^{[i+1]}=g(\bm{a}^{[i]}\bf{W}^{[i+1]}+\bm{b}^{[i+1]})\)

\(g(\bm{z})=\frac{1}{\sum_je^{z_j}}\begin{pmatrix} e^{z_1}\\ e^{z_2}\\ \vdots\\ e^{z_n} \end{pmatrix}\)

Part of the output of the prediction
setosa	versicolor	virginica
1.000000e+00	7.886223e-04	1.736939e-48
5.111283e-15	1.000000e+00	9.634484e-21
1.403715e-19	1.000000e+00	1.720226e-12
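Each row of this output is a softmax vector, so the predicted class is simply the column with the largest entry. A small Python sketch using the three rows shown above (the helper name `predict` is hypothetical):

```python
classes = ["setosa", "versicolor", "virginica"]
rows = [
    [1.000000e+00, 7.886223e-04, 1.736939e-48],
    [5.111283e-15, 1.000000e+00, 9.634484e-21],
    [1.403715e-19, 1.000000e+00, 1.720226e-12],
]

def predict(row):
    """Return the class name whose probability is largest."""
    best = max(range(len(row)), key=row.__getitem__)
    return classes[best]

predicted = [predict(r) for r in rows]
print(predicted)  # ['setosa', 'versicolor', 'versicolor']
```

In all three rows one entry is close to 1 and the others are many orders of magnitude smaller, so the network assigns each sample to a single class with very little ambiguity.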

Further Ideas

If the neural network is indeed a kind of renormalization group, and there is a precise correspondence between the two, then we can build a relation between statistical mechanics and other research fields.

Application in Statistics
Realization of Brain Dynamics in Biophysics
Precise Mathematical Structure in Machine Learning Theory


Hao-Yang Yen
