
Research On Several Kinds Of Dynamical Systems In Deep Neural Networks

Posted on: 2022-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: J Lin
Full Text: PDF
GTID: 2480306530472484
Subject: Applied Mathematics

Abstract/Summary:
Deep neural networks have become the state-of-the-art models for many machine learning tasks. However, general theoretical guidance for network architecture design is still lacking. Building on [1] and [20], this paper mainly discusses the relationship between deep neural networks and dynamical systems, with particular attention to Hamiltonian systems, and then proposes a new network architecture based on the generalized Hamiltonian system. The paper consists of four chapters.

The first chapter introduces the research background and the organization of the paper. We first introduce the relationship between deep residual networks and dynamical systems, and then derive the idea of generating a deep residual network from a dynamical system together with the corresponding discretization. Based on this idea, Haber et al. [20] proposed new forward propagation techniques inspired by systems of ordinary differential equations that overcome exploding or vanishing gradients and lead to well-posed learning problems for arbitrarily deep networks. In constructing this method, they also proposed a so-called stability criterion and two stable network models: the antisymmetric model and the Hamiltonian model.

The second chapter introduces the prerequisite knowledge needed in the subsequent chapters.

In the third chapter, we give a theoretical discussion of the Hamiltonian network model proposed in earlier work. First, a counterexample shows that the stability criterion proposed in [20] is not rigorous, and two new stability criteria are proved in Theorem 3.1 and Theorem 3.2. We then discuss the properties of the Hamiltonian model (1.16) and obtain Theorem 3.3 on its linear stability and Theorem 3.4 on its equilibrium points. Although the construction of the Hamiltonian model (1.16) borrows the idea of a Hamiltonian system, its Hamiltonian function cannot in fact be solved for; we find that requiring the system (1.16) to be a classical Hamiltonian system imposes rather strict restrictions. We therefore relax the requirements, explore the possibility that the system (1.16) is a generalized Hamiltonian system, and derive the relevant constraints. Based on the so-called Hamiltonian model (1.16), we construct a new system (3.16) that is equivalent to (1.16) under certain conditions. The system (3.16) has the structure of a generalized Hamiltonian system, whose dynamical properties we then discuss in detail, obtaining the foliation (leaf) structure and orbit properties of the phase space: Theorem 3.5 on linear stability, Theorem 3.6 on the existence of Casimir functions, Theorem 3.7 on the distribution of equilibria, Theorem 3.8 on nonlinear stability, and Theorem 3.9 on dimension reduction, all of which are applied to specific examples. Moreover, the numerical simulations in [20] show that the Hamiltonian model (1.16) can effectively avoid exploding and vanishing gradients, but no theoretical discussion is given there. This paper discusses gradient estimates for both systems (1.16) and (3.16), and obtains the derivative of the new discrete scheme (3.54) with respect to the weight parameters.

In the fourth chapter, the deep neural networks (2.14) and (3.54), which correspond to the continuous dynamical systems (1.16) and (3.16), are subjected to data experiments at multiple depths, and the experimental performance of the two networks is compared. The networks (2.14) and (3.54) achieve similar accuracy at medium depth (64 to 256 layers).
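For reference, the residual-network/ODE correspondence and the Hamiltonian-inspired model of [20] discussed above take roughly the following standard form; this is a sketch in the notation of [20], and the thesis's own numbered equations (1.16), (2.14), (3.16), (3.54) are not reproduced here.

\[
  Y_{j+1} = Y_j + h\,\sigma\bigl(K_j Y_j + b_j\bigr)
  \quad\longleftrightarrow\quad
  \dot Y(t) = \sigma\bigl(K(t)\,Y(t) + b(t)\bigr),
\]
\[
  \dot Y = \sigma\bigl(K(t)\,Z + b(t)\bigr),
  \qquad
  \dot Z = -\,\sigma\bigl(K(t)^{\top} Y + b(t)\bigr).
\]

The first line reads a residual block as a forward-Euler step of an ODE; the second is the two-block Hamiltonian-inspired system of [20], whose eigenvalue-based stability criterion (roughly, Re λ_i(J(t)) ≤ 0 for the Jacobian J(t) of the right-hand side) is the one revisited by counterexample in Chapter 3.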
Finally, we put forward some views on the prospects and open problems of constructing deep networks from the dynamical-systems perspective. Constructing deep networks from dynamical systems not only makes it convenient to study the effectiveness of deep learning theoretically, but also increases the diversity of models and effectively reduces the number of parameters. The remaining problem is how to retain more properties of the dynamical system when the network is realized in data experiments, which involves further development of the discretization method and of gradient descent.
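On the discretization issue raised here, a structure-preserving (Verlet/leapfrog) update is one standard way to carry properties of the continuous system over to the discrete network. The following Python sketch illustrates the idea; the function name, layer sizes, and step size h are hypothetical and not taken from the thesis.

import numpy as np

def hamiltonian_forward(y, z, weights, biases, h=0.1, act=np.tanh):
    # Minimal sketch (not the thesis's code) of a Verlet/leapfrog forward pass
    # for a Hamiltonian-inspired network in the spirit of [20]:
    #   y_{j+1} = y_j + h * act(K_j z_j + b_j)
    #   z_{j+1} = z_j - h * act(K_j^T y_{j+1} + b_j)
    # The staggered update is one way to retain structure of the continuous
    # system in the discrete realization.
    for K, b in zip(weights, biases):
        y = y + h * act(K @ z + b)
        z = z - h * act(K.T @ y + b)
    return y, z

# Hypothetical usage: 8 layers, features split into two 4-dimensional halves (y, z).
rng = np.random.default_rng(0)
weights = [0.1 * rng.standard_normal((4, 4)) for _ in range(8)]
biases = [np.zeros(4) for _ in range(8)]
y0 = rng.standard_normal(4)
z0 = rng.standard_normal(4)
yL, zL = hamiltonian_forward(y0, z0, weights, biases)

In this staggered update each half of the state is advanced using the most recent value of the other half, mirroring the symplectic integrators commonly used for Hamiltonian systems.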
Keywords/Search Tags: Deep neural network, Dynamical system, Lyapunov stability, Hamiltonian system