
Convergence Of Gradient Method For Recurrent Neural Networks

Posted on: 2010-06-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D P Xu
Full Text: PDF
GTID: 1118360275458213
Subject: Computational Mathematics
Abstract/Summary:
An artificial neural network (ANN), often simply called a neural network (NN), is a mathematical or computational model, inspired by biological neural networks, for processing information. According to their structure, neural networks fall into two categories: feedforward neural networks (FNNs) and recurrent neural networks (RNNs). In an FNN, the output of each layer is the input of the next layer; information passes through the network layer by layer, and there are no cycles or loops. An FNN thus realizes a mapping from an input vector x to an output vector y, which can be regarded as a static mapping, and it is suited to time-independent tasks such as character recognition and curve approximation. In many fields, however, such as nonlinear system modeling, control, fault diagnosis and time-series prediction, one needs a mapping between two discrete time series x(t) and y(t), in which the output y(t) depends not only on x(t) but also on x(t-1), x(t-2), ..., and on y(t-1), y(t-2), ...; this is a dynamic mapping. A network for such problems must be a dynamic system equipped with memory. RNNs handle time-varying inputs and outputs through their internal delays and therefore realize dynamic mappings; they are more appropriate than FNNs for problems involving dynamic systems.

As with FNNs, simple gradient search algorithms are commonly used to train RNNs. Because of the recurrence, the computation of the gradient is itself recursive, so learning is considerably more complicated than for FNNs. A key research subject for gradient methods applied to recurrent neural networks is their convergence theory: studying it not only helps us understand the nature and behavior of the method, but also provides significant guidance for a large number of practical applications.

Chapter 1 reviews background material on neural networks.

Chapter 2 discusses the convergence of the gradient method for fully recurrent neural networks. We establish the monotonicity of the error function and the convergence of the method, and the theoretical results are supported by numerical experiments.

Chapter 3 considers the convergence of the gradient method for training Elman networks with a finite training sample set. Monotonicity of the error function along the iterations is shown, and on this basis weak and strong convergence results are proved: the gradient of the error function tends to zero and the weight sequence converges to a fixed point, respectively. A numerical experiment is given to support the theoretical findings.

Chapter 4 studies the effect of cutting the recursion in the gradient of the error function, the aim of which is to greatly reduce the computational effort. We analyse the convergence of this approximated gradient method for training Elman networks and show that, during learning, the error function decreases monotonically and its approximated gradient tends to zero.

Chapter 5 establishes the equivalence of gradient methods for training recurrent neural networks. The two classical gradient-based algorithms for recurrent neural networks are Real-Time Recurrent Learning (RTRL) and Back-Propagation Through Time (BPTT). For the batch scheme, we prove that RTRL and BPTT are equivalent: they produce the same weight increments.

Chapter 6 concludes with results on the convergence of some improved learning algorithms for recurrent neural networks.
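To make the setting of Chapters 3 and 4 concrete, the display below sketches a standard Elman network and the batch gradient-descent update in common notation; the symbols W_x, W_h, W_o, f, g, d(t) and eta are generic choices for this sketch and need not match the dissertation's own notation.

```latex
% Elman network: input x(t), hidden (context) state h(t), output y(t)
\begin{align*}
  h(t) &= f\bigl(W_x\, x(t) + W_h\, h(t-1)\bigr), \qquad
  y(t)  = g\bigl(W_o\, h(t)\bigr),\\[4pt]
% batch error over a finite training set and the gradient-descent iteration
  E(W) &= \tfrac12 \sum_{t} \bigl\| y(t) - d(t) \bigr\|^2, \qquad
  W^{k+1} = W^{k} - \eta\, \nabla E\bigl(W^{k}\bigr), \quad k = 0,1,2,\dots\\[4pt]
% weak convergence: the gradient of the error function goes to zero
  &\lim_{k\to\infty} \bigl\| \nabla E\bigl(W^{k}\bigr) \bigr\| = 0,\\
% strong convergence: the weight sequence goes to a fixed point W^\ast
  &\lim_{k\to\infty} W^{k} = W^{\ast}.
\end{align*}
```

The weak and strong convergence results mentioned in the abstract correspond to the last two displayed statements: the gradient of the error function tends to zero, and the weight sequence converges to a fixed point.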
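Chapter 4's idea of cutting the recursion in the gradient can be illustrated with a small numerical sketch. The following Python/NumPy fragment trains a toy Elman network with an approximated gradient in which the previous hidden state is treated as a constant during differentiation; the network sizes, the data, and the learning rate eta are illustrative assumptions, and this is not the dissertation's own code.

```python
import numpy as np

# Minimal sketch of training a small Elman network with the "truncated"
# gradient: when differentiating the hidden state h(t), the previous state
# h(t-1) is treated as a constant, so the recursive term of the exact
# gradient is dropped.  All sizes, data and the learning rate are assumptions.

rng = np.random.default_rng(0)

T, n_in, n_h = 20, 2, 5                       # sequence length and layer sizes
x = rng.standard_normal((T, n_in))            # input sequence x(1..T)
d = np.sin(np.arange(T) / 3.0)                # target sequence d(1..T)

W_x = 0.1 * rng.standard_normal((n_h, n_in))  # input-to-hidden weights
W_h = 0.1 * rng.standard_normal((n_h, n_h))   # recurrent (context) weights
w_o = 0.1 * rng.standard_normal(n_h)          # hidden-to-output weights
eta = 0.05                                    # learning rate (assumed)

def tanh_prime(a):
    return 1.0 - np.tanh(a) ** 2

for epoch in range(200):
    h_prev = np.zeros(n_h)
    gW_x = np.zeros_like(W_x)
    gW_h = np.zeros_like(W_h)
    gw_o = np.zeros_like(w_o)
    E = 0.0
    for t in range(T):
        a = W_x @ x[t] + W_h @ h_prev         # hidden pre-activation
        h = np.tanh(a)
        y = w_o @ h                           # linear output unit
        e = y - d[t]
        E += 0.5 * e ** 2
        # truncated gradient: h_prev is treated as independent of the weights
        delta = e * w_o * tanh_prime(a)       # error back-propagated to the hidden layer
        gw_o += e * h
        gW_x += np.outer(delta, x[t])
        gW_h += np.outer(delta, h_prev)
        h_prev = h
    # batch gradient-descent update with the approximated gradient
    W_x -= eta * gW_x
    W_h -= eta * gW_h
    w_o -= eta * gw_o
    if epoch % 50 == 0:
        grad_norm = np.sqrt((gW_x**2).sum() + (gW_h**2).sum() + (gw_o**2).sum())
        print(f"epoch {epoch:3d}  E = {E:.4f}  ||approx grad|| = {grad_norm:.4f}")
```

With a sufficiently small learning rate one typically observes the behaviour stated in the abstract for this approximated gradient method: the printed error decreases monotonically and the norm of the approximated gradient shrinks toward zero.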
Keywords/Search Tags:Recurrent neural networks, Gradient method, Monotonicity, Convergence, Equivalence