
Convergence Analysis Of Gradient Learning Methods For Feedforward Neural Networks

Posted on: 2013-01-05    Degree: Doctor    Type: Dissertation
Country: China    Candidate: J Wang    Full Text: PDF
GTID: 1118330371996631    Subject: Computational Mathematics
Abstract/Summary:
Neural networks have attracted increasing research interest because of their powerful non-linear mapping capability, and they are now widely used in fields such as pattern recognition, function approximation, classification, and non-linear control. The gradient method is one of the most widely used approaches for training neural networks. There are two main learning modes: batch learning and incremental learning. Incremental learning can be further divided, according to the order in which the samples are presented, into three forms: fixed order, special stochastic order, and completely stochastic order. To remedy drawbacks such as slow convergence and poor generalization, a momentum term or a penalty term is often added to the standard backpropagation algorithm. This dissertation focuses on the deterministic convergence analysis of several such learning methods.

The dissertation is organized as follows. Chapter 1 reviews background material on feedforward neural networks.

Chapter 2 studies the deterministic convergence of a gradient method for training a Double Parallel Feedforward Neural Network (DPFNN) on a finite training sample set. The monotonicity of the error function during the training iteration is proved, and weak and strong convergence results are obtained, showing that the gradient of the error function tends to zero and that the weight sequence converges to a fixed point, respectively. Numerical examples are provided that support the theoretical findings and demonstrate that the DPFNN converges faster and generalizes better than the common feedforward neural network.

Chapter 3 assumes that, in each training cycle, every sample in the training set is supplied to the network exactly once, in either stochastic or fixed order. Interestingly, these stochastic learning methods can be shown to be deterministically convergent, and weak and strong convergence results are established. The conditions on the activation function and the learning rate that guarantee convergence are weaker than those in existing results. The convergence results hold not only for S-S type networks (both the output and the hidden neurons use sigmoid activation functions) but also for P-P, P-S, and S-P type networks, where S and P denote sigmoid and polynomial functions, respectively.

Chapter 4 adopts a re-start strategy for the momentum term, in which the momentum coefficient is set to zero at the beginning of each training cycle. The corresponding weak and strong convergence results are proved, and the convergence conditions on the learning rate, the momentum coefficient, and the activation functions are much weaker than those of existing results.

The last chapter considers the weight-decay method, a classical complexity-regularization technique, for multi-layer perceptron networks. Convergence results are established under relaxed conditions on the activation functions, the learning rate, and the stationary set of the error function. In particular, the boundedness of the weights during training is obtained in a simple and transparent way.
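To make the kind of training scheme analyzed in Chapters 3-5 concrete, the following is a minimal sketch, in Python with NumPy, of cyclic incremental gradient training of a single-hidden-layer sigmoid network with a momentum re-start at the beginning of each cycle and a weight-decay penalty. The network structure, function names, and parameter values are illustrative assumptions, not the dissertation's exact formulation.

    # Sketch only: cyclic incremental gradient training with momentum re-start
    # and weight decay; all names and constants are illustrative assumptions.
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train(X, y, hidden=5, cycles=200, eta=0.1, momentum=0.5, decay=1e-4, seed=0):
        rng = np.random.default_rng(seed)
        W = rng.normal(scale=0.5, size=(hidden, X.shape[1]))  # hidden-layer weights
        v = rng.normal(scale=0.5, size=hidden)                 # output-layer weights
        for _ in range(cycles):
            # re-start strategy: drop the momentum term at the start of each cycle
            dW_prev, dv_prev = np.zeros_like(W), np.zeros_like(v)
            for j in rng.permutation(len(X)):  # each sample used exactly once per cycle
                h = sigmoid(W @ X[j])          # hidden output
                out = sigmoid(v @ h)           # network output
                err = out - y[j]
                # gradients of the squared error plus the weight-decay penalty term
                gv = err * out * (1.0 - out) * h + decay * v
                gW = np.outer(err * out * (1.0 - out) * v * h * (1.0 - h), X[j]) + decay * W
                dv = -eta * gv + momentum * dv_prev
                dW = -eta * gW + momentum * dW_prev
                v, W = v + dv, W + dW
                dv_prev, dW_prev = dv, dW
        return W, v

For example, train(X, y) can be called on a feature matrix X of shape (n, d) with targets y in (0, 1); setting decay=0 or momentum=0 recovers the plain cyclic incremental gradient method of the kind studied in Chapter 3.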
Keywords/Search Tags: Neural Networks, Momentum Term, Penalty Term, Batch Learning, Incremental Learning, Boundedness, Convergence