BP Neural Network Online Gradient Method With A Penalty

Posted on: 2005-01-06    Degree: Master    Type: Thesis
Country: China    Candidate: L Q Zhang    Full Text: PDF
GTID: 2208360122997296    Subject: Computational Mathematics
Abstract/Summary:
Multilayer perceptron networks have been widely used in many applications. The generalization ability, i.e., how well the network performs on data sets that have not been shown to the network during training, is an important criterion of a network's performance. A rule of thumb for improving generalization is to choose the smallest network that fits the training examples. Such a small network may have a small number of connections, connections of small magnitude, or both. One way to obtain a network with a small number of connections is to delete some unimportant connections and nodes after training has finished; for a general review see, e.g., [7, 15, 19]. Typically, methods for removing weights (connections) from the network involve adding a penalty term to the error function: unnecessary connections then acquire small weights, and the complexity of the network can be reduced significantly by pruning (e.g. [22]). Even when pruning is not carried out after the training process, the network can still have much reduced complexity owing to the small magnitude of its weights, and hence generalizes well [18, 26]. Adding a penalty term to the error function of a BP network is therefore an important approach to obtaining better generalization.

Much work has been done on the use of different penalty terms, e.g. [10, 12, 13, 22, 26]. Most of it (e.g. [10, 12, 22, 26]) is based on experiments and gives no mathematical proof that the weights are definitely bounded. Jun Kong & Wei Wu [13] provide such a proof for a very special and simple case, where the training examples are linearly independent (so the number of training examples cannot exceed the dimension of the example vectors) and the network has no hidden layer. Our aim in this paper is to remove these restrictions and to consider a more realistic case, where the number of training examples can be arbitrarily large and the network has a hidden layer. We give mathematical proofs of the boundedness of the weights and of the convergence of the method.
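
To illustrate the kind of training procedure studied here, the following is a minimal sketch of an online gradient method with a penalty term added to the error function, for a network with one hidden layer. It is not the thesis's algorithm verbatim: the L2 (weight-decay) form of the penalty, the learning rate lr, and the coefficient penalty_coeff are illustrative assumptions.

    import numpy as np

    def train_online_gd_with_penalty(samples, targets, n_hidden=4,
                                     lr=0.1, penalty_coeff=1e-3, epochs=100):
        """Online gradient descent for a single-hidden-layer sigmoid network.

        Each per-sample error is augmented with an L2 penalty,
        E_j = 0.5*(o_j - y_j)^2 + 0.5*penalty_coeff*(||V||^2 + ||w||^2),
        so every update also shrinks the weights (weight decay).
        """
        rng = np.random.default_rng(0)
        n_in = samples.shape[1]
        V = rng.normal(scale=0.1, size=(n_hidden, n_in))   # input-to-hidden weights
        w = rng.normal(scale=0.1, size=n_hidden)            # hidden-to-output weights

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        for _ in range(epochs):
            for x, y in zip(samples, targets):              # online: one example at a time
                h = sigmoid(V @ x)                          # hidden activations
                o = sigmoid(w @ h)                          # network output
                delta = (o - y) * o * (1 - o)               # derivative at the output node
                # gradients of the penalized per-sample error
                grad_w = delta * h + penalty_coeff * w
                grad_V = (delta * w * h * (1 - h))[:, None] * x[None, :] + penalty_coeff * V
                w -= lr * grad_w
                V -= lr * grad_V
        return V, w

Under this kind of weight decay, each step pulls the weights toward zero in proportion to penalty_coeff, which is the mechanism that keeps weight magnitudes small; the boundedness and convergence claims proved in the thesis concern updates of this general form, not this particular code.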
Keywords/Search Tags: BP neural network, Penalty term, Online gradient method, Boundedness, Convergence