
Research On Some Problems In Machine Learning And Neural Network Learning

Posted on: 2017-08-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M C Yao
Full Text: PDF
GTID: 1318330488953079
Subject: Computational Mathematics
Abstract/Summary:
In machine learning, learning processes are usually divided into supervised and unsupervised learning, depending on whether labeled samples are used. As an essential component of machine learning, the feed-forward neural network plays an important role in pattern recognition and function approximation. In supervised learning, the training data are first mapped into the hidden layer via the input layer, then mapped into the output layer after being transformed by the hidden layer's activation function, and finally certain learning rules are used to adjust the weights of the network. The structure of a feed-forward network comprises the numbers of input, hidden, and output units, as well as the pattern of weight connections among these units. Since the learning speed and generalization performance of a neural network are closely related to the network structure and to the properties of the training samples, this dissertation focuses on optimization methods for single-hidden-layer feed-forward networks and on the generalization performance of learning with time-dependent samples. The dissertation is organized as follows.

In Chapter 1, we survey recent research on machine learning and neural network learning, the optimization of feed-forward neural networks, and the generalization performance of sample-based learning.

In Chapter 2, since the dimension of the samples generally equals the number of input units of a feed-forward neural network, we study attribute reduction methods based on rough set theory. In particular, we present a new attribute reduction algorithm based on crisp set operations, in which traditional reduction operations such as the conjunction and disjunction laws are converted into basic operations on crisp sets. An illustrative instance shows that the proposed algorithm can effectively reduce the dimension of the input samples; the number of input units of a neural network can therefore be reduced when the reduced samples are used for learning. (A generic sketch of rough-set attribute reduction appears below, after the Chapter 3 summary.)

In Chapter 3, when feed-forward neural networks are used to solve multi-classification problems, we propose a binary output approach in place of the conventional "One-for-Each" approach. For multiple linear perceptrons, we investigate the relationship between the two methods and prove the following result: if the One-for-Each approach can solve a k-classification problem (k ≤ 8), then the binary approach can equivalently solve the same problem with m = ⌈log₂(k+1)⌉ output units.
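To make the counting in the Chapter 3 result concrete, here is a minimal sketch (not the dissertation's construction) that assigns each class a distinct nonzero binary codeword and reports the required number of output units m = ⌈log₂(k+1)⌉; the reservation of the all-zeros word and the function name are illustrative assumptions.

```python
import math

def binary_codewords(k):
    """Assign each of k classes a distinct nonzero binary codeword.

    Uses m = ceil(log2(k + 1)) output units; the all-zeros word is
    reserved (an illustrative convention), so codes run from 1 to k.
    """
    m = math.ceil(math.log2(k + 1))
    return m, {c: [(c >> i) & 1 for i in range(m)] for c in range(1, k + 1)}

# One-for-Each needs k output units; binary coding needs only m.
for k in (4, 8, 100):
    m, _ = binary_codewords(k)
    print(f"k={k}: One-for-Each uses {k} units, binary coding uses {m}")
```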
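As referenced in the Chapter 2 summary, the following is a generic rough-set attribute reduction sketch. It uses the standard positive-region criterion with greedy attribute deletion rather than the dissertation's crisp-set algorithm, whose details are not given in this abstract; the toy decision table and all names are illustrative.

```python
def positive_region(rows, attrs, decision):
    """Objects whose equivalence class under `attrs` has a single decision value."""
    key = lambda r: tuple(r[a] for a in attrs)
    classes = {}
    for r in rows:
        classes.setdefault(key(r), set()).add(r[decision])
    return [r for r in rows if len(classes[key(r)]) == 1]

def greedy_reduct(rows, attrs, decision):
    """Drop attributes one at a time while the positive region is preserved."""
    full = len(positive_region(rows, attrs, decision))
    reduct = list(attrs)
    for a in attrs:
        trial = [x for x in reduct if x != a]
        if trial and len(positive_region(rows, trial, decision)) == full:
            reduct = trial
    return reduct

# Toy decision table: condition attributes a, b, c; decision d.
rows = [
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 0, "d": "no"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
]
print(greedy_reduct(rows, ["a", "b", "c"], "d"))  # -> ['b']
```

Here the decision depends only on attribute b, so the reduction cuts the input dimension from three to one.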
In Chapter 4, in order to optimize the hidden layer of feed-forward neural networks, we propose a double parallel extreme learning machine (DPELM), together with a corresponding online sequential algorithm, based on the double parallel feed-forward neural network and the popular ELM algorithm. DPELM can effectively solve both classification and regression problems. Numerical experiments show that the proposed methods may require fewer hidden units than classical ELM, which can lead to better generalization performance of the trained networks. (A minimal code sketch of the DPELM training step appears below, after the Chapter 5 summary.)

In Chapter 5, when the samples are drawn from a stochastic process, the corresponding supervised learning must be treated as statistical learning and its generalization performance studied accordingly. We propose a theoretical framework for analyzing generalization bounds for time-dependent samples (TDS). In particular, we first partition the generalization bound into four related components and then study the corresponding bounds, respectively. Especially, we give the deviation inequality and the symmetrization inequality for the fourth of these quantities, which can serve as a beneficial complement to classical statistical learning theory.
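For orientation, the classical i.i.d. baseline that such deviation inequalities generalize is Hoeffding's inequality; Chapter 5's contribution, per the summary above, is to obtain analogues of this bound (and of the symmetrization step) when the samples are time-dependent rather than independent. The standard i.i.d. statement reads:

```latex
% Hoeffding's inequality: for independent X_1, ..., X_n with a <= X_i <= b,
% the empirical mean deviates from its expectation by at least t with
% probability at most
\[
  \Pr\!\left( \left| \frac{1}{n}\sum_{i=1}^{n} X_i
    - \mathbb{E}\!\left[ \frac{1}{n}\sum_{i=1}^{n} X_i \right] \right| \ge t \right)
  \;\le\; 2 \exp\!\left( \frac{-2 n t^2}{(b-a)^2} \right).
\]
```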
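As referenced in the Chapter 4 summary, here is a minimal sketch of DPELM training under the common reading of "double parallel": the output layer sees the hidden-layer activations and, in parallel, direct links from the inputs, with the output weights solved in one shot by least squares as in standard ELM. The random initialization, the tanh activation, and the plain pseudoinverse are assumptions of this sketch, not necessarily the dissertation's exact formulation.

```python
import numpy as np

def dpelm_train(X, T, n_hidden, seed=0):
    """Train a double parallel ELM (sketch).

    X: (n_samples, n_in) inputs; T: (n_samples, n_out) targets.
    Hidden weights are random and fixed (standard ELM); the output layer
    sees both the hidden activations and the raw inputs (direct links).
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input->hidden weights
    b = rng.standard_normal(n_hidden)                # random hidden biases
    G = np.hstack([np.tanh(X @ W + b), X])           # hidden outputs + direct links
    beta = np.linalg.pinv(G) @ T                     # least-squares output weights
    return W, b, beta

def dpelm_predict(X, W, b, beta):
    return np.hstack([np.tanh(X @ W + b), X]) @ beta

# Tiny regression demo: fit y = sin(x) on [0, pi].
X = np.linspace(0, np.pi, 200).reshape(-1, 1)
T = np.sin(X)
W, b, beta = dpelm_train(X, T, n_hidden=10)
print("max abs error:", np.abs(dpelm_predict(X, W, b, beta) - T).max())
```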
Keywords/Search Tags: machine learning, feed-forward neural network, optimization of network structure, generalization performance, generalization bounds