With the development of the Internet, the amount of data is growing at the rate of terabytes or even petabytes per second, and how to extract valuable knowledge from such data has become a research hotspot. As a classical pattern classification method, the Back Propagation neural network (BP) is an effective tool for extracting information and has been widely used in various fields. However, the traditional BP algorithm is essentially an iterative procedure for parameter optimization, and it suffers from slow convergence and is prone to local optima. To address these problems, the extreme learning machine (ELM) was proposed for training single-hidden-layer feedforward neural networks (SLFNs), whose structure is similar to that of BP. In an ELM model, the nonlinear mapping is performed by the hidden layer, whose nodes are randomly generated and then fixed without iterative tuning. As a result, ELM is not only fast in solving pattern classification problems but is also unlikely to be trapped in a local optimum, and it has become an effective approach to pattern classification. This thesis aims to improve classification speed, accuracy and stability on the basis of ELM theory. Research and analysis are conducted on these problems in different applications, and the following achievements are obtained:

(1) The efficiency of ELM is limited when it performs classification on large-scale data on a single machine. To solve this problem, this thesis proposes a novel algorithm called Parallel Online Sequential Extreme Learning Machine (POSELM). The process of POSELM is as follows: firstly, compute the hidden-layer output matrix according to ELM theory; secondly, partition this matrix into several blocks according to the characteristics of the MapReduce framework, so that the original large-scale matrix multiplication is replaced by block computations carried out in parallel on multiple worker nodes; finally, merge the intermediate values from the computing nodes by key to obtain the output-layer weights (the underlying block-wise computation is sketched in the code below, after item (2)). Under the premise that the original accuracy is preserved, the online extreme learning machine algorithm is extended to the MapReduce framework, enabling real-time classification of dynamic data.

(2) ELM has low robustness in the classification process owing to its randomized nature. To solve this problem, this thesis proposes a novel algorithm called Local Extreme Learning Machine (LELM). The algorithm is summarized as follows: firstly, obtain the K nearest neighbors of each testing sample to judge the location of the testing sample and to identify noisy samples; secondly, combine the proposed supervised clustering method with the nearest-neighbor method to reconstruct a local training set; finally, build a local classification model on the new training set. By reconstructing the local training set, LELM obtains a local classification model with the following advantages: ① the local structure of the samples is fully considered; ② the model achieves a balance between variance and bias; ③ the influence of noisy samples on the construction of the classification model is reduced, which improves robustness and stability.
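The block computation at the heart of POSELM can be illustrated with a small sketch. In batch ELM the output weights solve β = (HᵀH)⁻¹HᵀT, and both HᵀH and HᵀT decompose into sums over row blocks of H, which is what allows the matrix multiplications to be distributed across MapReduce workers and merged by key. The following Python sketch simulates that split-and-merge on a single machine; the function names, the toy data and the small ridge term added for numerical stability are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def hidden_output(X, W, b):
    """Hidden-layer output matrix H with a sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def map_block(H_block, T_block):
    """Map step: each worker emits the partial sums for its row block."""
    return H_block.T @ H_block, H_block.T @ T_block

def reduce_blocks(partials, ridge=1e-3):
    """Reduce step: sum the partial matrices by key and solve for beta."""
    U = sum(p[0] for p in partials)          # accumulates H^T H
    V = sum(p[1] for p in partials)          # accumulates H^T T
    L = U.shape[0]
    return np.linalg.solve(U + ridge * np.eye(L), V)

# Toy example: 1000 samples, 20 features, 3 classes, 50 hidden nodes.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
T = np.eye(3)[rng.integers(0, 3, size=1000)]   # one-hot targets
W = rng.normal(size=(20, 50))                  # random input weights (fixed)
b = rng.normal(size=(1, 50))                   # random biases (fixed)

H = hidden_output(X, W, b)
# Split the rows into "worker" blocks and aggregate, as MapReduce would.
partials = [map_block(Hb, Tb)
            for Hb, Tb in zip(np.array_split(H, 4), np.array_split(T, 4))]
beta = reduce_blocks(partials)
pred = hidden_output(X, W, b) @ beta
print("training accuracy:", (pred.argmax(1) == T.argmax(1)).mean())
```

In an actual MapReduce job, each mapper would emit its (HᵀH, HᵀT) pair under a common key and the reducer would perform the summation and the final solve; an online-sequential variant additionally updates these accumulated sums chunk by chunk as new data arrive.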
Furthermore, to improve the robustness of the incremental classification model, this thesis proposes a novel algorithm called Self-Compounding Kernel Online Sequential Extreme Learning Machine (SCK-OSELM), which is developed from OSELM (online sequential extreme learning machine). The steps of the algorithm are as follows: firstly, map the input samples into multiple kernel spaces to obtain different high-dimensional features, and compute a nonlinear combination of these features; secondly, introduce the prior distribution of the training samples as model weights to maintain generalization, and use hyper-weights to drive the posterior distribution of the weights towards zero, so that sparse parameters are obtained; finally, carry the resulting sparse parameters into the computation at the next time step. In SCK-OSELM, the fused multi-kernel features avoid the problem of randomly assigned parameters, which improves the robustness of incremental classification. Meanwhile, the robust incremental learning process not only preserves the traceability of the model but also accelerates computation.

(3) ELM has difficulty classifying imbalanced dynamic data. To solve this problem, this thesis proposes a novel algorithm called Weighted Robust Online Sequential Extreme Learning Machine (WROSELM). The algorithm can be summarized as follows: firstly, generate a local dynamic weighting matrix in real time based on cost-sensitive learning theory, thereby optimizing the empirical risk of the classification model; then, decompose the weighting matrix and obtain the output weights through a new incremental expression; finally, since the temporal properties of dynamic data cause the data distribution to change, introduce a forgetting factor to enhance the sensitivity of the classifier to distribution change (a sketch of such a weighted online update is given after item (4)). The method adapts well to imbalanced data classification and achieves real-time, efficient performance.

(4) ELM has difficulty classifying multi-feature data. To solve this problem, this thesis proposes a novel algorithm called Multi-Feature Extreme Learning Machine (MFELM). The algorithm can be summarized as follows: firstly, nonlinearly fuse the multiple features of the same sample; secondly, obtain the output weights from the fused feature and from the error of each single feature respectively; finally, through iterative calculation with the output weights, obtain the combination parameters that minimize the current total classification error. To avoid the degenerate case in which only a single feature is used in the solving process, higher-order coefficients are introduced. Meanwhile, MFELM is extended to a kernel version called Multi-Feature Kernel Extreme Learning Machine (MFKELM). This avoids the problem that multiple features usually lie in spaces of different dimensionality, which complicates classification; the fused feature can thus provide information for classification and improve classification performance.
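As a hedged illustration of the kind of update used by the weighted online methods above, the sketch below combines the standard OS-ELM recursive least-squares update with a per-sample cost-sensitive weight matrix and a forgetting factor. The class name, the sigmoid hidden layer, the regularized initialization of P, and the exact placement of the forgetting factor are assumptions made for this sketch; the precise WROSELM formulation in the thesis may differ.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class WeightedOSELM:
    """Sketch of a cost-sensitive online-sequential ELM with a forgetting
    factor, in the spirit of WROSELM (exact thesis formulation may differ)."""

    def __init__(self, n_inputs, n_hidden, n_outputs, ridge=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(size=(n_inputs, n_hidden))   # fixed random input weights
        self.b = rng.normal(size=(1, n_hidden))              # fixed random biases
        self.P = np.eye(n_hidden) / ridge                    # regularized inverse covariance
        self.beta = np.zeros((n_hidden, n_outputs))          # output weights

    def _hidden(self, X):
        return sigmoid(X @ self.W_in + self.b)

    def partial_fit(self, X, T, sample_weight, lam=0.98):
        """Weighted recursive least-squares update on one data chunk.
        `sample_weight` implements cost-sensitive learning (e.g. higher cost
        for the minority class); `lam` < 1 down-weights old chunks."""
        H = self._hidden(X)
        w = np.asarray(sample_weight, dtype=float)
        S = lam * np.diag(1.0 / w) + H @ self.P @ H.T
        self.P = (self.P - self.P @ H.T @ np.linalg.solve(S, H @ self.P)) / lam
        err = w[:, None] * (T - H @ self.beta)
        self.beta = self.beta + self.P @ H.T @ err

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Toy usage: a stream of chunks with class imbalance handled via sample weights
# (labels here are random and serve only to exercise the interface).
rng = np.random.default_rng(1)
model = WeightedOSELM(n_inputs=10, n_hidden=40, n_outputs=2)
for _ in range(5):
    X = rng.normal(size=(100, 10))
    y = (rng.random(100) < 0.2).astype(int)            # roughly 20% minority class
    T = np.eye(2)[y]
    w = np.where(y == 1, 1.0 / max(y.mean(), 1e-6),
                 1.0 / max(1 - y.mean(), 1e-6))        # inverse class-frequency costs
    model.partial_fit(X, T, sample_weight=w, lam=0.98)
print(model.predict(X).argmax(1)[:10])
```

A typical cost-sensitive choice, used in the toy loop above, is to set each sample's weight inversely proportional to the frequency of its class in the current chunk, so that minority-class errors contribute more to the update; smaller values of `lam` make the model forget old chunks faster and track distribution drift more aggressively.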