Research and Application of Kernel-Based Ensemble Learning Algorithms

Posted on: 2010-10-14
Degree: Master
Type: Thesis
Country: China
Candidate: K Kang
GTID: 2208360275463032
Subject: Computer software and theory
Abstract/Summary:
Ensemble learning is a machine learning technique that trains a group of learners for a given problem and combines them to perform the prediction task. It has attracted close attention from many scholars and has become a hot topic in machine learning because it can significantly improve the generalization ability of learning systems. Ensemble learning has been widely applied in biometric authentication, sensor fault tolerance, character recognition, source identification, linguistics, medicine, transport, management, and other fields. The purpose of ensemble learning is to make full use of the strengths of each individual learner in its respective domain and thereby improve the generalization capability of the overall learning system. It is now generally believed that the key to ensemble learning is to effectively produce individual learners that have both high generalization ability and diversity. Traditional ensemble learning algorithms do not make full use of the data set and of the characteristics of the individual learners to improve the diversity among them. This shows in two respects: first, they do not take full advantage of the different characteristics of local regions of the space when sampling from the data set; second, they do not make full use of the information produced during training. In recent years, some scholars have applied kernel functions to ensemble learning and achieved good results.
The research goal of this dissertation is to address these two shortcomings of traditional ensemble learning algorithms by applying kernel functions to ensemble learning. The main contributions of this dissertation are summarized as follows.
First, the history and basic concepts of ensemble learning are briefly introduced; the basic ideas and theory of the representative ensemble learning algorithms Boosting, Bagging, and Stacking are introduced in detail; selective ensemble, a new direction in current ensemble learning research, is also introduced; finally, the history and foundations of kernel functions are introduced.
Second, a novel classifier ensemble algorithm based on kernel data set partitioning (KFMCE) is proposed and implemented. The algorithm divides the original space using a kernel-based fuzzy membership according to the differences between local regions, then trains individual learners so that each performs better in its local region, and finally combines them all to improve overall performance. The kernel-based fuzzy membership is an extension of distance-based membership: it computes the membership in the high-dimensional space induced by mapping the original space. The kernel-based fuzzy membership can eliminate the bias of the data set in expressing the data distribution. Experiments were run on the Weka platform using 20 different UCI data sets. Compared with the AdaBoost and Bagging algorithms, the experimental results show that the proposed method has higher classification accuracy and better generalization ability.
Third, a clusterer ensemble algorithm based on dynamic cooperation (DCCE) is proposed.
The algorithm trains a number of base clusterers at the same time. During training, the clusterers cooperate and adjust dynamically using the information each of them produces in the course of iteration, which improves the generalization and computational efficiency of the ensemble clusterer. In the process of cooperation, a kernel-based aligning function coordinates the intermediate clustering results, and an impulse term then adjusts the intermediate clustering results so that the clusterers cooperate and the diversity of the individual clusterers is controlled. The DCCE algorithm is applied to 15 different UCI data sets; the experimental results show that the algorithm achieves better clustering performance.
Finally, the KFMCE algorithm is applied to text classification, with 20 Newsgroups selected as the experimental data set; the results show that the KFMCE algorithm achieves higher classification performance on text classification.
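The abstract describes the kernel-based fuzzy membership only as an extension of distance-based membership evaluated in the kernel-induced space, and does not give its formula. The following is a minimal sketch of one common way to do this, not the KFMCE definition itself. It assumes a Gaussian (RBF) kernel, measures the squared feature-space distance from a sample to the mean of each group via the kernel trick, and turns the distances into memberships with a fuzzy c-means style normalization. The function names, the parameter gamma, and the fuzzifier m are all hypothetical choices made for illustration.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative value, not from the thesis."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)


def kernel_sq_distance_to_mean(x, points, kernel):
    """Squared distance in feature space between phi(x) and the mean of phi(points).

    Uses the kernel trick, so phi is never computed explicitly:
    ||phi(x) - mean||^2 = k(x, x) - (2/n) * sum_i k(x, x_i)
                          + (1/n^2) * sum_i sum_j k(x_i, x_j)
    """
    n = len(points)
    cross = sum(kernel(x, p) for p in points) / n
    within = sum(kernel(p, q) for p in points for q in points) / (n * n)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(kernel(x, x) - 2.0 * cross + within, 0.0)


def kernel_fuzzy_memberships(x, clusters, kernel, m=2.0, eps=1e-12):
    """Fuzzy c-means style memberships of x in each group (assumed form).

    Each membership is proportional to (d_j^2)^(-1/(m-1)), where d_j^2 is the
    squared feature-space distance to group j; the values are normalized to sum to 1.
    """
    d2 = [kernel_sq_distance_to_mean(x, pts, kernel) + eps for pts in clusters]
    weights = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Two toy groups standing in for two local regions of a data set.
    groups = [
        [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
        [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0)],
    ]
    print(kernel_fuzzy_memberships((0.2, 0.3), groups, rbf_kernel))
```

Under these assumptions, a sample close to the first group receives the larger membership in that group, and memberships of this kind could be used to weight or select the training samples for each local learner. With a linear kernel the distance reduces to the ordinary squared Euclidean distance to the group mean, which is the sense in which the kernel version extends distance-based membership.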
Keywords/Search Tags: Ensemble Classification, Kernel Function, Kernel-Based Fuzzy Membership, Distribution of Samples, Ensemble, Aligning Function, Dynamic Collaboration, Text Classification