Research on the Query-by-Committee Method of Active Learning

Posted on: 2011-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Liang
Full Text: PDF
GTID: 2178330332464711
Subject: Computer application technology
Abstract/Summary:
As machine learning methods are widely applied to real-world data analysis and data mining tasks such as fingerprint recognition, image retrieval, credit analysis and page recommendation, active learning has drawn wide attention in pattern recognition and machine learning, and both its theory and its practical applications have developed considerably. Active learning actively selects the most useful training examples in order to improve learning performance when labeled data are scarce, which changes the way traditional machine learning is carried out. Active learning algorithms based on support vector machines (SVMs) and on query by committee (QBC) are two of the more commonly used approaches. Both have drawbacks: learning efficiency is not very high, a large number of examples must be labeled, and learning ability on unbalanced data sets is poor.

In this paper, we first review several active learning schemes and discuss theoretical research issues and applications that pose further challenges for researchers in this field. We then describe our own research work in detail, which is divided into three parts:

(1) We discuss active learning based on uncertainty reduction and study the SVM-based active learning algorithm. We present an SVM-based active learning algorithm that incorporates SMOTE. It effectively reduces the number of training examples required and corrects the bias of the SVM's optimal hyperplane when the samples are unbalanced.

(2) We study active learning based on query by committee. We summarize the methods for evaluating the divergence between committee members and study in depth the strategies for constructing the committee in QBC active learning. An improved QBC active learning algorithm is proposed, with improvements in three aspects. First, the improved algorithm uses selective ensemble to construct the committee, selecting the best subset of members to form the ensemble. For active learning to be effective, it is critical that the committee be made up of consistent hypotheses that differ from one another. Second, during selective ensemble the algorithm uses particle swarm optimization (PSO) to select the weights of the committee members. PSO has the advantages of higher accuracy, faster convergence and ease of operation. Third, the improved algorithm combines vote entropy and Kullback-Leibler divergence to select the unlabeled samples.

(3) We study the Decorate algorithm and improve its method for generating artificial samples. The new method uses the training data, candidate data and test data to estimate the mean and variance. Artificial training data are then generated by randomly drawing data points from the Gaussian distribution defined by the mean and variance of all the data. An improved Active-Decorate active learning algorithm is proposed that combines the improved Active-Decorate with selective ensemble.
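
The two committee disagreement measures named in part (2) can be sketched as follows. This is a minimal illustration and not the thesis's algorithm: the abstract does not say how vote entropy and Kullback-Leibler divergence are combined, so the weighted sum and its `alpha` parameter are assumptions, as are all function names. Vote entropy is computed from the members' hard votes, and the KL term is the mean divergence of each member's class distribution from the committee average.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the committee's hard votes for one sample.

    votes: list of class labels, one per committee member.
    """
    total = len(votes)
    entropy = 0.0
    for count in Counter(votes).values():
        p = count / total
        entropy -= p * math.log(p)
    return entropy


def mean_kl_divergence(member_probs):
    """Mean KL divergence of each member's distribution from the consensus.

    member_probs: list of per-member class probability lists for one sample.
    """
    k = len(member_probs)
    n_classes = len(member_probs[0])
    # Consensus = average of the members' class distributions.
    consensus = [sum(p[c] for p in member_probs) / k for c in range(n_classes)]
    total = 0.0
    for probs in member_probs:
        # Terms with pc == 0 contribute nothing; pc > 0 implies qc > 0.
        total += sum(pc * math.log(pc / qc)
                     for pc, qc in zip(probs, consensus) if pc > 0)
    return total / k


def disagreement(votes, member_probs, alpha=0.5):
    """Hypothetical combination: weighted sum of the two measures (assumed)."""
    return (alpha * vote_entropy(votes)
            + (1 - alpha) * mean_kl_divergence(member_probs))


def select_query(pool_votes, pool_probs, alpha=0.5):
    """Return the index of the unlabeled sample the committee disagrees on most."""
    scores = [disagreement(v, p, alpha)
              for v, p in zip(pool_votes, pool_probs)]
    return max(range(len(scores)), key=scores.__getitem__)
```

A sample on which every member casts the same vote and outputs the same distribution scores zero under both measures, so it is never preferred over a sample with any disagreement.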
Keywords/Search Tags: active learning, selective ensemble learning, particle swarm optimization, SMOTE, Active-Decorate