
Study On Key Technologies Of Active Learning In Division Classification Model

Posted on: 2011-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: W T Gao
Full Text: PDF
GTID: 2178360302994714
Subject: Communication and Information System
Abstract/Summary:
In machine learning, active learning avoids having the classification model passively accept samples: it selects the most informative examples and submits them to experts for labeling, so that a high-performance classifier can be trained from a small number of highly informative samples. Because it is difficult for experts to label an ever-growing pool of unlabeled samples, the algorithm must strictly control the number of samples it asks to have labeled. This thesis improves several active learning and sample selection methods, introducing new ideas to address problems that arise during the sampling process.

Firstly, by combining an uncertainty confidence model with a representativeness confidence model, a controlled active learning algorithm based on uncertainty- and representativeness-driven data selection is proposed. By setting an appropriate threshold on the change in classification accuracy at each round, the algorithm can control the number of labeled samples.

Secondly, existing prototype-based active learning algorithms combine the uncertainty confidence model and the representativeness confidence model in a fixed way, and better results can be obtained by adaptively applying a partially dependent coefficient-weighted function. To solve this problem, a robust, partially dependent active learning algorithm based on different sample attributes is proposed. By introducing a partially dependent coefficient-weighted function that takes the different attributes of samples into account, the algorithm can emphasize a particular characteristic of the data, and it resolves the inability of the uncertainty and representativeness confidence models to adapt to different samples.

Finally, to reduce the size of large-scale training sets and the cost of learning, a training sample selection algorithm for SVMs based on a modified weighted condensed nearest neighbor rule and a close-to-boundary criterion is proposed. The algorithm resolves sensitivity to initial values by using subtractive clustering, improves execution efficiency by using random small pools, and thereby eliminates redundant training samples.
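The uncertainty-plus-representativeness selection described above can be sketched as follows. This is a minimal illustration, not the thesis's actual algorithm: the entropy-based uncertainty score, cosine-similarity representativeness score, and the trade-off weight `alpha` are all assumptions standing in for the confidence models and the coefficient-weighted combination discussed in the abstract.

```python
import numpy as np

def select_samples(probs, unlabeled, k=5, alpha=0.5):
    """Rank unlabeled samples by a weighted combination of uncertainty
    and representativeness, returning the indices of the top-k candidates.

    probs      : (n, c) predicted class probabilities for each unlabeled sample
    unlabeled  : (n, d) feature vectors of the unlabeled pool
    alpha      : hypothetical trade-off weight between the two scores
    """
    eps = 1e-12

    # Uncertainty: Shannon entropy of each sample's class distribution.
    uncertainty = -np.sum(probs * np.log(probs + eps), axis=1)

    # Representativeness: mean cosine similarity to the rest of the pool.
    unit = unlabeled / (np.linalg.norm(unlabeled, axis=1, keepdims=True) + eps)
    representativeness = (unit @ unit.T).mean(axis=1)

    # Rescale both scores to [0, 1] so the fixed weight alpha is meaningful.
    def scale(x):
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

    score = alpha * scale(uncertainty) + (1 - alpha) * scale(representativeness)
    return np.argsort(score)[::-1][:k]
```

In a full active learning loop, the selected samples would be sent to an expert for labeling, added to the training set, and the classifier retrained; the thesis's contribution is to make the weighting between the two confidence models adaptive rather than fixed as in this sketch.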
Keywords/Search Tags: Classification model, Active learning, Sample selection, Support vector machines, Uncertainty sample, Representative sample