
Study On Key Technologies Of Active Learning In Division Classification Model

Posted on: 2011-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: W T Gao
Full Text: PDF
GTID: 2178360302994714
Subject: Communication and Information System
Abstract/Summary:
In machine learning, active learning avoids having the classification model passively accept samples: it selects the most informative examples and submits them to experts for labeling, so that a high-performance classifier can be trained from a small number of highly informative samples. Because it is difficult for experts to label an ever-growing pool of unlabeled samples, the algorithm must strictly control the number of samples it asks to have labeled. This thesis improves several active learning and sample selection methods, introducing new ideas to address problems that arise during the sampling process.

Firstly, by combining an uncertainty confidence model with a representativeness confidence model, a controlled active learning algorithm based on uncertainty- and representativeness-driven data selection is proposed. By setting an appropriate threshold on the change in classification accuracy at each round, the algorithm can control the number of labeled samples.

Secondly, existing prototype-based active learning algorithms combine the uncertainty confidence model and the representativeness confidence model in a fixed way, and better results can be obtained by adaptively applying a partially dependent coefficient-weighted function. To solve this problem, a robust, partially dependent active learning algorithm based on different sample attributes is proposed. By introducing a partially dependent coefficient-weighted function that takes the different attributes of samples into account, the algorithm can emphasize a particular characteristic of the data, and it resolves the inability of the uncertainty and representativeness confidence models to adapt to different samples.

Finally, to reduce the size of large-scale training sets and the cost of learning, a training sample selection algorithm for SVMs based on a modified weighted condensed nearest neighbor rule and a close-to-boundary criterion is proposed. The algorithm resolves sensitivity to initial values by using subtractive clustering, improves execution efficiency by using random small pools, and thereby eliminates redundant training samples.
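The uncertainty-plus-representativeness selection described above can be sketched as follows. This is a minimal illustration, not the thesis's actual algorithm: the entropy-based uncertainty score, cosine-similarity representativeness score, and the trade-off weight `alpha` are all assumptions standing in for the confidence models and the coefficient-weighted combination discussed in the abstract.

```python
import numpy as np

def select_samples(probs, unlabeled, k=5, alpha=0.5):
    """Rank unlabeled samples by a weighted combination of uncertainty
    and representativeness, returning the indices of the top-k candidates.

    probs      : (n, c) predicted class probabilities for each unlabeled sample
    unlabeled  : (n, d) feature vectors of the unlabeled pool
    alpha      : hypothetical trade-off weight between the two scores
    """
    eps = 1e-12

    # Uncertainty: Shannon entropy of each sample's class distribution.
    uncertainty = -np.sum(probs * np.log(probs + eps), axis=1)

    # Representativeness: mean cosine similarity to the rest of the pool.
    unit = unlabeled / (np.linalg.norm(unlabeled, axis=1, keepdims=True) + eps)
    representativeness = (unit @ unit.T).mean(axis=1)

    # Rescale both scores to [0, 1] so the fixed weight alpha is meaningful.
    def scale(x):
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

    score = alpha * scale(uncertainty) + (1 - alpha) * scale(representativeness)
    return np.argsort(score)[::-1][:k]
```

In a full active learning loop, the selected samples would be sent to an expert for labeling, added to the training set, and the classifier retrained; the thesis's contribution is to make the weighting between the two confidence models adaptive rather than fixed as in this sketch.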
Keywords/Search Tags: Classification model, Active learning, Sample selection, Support vector machines, Uncertainty sample, Representative sample