Research On Semi-supervised Learning Classification Algorithm Based On Multi-view

Posted on:2015-01-05

Degree:Master

Type:Thesis

Country:China

Candidate:P Sun

GTID:2268330428997993

Subject:Computer software and theory

Abstract/Summary:

Machine learning has attracted much attention as a popular research field in computer science. In the past, machine learning mainly adopted two learning patterns, namely supervised learning and unsupervised learning, and datasets consisted of low-dimensional data that were either fully labeled or fully unlabeled. However, with the development of machine learning and data acquisition techniques, datasets have become high-dimensional, with strongly related attributes, and contain few labeled data and many unlabeled data. Traditional machine learning methods are therefore unable to learn effectively from them, and how to learn efficiently from such data has become a pressing problem in various industries. Semi-supervised learning emerged in response: it combines a large number of unlabeled data with a few labeled data during learning.

In recent years, with the sustained development of various technologies in machine learning and data mining, semi-supervised learning has advanced greatly in both theory and practical application. Semi-supervised learning research focuses mainly on designing learners that perform well when class labels are missing for most of the training data. The process uses a small amount of labeled data together with a large amount of unlabeled data to generate a model with good learning performance. The semi-supervised Naive Bayes classifier, an excellent classification algorithm owing to its simplicity, speed, and high accuracy, has been widely used in classification tasks. Multi-view semi-supervised learning is an important method: the attribute set of the data is first divided into several subsets, and each subset is used to train one classifier; each classifier then provides newly labeled data to the other classifiers, so that the classifiers learn collaboratively.
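
The multi-view procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis's implementation: the `CategoricalNB` class, the `co_train` function, and the parameters `per_round` and `rounds` are hypothetical names, the data are assumed to be categorical, and confidence is taken to be the largest posterior probability, which is an assumption rather than either of the confidence estimates proposed in the thesis.

```python
import math
from collections import Counter, defaultdict


class CategoricalNB:
    """Tiny Naive Bayes over categorical features with add-one smoothing."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.n = len(labels)
        self.value_counts = defaultdict(Counter)  # (feature, class) -> value counts
        self.values = defaultdict(set)            # feature -> distinct values seen
        for row, y in zip(rows, labels):
            for j, v in enumerate(row):
                self.value_counts[(j, y)][v] += 1
                self.values[j].add(v)
        return self

    def predict_proba(self, row):
        log_scores = {}
        for c in self.classes:
            s = math.log(self.class_counts[c] / self.n)
            for j, v in enumerate(row):
                num = self.value_counts[(j, c)][v] + 1
                den = self.class_counts[c] + len(self.values[j]) + 1
                s += math.log(num / den)
            log_scores[c] = s
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        total = sum(exp.values())
        return {c: e / total for c, e in exp.items()}


def co_train(labeled, labels, unlabeled, view_a, view_b, per_round=1, rounds=5):
    """Two-view co-training: each view's classifier labels its most confident
    unlabeled rows, and those rows join the labeled set shared by both views."""
    labeled, labels, pool = list(labeled), list(labels), list(unlabeled)
    views = (view_a, view_b)

    def train(view):
        return CategoricalNB().fit([[r[j] for j in view] for r in labeled], labels)

    for _ in range(rounds):
        for view in views:
            if not pool:
                break
            clf = train(view)
            scored = []
            for i, row in enumerate(pool):
                proba = clf.predict_proba([row[j] for j in view])
                best = max(proba, key=proba.get)
                scored.append((proba[best], i, best))
            scored.sort(reverse=True)  # most confident first
            picked = scored[:per_round]
            for _, i, y in picked:
                labeled.append(pool[i])
                labels.append(y)
            chosen = {i for _, i, _ in picked}
            pool = [r for i, r in enumerate(pool) if i not in chosen]
    return [train(view) for view in views], labeled, labels
```

Here `view_a` and `view_b` are lists of attribute indices, so a call such as `co_train(rows, ys, unlabeled_rows, [0, 1], [2, 3])` splits a four-attribute dataset into two views. Note that every view labels the same number of rows (`per_round`) in each iteration, regardless of how well its classifier performs.
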
However, there are still some unsolved problems in multi-view learning, for example, how to choose high-confidence data from the unlabeled dataset to label and add to the labeled dataset, and how to select an appropriate quantity of data in that process. Since the traditional multi-view semi-supervised classification algorithm does not take the performance of the individual classifiers into consideration, every classifier chooses the same (average) amount of data to label in each iteration. As a result, no classifier can perform at its best.

To address the problems discussed above, this paper proposes two confidence estimation methods, namely Max Distinction and KNearest, together with their calculation formulas. Experiments on a selected percentage of data from the UCI datasets compare Macro-Recall, Macro-Precision, and running time, and demonstrate the effectiveness of Max Distinction. The paper then proposes a novel weight-adjusting two-view semi-supervised classification method that improves on the traditional two-view semi-supervised classification algorithm. In the novel algorithm, each classifier selects data from the unlabeled dataset and inserts them into the labeled dataset according to its classification ability, so that each classifier carries its real weight in the learning process. The experimental results show that this algorithm improves the Macro-Recall and Macro-Precision of the traditional two-view semi-supervised classification algorithm.
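
As a rough illustration of the two ideas in this paragraph, the Python sketch below computes Macro-Precision and Macro-Recall by their standard definitions (per-class precision and recall, averaged without weighting) and splits a labeling budget among classifiers according to a score. The abstract does not give the weight-adjusting formula, so the proportional split in `allocate_quota` is purely an assumption, and both function names are hypothetical.

```python
def macro_precision_recall(y_true, y_pred):
    """Unweighted mean of per-class precision and recall."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls = [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(precisions) / len(classes), sum(recalls) / len(classes)


def allocate_quota(scores, total):
    """Split `total` labeling slots among classifiers in proportion to their
    scores (assumed rule), using largest remainders so the slots sum to total."""
    if sum(scores) == 0:
        scores = [1] * len(scores)  # no information: fall back to equal shares
    s = sum(scores)
    raw = [total * x / s for x in scores]
    quota = [int(r) for r in raw]
    leftover = total - sum(quota)
    by_remainder = sorted(range(len(raw)), key=lambda i: raw[i] - quota[i], reverse=True)
    for i in by_remainder[:leftover]:
        quota[i] += 1
    return quota
```

For instance, if one view's classifier scores three times as high as the other's on held-out labeled data, `allocate_quota([3, 1], 8)` lets it label six of the eight rows chosen in that iteration, instead of the equal four-and-four split of the traditional algorithm.
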

Keywords/Search Tags:

Semi-supervised learning, Multi-view learning, Ensemble learning, Naive Bayes, Classification

Related items

1	Comparison And Improvement Of Two Methods Based On Semi-supervised Learning
2	Comparison And Improvement Of Two Methods Based On Semi-Supervised Learning
3	A Framework For Ensemble Learning Based Heterogeneous Extreme Learning Machines
4	Semi-supervised Learning Based On Information Theory And Functional Dependency Rules Of Probability
5	Research On Semi-supervised Classification Algorithm Based On Integrated Neural Network
6	Study Of Supervised And Semi-supervised Multi-view Feature Learning Methods
7	Research On Image Classification Algorithm Based On Semi-supervised Learning
8	The Study Of Ensemble Learning On Naive Bayes Classifier
9	Semi-Supervised Learning With Multiple Views
10	Research And Implementation Of Semi-supervised Machine Learning Algorithms For Classifying The Imbalanced Protocol Flows