Object Classification Based On Semi-supervised Learning

Posted on: 2011-03-05
Degree: Master
Type: Thesis
Country: China
Candidate: Z F Chu
Full Text: PDF
GTID: 2178360308452521
Subject: Signal and Information Processing
Abstract/Summary:
Object classification is one of the basic problems in machine learning. It aims to solve classification and recognition problems on text, image, and video data. When the amount of data is small, traditional machine learning methods already perform well. However, as the volume of information grows exponentially, obtaining labels for such large amounts of data becomes impractical, and traditional methods lose their effectiveness. In this scenario, semi-supervised learning has become an active research topic. Semi-supervised methods start from a small amount of labeled data and extend the labels to unlabeled data, bridging the gap in quantity between labeled and unlabeled examples.

In this thesis, we focus on a typical semi-supervised learning problem with a small number of high-accuracy labels and a large number of low-accuracy labels. We also introduce a robustness factor for co-training, which measures the influence of initially incorrect labels on the co-training process.

To address this robustness problem, we propose an unsupervised method for generating pseudo-labels that combines the information bottleneck principle with an a posteriori method. Compared with existing methods, it needs fewer labels and has lower computational complexity.

For applying the pseudo-labels, we propose a pseudo-label-aided co-training method. Compared with existing methods, it is more robust to initially incorrect labels and can guide co-training toward better classifiers even when the labeled data contain many incorrect labels.

We also give a theoretical analysis of this improvement from a statistical perspective, mathematically prove that it improves the robustness of co-training, and discuss the similarity between naive Bayes classification and the information bottleneck principle.
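For readers unfamiliar with co-training, the following is a minimal sketch of a generic co-training loop, not the pseudo-label-aided method proposed in the thesis. It assumes each example has two views, each reduced to a single number, and uses a 1-D nearest-centroid classifier with a distance margin as the confidence score. The names `fit_centroids`, `predict`, and `co_train`, and the toy data, are illustrative assumptions and do not come from the thesis.

```python
from statistics import mean


def fit_centroids(xs, ys):
    """Fit a 1-D nearest-centroid classifier: map each label to its mean feature."""
    return {c: mean(x for x, y in zip(xs, ys) if y == c) for c in sorted(set(ys))}


def predict(centroids, x):
    """Return (label, margin). The margin is how much closer x is to the best
    centroid than to the runner-up; it serves as a simple confidence score."""
    ranked = sorted(centroids, key=lambda c: abs(x - centroids[c]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, 0.0
    margin = abs(x - centroids[ranked[1]]) - abs(x - centroids[best])
    return best, margin


def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """Generic co-training sketch (assumed setup, not the thesis algorithm).

    labeled:   list of ((view0, view1), label)
    unlabeled: list of (view0, view1)
    In each round, the classifier for each view labels the pool examples it
    is most confident about and adds them to the shared labeled set.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                break
            model = fit_centroids([x[view] for x, _ in labeled],
                                  [y for _, y in labeled])
            # Most confident pool examples first (stable sort keeps ties in order).
            ranked = sorted(pool, key=lambda x: predict(model, x[view])[1],
                            reverse=True)
            for x in ranked[:per_round]:
                labeled.append((x, predict(model, x[view])[0]))
                pool.remove(x)
    models = [fit_centroids([x[v] for x, _ in labeled], [y for _, y in labeled])
              for v in (0, 1)]
    return models, labeled


# Toy usage: two labeled seeds, four unlabeled points.
seeds = [((0.0, 1.0), 0), ((10.0, 9.0), 1)]
pool = [(1.0, 0.5), (9.0, 9.5), (2.0, 1.5), (8.0, 8.0)]
models, grown = co_train(seeds, pool)
```

Because each classifier trusts the labels added by the other, an incorrect seed label in `seeds` would shift a centroid and could propagate to later rounds. This is the sensitivity to initial label errors that the abstract refers to as the robustness problem of co-training.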
Keywords/Search Tags: Semi-supervised learning, Co-training, Pseudo-labels, Information bottleneck principle