A Research Of Cost-sensitive Classification Methods Based On LGC

Posted on: 2016-06-19; Degree: Master; Type: Thesis
Country: China; Candidate: X M Han
GTID: 2308330461978276; Subject: Software engineering
Abstract/Summary:
Unequal misclassification costs arise in many real-world problems, for example in finance and medicine. In such problems, the main objective of a classifier is to minimize the global cost rather than to maximize overall classification accuracy. As a result, traditional semi-supervised classification methods are no longer applicable, whereas cost-sensitive methods introduce cost-sensitive features that reflect differences in misclassification cost and can therefore address this kind of problem.

Semi-supervised learning can effectively handle learning tasks that are closer to real-world conditions, where little information is known and the assumptions are more stringent. In this thesis, we make full use of large amounts of unlabeled data and combine semi-supervised learning with cost-sensitive learning. We propose an algorithm that achieves a lower misclassification cost by sacrificing overall accuracy to an acceptable extent.

First, based on a review and analysis of related cost-sensitive learning algorithms, we divide them into data-level and algorithm-level methods. Second, we introduce cost-sensitive features into the classic semi-supervised algorithm LGC, yielding the cost-sensitive variant CS-LGC, which performs well on both overall classification accuracy and misclassification cost. Third, to address the unstable performance (for example, error accumulation) caused by imbalanced datasets, we refine CS-LGC into CSS-LGC, drawing on the idea of SMOTE. We also analyze how to select reasonable thresholds for our methods. Comparative experimental results verify the effectiveness of CS-LGC and CSS-LGC on cost-sensitive classification problems.
Keywords/Search Tags: Graph-based semi-supervised classification, cost-sensitive, LGC, rescale, SMOTE
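To make the idea in the abstract concrete, the following is a minimal sketch of standard LGC label propagation (Learning with Local and Global Consistency) with an optional cost-based rescaling of the initial label matrix. It is not the thesis's CS-LGC or CSS-LGC: the function name `lgc_scores`, the parameters `alpha`, `sigma`, and `class_cost`, and the choice to rescale the label matrix by a per-class cost are all assumptions made for illustration, and the SMOTE-style step and threshold selection are not shown.

```python
import numpy as np


def lgc_scores(X, y, n_classes, alpha=0.99, sigma=1.0, class_cost=None):
    """Closed-form LGC scores F = (I - alpha * S)^{-1} Y.

    X          : (n, d) feature matrix.
    y          : (n,) integer labels, with -1 marking unlabeled points.
    class_cost : optional (n_classes,) weights; a larger weight stands for a
                 class that is costlier to miss (hypothetical rescaling,
                 not necessarily the scheme used in the thesis).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = X.shape[0]

    # Gaussian affinity matrix with a zero diagonal.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalisation S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Initial label matrix: one-hot rows for labeled points, zeros otherwise.
    Y = np.zeros((n, n_classes))
    idx = np.flatnonzero(y >= 0)
    Y[idx, y[idx]] = 1.0

    # Assumed cost-sensitive step: rescale each class column by its cost.
    if class_cost is not None:
        Y = Y * np.asarray(class_cost, dtype=float)[None, :]

    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(np.eye(n) - alpha * S, Y)


# Toy usage: two well-separated 1-D clusters with one labeled point each.
X_demo = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
y_demo = np.array([0, -1, -1, 1, -1, -1])
F_demo = lgc_scores(X_demo, y_demo, n_classes=2, class_cost=[1.0, 5.0])
pred_demo = F_demo.argmax(axis=1)  # predicted class per point
```

Because the propagation is linear in the label matrix, rescaling its columns by class cost is equivalent to rescaling the columns of the final score matrix. It therefore only changes predictions for points whose class scores are close, which is one simple way to trade some overall accuracy for a lower misclassification cost.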