Contributions To Several Issues Of Multi-Label Learning

Posted on:2012-06-20

Degree:Master

Type:Thesis

Country:China

Candidate:J Huang

Full Text:PDF

GTID:2178330335990379

Subject:Computer application technology

Abstract/Summary:

Multi-label learning has recently become one of the active research topics in machine learning and data mining. It is widely used in text categorization, webpage classification, semantic scene classification, and the classification of gene functions in bioinformatics, so research on multi-label learning has both practical significance and application value. Researchers have proposed many efficient methods, but many problems remain worth studying.

Consider a data set in which each sample is represented by a single instance and carries only one label, yet some samples share the same instance while carrying different labels. Such a classification problem also belongs to single-instance multi-label (SIML) learning. A traditional classifier predicts exactly one of the labels to which such an instance belongs, so under the traditional accuracy measure the results for some of these samples are counted as incorrect. In fact, the predicted label is a member of the label set to which the multi-label instance simultaneously belongs, so the classification is correct; current accuracy evaluation methods do not take this into account. Moreover, in real applications the number of samples for each label of a multi-label instance is often imbalanced. Different predictions by the classifier then reflect different levels of performance, but the current accuracy criterion cannot effectively distinguish between classifiers. This thesis studies three aspects of this problem.

Some existing multi-label learning algorithms first learn a real-valued function that reflects the degree to which a sample belongs to a category, and then set a minimum threshold to decide whether the sample belongs to that category. If the threshold is set too high, some of the true labels are missed; if it is set too low, the classifier predicts extra labels. Because the distribution of each category is different, a single minimum threshold for all categories is inappropriate. In this thesis, a minimum threshold is set for each label according to that label's distribution, and a label is predicted only if the value of the real-valued function exceeds the threshold set for that label.
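The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's own method, which the abstract does not specify. The function names (`strict_accuracy`, `membership_accuracy`, `fit_label_thresholds`, `predict_labels`) are hypothetical, and choosing each label's threshold by maximizing F1 on training scores is an assumption made only for illustration.

```python
from typing import Dict, List, Sequence, Set


def strict_accuracy(predicted: Sequence[str], recorded: Sequence[str]) -> float:
    """Traditional accuracy: a hit only if the prediction equals the recorded label."""
    return sum(p == r for p, r in zip(predicted, recorded)) / len(recorded)


def membership_accuracy(predicted: Sequence[str],
                        label_sets: Sequence[Set[str]]) -> float:
    """Assumed alternative: a hit if the prediction is in the instance's label set."""
    return sum(p in s for p, s in zip(predicted, label_sets)) / len(label_sets)


def _f1(scores: List[float], truth: List[bool], threshold: float) -> float:
    """F1 of the rule 'predict the label when score > threshold'."""
    tp = sum(s > threshold and t for s, t in zip(scores, truth))
    fp = sum(s > threshold and not t for s, t in zip(scores, truth))
    fn = sum(s <= threshold and t for s, t in zip(scores, truth))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def fit_label_thresholds(scores: Dict[str, List[float]],
                         truth: Dict[str, List[bool]]) -> Dict[str, float]:
    """Pick one threshold per label (assumption: the one with the best training F1)."""
    thresholds = {}
    for label, label_scores in scores.items():
        distinct = sorted(set(label_scores))
        # Candidates: one value below every score, then midpoints between scores.
        candidates = [distinct[0] - 1.0]
        candidates += [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
        thresholds[label] = max(
            candidates, key=lambda t: _f1(label_scores, truth[label], t)
        )
    return thresholds


def predict_labels(sample_scores: Dict[str, float],
                   thresholds: Dict[str, float]) -> Set[str]:
    """Predict a label only if its score exceeds that label's own threshold."""
    return {label for label, s in sample_scores.items() if s > thresholds[label]}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    predicted = ["a", "b", "c"]
    recorded = ["a", "a", "b"]
    label_sets = [{"a"}, {"a", "b"}, {"a", "b"}]
    print(strict_accuracy(predicted, recorded))
    print(membership_accuracy(predicted, label_sets))

    scores = {"sports": [0.9, 0.8, 0.3, 0.2], "politics": [0.4, 0.1, 0.35, 0.05]}
    truth = {"sports": [True, True, False, False],
             "politics": [True, False, True, False]}
    per_label = fit_label_thresholds(scores, truth)
    sample = {"sports": 0.5, "politics": 0.3}
    print(predict_labels(sample, per_label))
    print(predict_labels(sample, {"sports": 0.45, "politics": 0.45}))
```

In this toy data the "politics" scores sit on a lower scale than the "sports" scores, so a single shared threshold and per-label thresholds give different label sets for the same sample. In the accuracy example, the second sample's prediction is a miss under the strict measure even though it is a member of that instance's label set.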

Keywords/Search Tags:

Machine Learning, Data Mining, Multi Label Learning, Single Label Learning, Classification, Accuracy Evaluation, Threshold Determination, Class Imbalance

Related items

1	Study Of Multi-label Class Imbalance Classification Based On Extreme Learning Machine
2	Multi-label Learning Based On Label Weight And Weighted Kernel Extreme Learning Machine
3	Class-imbalance Issue In Applying Multi-label Learning To The Study Of Parkinson In Traditional Chinese Medicine Diagnosis
4	Imbalanced Multi-label Learning Algorithm Based On Density Label Space
5	Study On Class Imbalance Problem In Multi-Label Image Classification
6	Research On The Utilization Techniques Of Partial Label Data
7	Research On Multi-label Learning Algorithm For Entity Information Mining
8	Research On Machine Learning Algorithms For Data With Multiple Annotations
9	New Label Learning For Multi-Label Image Classification
10	Multi-label Learning Algorithm And Its Application In Product Evaluation And Scoring