Research On Multi-label Classification Based On Decision Function

Posted on: 2019-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: M X Ding
Full Text: PDF
GTID: 2370330572452022
Subject: Applied Mathematics
Abstract/Summary:
In recent years, with the rapid development of computer technology and the growth of the information society, the volume of transaction data generated in daily life has exploded. How to extract useful information from this mass of data and classify it sensibly is one of the most important problems in the era of big data. In multi-label classification, each instance is assigned a set of labels; the problem is equivalent to finding a multi-valued decision function that maps each instance to a vector of binary class values, one per label. Binary relevance methods and classifier chains are the usual approaches. When a Bayesian network augmented naive Bayes classifier is used as the base model, both methods induce a polynomial decision function. This thesis studies multi-label classification further on the basis of this decision function. The main contributions are as follows.

First, to improve the classification accuracy of the decision function, a new feature weighting method, the probability feature weight, is proposed. It takes as the weight the ratio of a feature's frequency in the positive instances to its frequency in the negative instances. Because the weight is derived from the data itself, it adapts to changes in the data set. The probability feature weight is incorporated into the conditional probability estimates of the decision function, which has a fundamental positive effect on the classification results. Experiments on multi-label data sets show that the probability feature weight improves the classification quality of the decision function. In addition, to simplify decision functions and reduce computational complexity, a discriminant theorem for irrelevant variables of decision functions is proposed; it simplifies a decision function by identifying and eliminating its irrelevant variables.

Second, imbalanced data has also been an active research topic in recent years. This thesis studies multi-label imbalanced data classification based on the decision function and proposes a classification algorithm that evaluates cost and value. In multi-label classification, the imbalance ratio differs from label to label and reflects how important it is to classify the minority class correctly. According to the imbalance ratio, the algorithm evaluates the value of correctly classifying a minority-class instance and the cost of misclassifying a majority-class instance, so that the majority class pays a reasonable price while as many minority-class instances as possible are classified correctly. Experimental results show that the algorithm performs better on multi-label imbalanced data sets under evaluation criteria such as cost and value, the F1 measure, and recall.

Finally, the thesis summarizes the work and outlines plans for further study.
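
As a rough illustration of two quantities named in the abstract, the sketch below computes a frequency-ratio feature weight and a per-label imbalance ratio for binary features and labels. The abstract does not give the exact formulas, so everything here is an assumption: the function names `feature_weight` and `imbalance_ratio`, the Laplace smoothing term `alpha`, the majority-over-minority definition of the imbalance ratio, and the toy data are all hypothetical and may differ from the thesis.

```python
def feature_weight(X, y, j, alpha=1.0):
    """Ratio of the smoothed frequency of binary feature j among positive
    instances to its smoothed frequency among negative instances.
    Assumed form; the thesis may define the weight differently."""
    pos = [x[j] for x, t in zip(X, y) if t == 1]
    neg = [x[j] for x, t in zip(X, y) if t == 0]
    # Laplace smoothing (an assumption) avoids division by zero.
    p_pos = (sum(pos) + alpha) / (len(pos) + 2 * alpha)
    p_neg = (sum(neg) + alpha) / (len(neg) + 2 * alpha)
    return p_pos / p_neg


def imbalance_ratio(y):
    """Majority-class count divided by minority-class count for one label
    (a common definition, assumed here)."""
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    minority = min(n_pos, n_neg)
    if minority == 0:
        return float("inf")
    return max(n_pos, n_neg) / minority


# Hypothetical toy data: 6 instances, 2 binary features, 2 labels.
X = [[1, 0], [1, 1], [0, 1], [0, 0], [1, 0], [0, 1]]
Y = [[1, 0], [1, 0], [0, 0], [0, 1], [1, 0], [0, 0]]

# Treat each label independently, as in binary relevance.
for k in range(2):
    y_k = [row[k] for row in Y]
    weights = [feature_weight(X, y_k, j) for j in range(2)]
    print(f"label {k}: imbalance ratio = {imbalance_ratio(y_k)}, weights = {weights}")
```

A weight above 1 marks a feature that is more frequent among positive instances of the label, and a weight below 1 marks one that is more frequent among negative instances. A label with a larger imbalance ratio is one where, under the cost-and-value idea described above, correct minority-class predictions would presumably be valued more highly.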
Keywords/Search Tags:Decision function, Bayesian network, Probability feature weight, Unbalanced data, Multi-label classification