Font Size: a A A

Research On Multi-label Classification Based On Possibility Theory

Posted on:2019-05-26Degree:MasterType:Thesis
Country:ChinaCandidate:J BaiFull Text:PDF
GTID:2370330572952022Subject:Applied Mathematics
Abstract/Summary:PDF Full Text Request
With the rapid development of computer and network technology,there are more and more multi-label data sets in different fields such as text,pictures,audio.As an important data mining tool,classification technology aims to extract useful information from a large number of data and classify samples of unknown labels.At present,mature classification algorithms are mostly used to handle the complete data,but in some practical applications,due to the lack of expert knowledge and errors in instrumental measurement,imprecise data is widely available.In recent years,possibility theory has been used to deal with uncertain information.In this paper,the multi-label classification algorithm based on possibility theory is researched for imprecise data.At first,the paper presents a weighted algorithm,by measuring the degree of importance of each attribute variable and the contribution of different values in the classification to improve the existing naive possiblistic classifier.Similar to the role of information gain in probability theory,the non-specificity of gain is the basis for evaluating uncertain information in the possibility theory.Therefore,the measurement method in our paper is characterized by non-specificity of gain.Using MATLAB software to carry out the simulation experiment for eight data sets of UCI,the results show that the proposed weighted naive possiblistic classifiers have better classification performance on most data sets.Next,the paper uses the possibility theory to study the multi-label classification problem by combining the possiblistic classifier in the one-dimensional classification.We propose a binary relevance algorithm based on possibility,which transforms multi-label tasks into multiple one-dimensional classification tasks based on the independent assumption of label variables.In order to simplify the model and deal with the imprecise data,we select the naive possiblistic classifier as the base classifier in each one dimensional classification.Considered the correlation between label variables,the paper presents a simple possible chain classification algorithm for trees.This model classifies each label of the prediction sample by establishing the chain structure between labels,and the specific process is as follows:(1)using the non-specific gain to calculate the weight,the maximum weight spanning trees was established;(2)identifying a chain sequence between the labels,the label that corresponds to the maximum weight as the parent node;(3)according to the chain sequence,learning a possiblistic classifier for each label variable in turn,at the same time,the label value of the parent node of the class variable is added as an extension attribute to the attribute set of the training data;(4)the predicted values of each label are combined into the final prediction vector.In the experimental part,compared with the traditional classification algorithms,the paper verifies the validity of the proposed classification algorithm for the precise data and the robustness for the imprecise data.Finally,we summarize the research contents and achievements,and look forward to further development direction in the future.
Keywords/Search Tags:Multi-label classification, Possibility theory, Naive possibilistic classifier, Non-specificity of gain, Imprecise data
PDF Full Text Request
Related items