Neighbor Correlation Features Based Multi-label Classification Algorithm

Posted on: 2019-07-29
Degree: Master
Type: Thesis
Country: China
Candidate: J F Fang
Full Text: PDF
GTID: 2428330563493326
Subject: Computer technology
Abstract/Summary:
With the arrival of big data, large amounts of multi-label data are being generated in every area of real life. Accurately obtaining sample labels helps improve the hit rate of text retrieval, image retrieval, and object recognition. As data volumes grow explosively, extracting valuable information by hand becomes increasingly difficult, and automatically labeling samples through machine learning has become the main direction. Traditional multi-label classification methods learn a mapping between sample features and labels, and use that mapping to predict the labels of unseen samples without considering the relationships between labels. Label relationships, however, provide useful information beyond the basic features: if data sparsity causes low accuracy for one label, its accuracy can be improved through its correlation with labels that are predicted more accurately. Taking label correlation into account can therefore improve the accuracy of the algorithm.

This thesis summarizes the limitations of existing approaches to incorporating label correlation and proposes a new method, named MLNB, for extracting it. To address the problems of existing neighbor-based methods, it designs a new way of obtaining label correlation from the label information of neighboring samples. The main idea is that, in practice, the labels of locally similar samples are correlated. Paired occurrences of labels in local samples are treated as features: similar neighboring samples are found, label-pair co-occurrences are mined from the label sets of these small clusters of similar samples as label correlation features, and a confidence score for each label is then computed from these features. The classification results obtained from the samples' actual features and from the label correlation features are combined to predict multiple labels. The method thus considers local label dependencies, extracts label correlation features from neighbor instances, corrects the correlation-based predictions according to the reliability of the neighbor instances, and combines neighbor label correlation features with sample features to improve classification accuracy. Finally, the additional time complexity introduced by the label correlation features is analyzed.

The algorithm was tested on benchmark datasets from different fields under different parameter settings, and its accuracy with and without the neighbor correlation features was compared and analyzed. The results show that, after adding the neighbor correlation features, Ranking Loss and Average Precision improved by up to 30%, Coverage improved by up to 60%, and F1-macro and F1-micro both improved by up to 11%. On these evaluation metrics, MLNB also outperformed three multi-label algorithms: BR, CC, and RAKEL.
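The abstract does not give MLNB's exact formulas, so the following is only a minimal sketch of the general idea it describes: find the nearest training samples, count how often each label pair co-occurs among them, and blend a correlation-based score with a feature-based score. The function names, the Euclidean distance, the per-label averaging rule, and the blending weight `alpha` are all assumptions made for illustration and are not taken from the thesis.

```python
from itertools import combinations
from math import sqrt


def k_nearest(train_X, query, k):
    """Indices of the k training samples closest to `query` (Euclidean distance, assumed)."""
    def dist(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    return order[:k]


def pair_cooccurrence_features(train_Y, neighbor_ids, n_labels):
    """For each label pair (a, b) with a < b, the fraction of neighbors carrying both labels."""
    counts = {pair: 0 for pair in combinations(range(n_labels), 2)}
    for i in neighbor_ids:
        present = [l for l in range(n_labels) if train_Y[i][l] == 1]
        for pair in combinations(present, 2):
            counts[pair] += 1
    return {pair: c / len(neighbor_ids) for pair, c in counts.items()}


def correlation_scores(pair_features, n_labels):
    """Hypothetical per-label score: mean co-occurrence rate over all pairs containing the label."""
    scores = []
    for l in range(n_labels):
        rates = [v for pair, v in pair_features.items() if l in pair]
        scores.append(sum(rates) / len(rates) if rates else 0.0)
    return scores


def blend(base_scores, corr_scores, alpha=0.5):
    """Weighted combination of feature-based and correlation-based scores (alpha is assumed)."""
    return [alpha * b + (1 - alpha) * c for b, c in zip(base_scores, corr_scores)]


# Toy data: 4 training samples, 2 features, 3 labels.
train_X = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
train_Y = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1]]

neighbors = k_nearest(train_X, [0.0, 0.2], k=2)
features = pair_cooccurrence_features(train_Y, neighbors, n_labels=3)
corr = correlation_scores(features, n_labels=3)
# base_scores stands in for the output of any feature-based multi-label classifier.
final = blend([0.6, 0.4, 0.2], corr, alpha=0.5)
```

In this toy run the two nearest neighbors both carry labels 0 and 1, so the pair (0, 1) gets the highest co-occurrence rate and the blended score for label 1 rises above its feature-only score. In practice the correlation features would more likely feed a learned second-stage classifier than a fixed average.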
Keywords/Search Tags:Multi-label classification, Label correlation, Neighbor label correlation features