
Application Of Apriori Algorithm And The Bayes Classifier In Multi-label Learning

Posted on: 2014-10-16
Degree: Master
Type: Thesis
Country: China
Candidate: X J Tang
Full Text: PDF
GTID: 2268330401979378
Subject: Computer application technology
Abstract/Summary:
Multi-label learning and its applications have recently become an active topic in machine learning and data mining. In multi-label learning, the training set consists of instances that are each associated with a set of labels, and the task is to predict the label sets of unseen instances by analyzing training instances with known label sets. Multi-label learning is widely used in text categorization, webpage classification, natural scene classification, and gene function classification in bioinformatics, so research on it has both practical significance and application value. Researchers have proposed many efficient methods, for example adaptations of neural networks, the ID3 algorithm, and the KNN method to the multi-label learning framework, but many problems remain worth studying.

This thesis mainly studies two topics: the application of the Apriori algorithm to multi-label classification, and the classification of multi-label data with the Naïve Bayes Classifier (NBC). The former uses the Apriori algorithm to find the relationships among labels. The latter extends the traditional naïve Bayes classifier to multi-label learning.

For the first topic, during the iterative generation of frequent item sets, compound labels with strong association are replaced by existing single labels. The ML-KNN algorithm is then used to classify the multi-label data. Finally, in the label prediction stage, the compound labels are filled in based on the relationships between labels. Experiments on the emotions data set show that this method is more effective than existing multi-label learning methods.

For the second topic, the training and testing procedures are adapted to the characteristics and evaluation criteria of the multi-label learning problem. Principal component analysis is introduced in data preprocessing to reduce the feature dimension of the multi-label data set, which reduces the experimental workload and improves classification accuracy. The method is then extended to the tree classifier TANC, which is based on mutual information and conditional mutual information measures. Finally, the adapted NBC is implemented on the MBNC experimental platform and applied to natural scene classification; the results show that it is effective.
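
The label-association step described in the abstract can be illustrated with a small Apriori-style search over label sets. This is only a minimal sketch and not the thesis implementation: the function name `frequent_label_sets`, the `min_count` threshold, and the toy label sets are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_label_sets(label_sets, min_count):
    """Return {frozenset(labels): count} for every label combination
    that occurs in at least `min_count` training instances.

    Hypothetical helper: names and threshold are illustrative only.
    """
    transactions = [frozenset(s) for s in label_sets]

    # Level 1: count single labels and keep the frequent ones.
    counts = Counter(label for t in transactions for label in t)
    current = {frozenset([label]): c
               for label, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-sets into candidate k-sets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset of a
        # candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Count step: keep candidates that meet the threshold.
        current = {}
        for c in candidates:
            count = sum(1 for t in transactions if c <= t)
            if count >= min_count:
                current[c] = count
        frequent.update(current)
        k += 1
    return frequent


# Toy label sets (assumed data, loosely styled after emotion labels).
toy_label_sets = [
    {"happy", "relaxing"},
    {"happy", "relaxing", "quiet"},
    {"sad", "quiet"},
    {"happy", "relaxing"},
    {"angry"},
]
frequent = frequent_label_sets(toy_label_sets, min_count=3)
for labels, count in frequent.items():
    print(sorted(labels), count)
```

Label combinations that survive the threshold, such as the pair of labels that co-occur in three of the five toy instances, are the kind of strongly associated compound labels the abstract refers to. How such a combination is replaced by a single label before ML-KNN is applied is not specified in the abstract and is not modeled here.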
Keywords/Search Tags: Multi-label learning, Data Mining, Association Rule, Naïve Bayes Classifier, machine learning