Research On Classification Algorithms Based On Enhanced Semantics And Random Walks

Posted on: 2012-12-07 | Degree: Master | Type: Thesis
Country: China | Candidate: W Zheng | Full Text: PDF
GTID: 2248330362468182 | Subject: Software engineering
Abstract/Summary:
With the rapid growth of the Internet, the data classification problem has attracted increasing attention in recent years. Different types of data may require different classification methods. In this thesis, we propose a classification method enhanced by semantic information for review mining tasks, and a random walk based classification method for multi-label mining problems. The main contributions of this thesis are as follows.

For the review classification problem, we develop a solution called SeMep, in which both the contents of the reviews and the semantic information related to the objects being reviewed are used to improve the prediction output. We also develop a heuristic algorithm that builds the classifiers from semantic information. SeMep evaluates the diversity of the classifiers and uses it to combine the outputs of multiple classifiers. We then introduce an optional rule-based semantic postprocessing step that adjusts the predicted results for some classes.

For multi-label problems, we present a ranking and classification method based on the random walk model, called MLRW. MLRW maps the multi-labeled instances into graphs, on which the random walks are carried out. For unlabeled instances, MLRW predicts the probability of each label with a conditional probability model. By transforming the original multi-label problem into a set of single-label subproblems, it predicts whether each label is present.

We design and implement prototype systems of SeMep and MLRW based on Weka and carry out extensive experiments. Experimental results show that SeMep predicts the content of music reviews effectively and efficiently. Experiments on different types of public data sets compare MLRW with many state-of-the-art multi-label methods. Experimental results on different kinds of applications show that MLRW is more effective and efficient than these methods.
Keywords/Search Tags: enhanced semantic, classification, multi-label, random walk
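
The abstract describes MLRW only at a high level, so the following is an illustrative sketch of the general idea (a random walk over a graph of labeled instances, followed by a per-label decision) and not the thesis's actual algorithm. Everything specific here is an assumption made for the example: the cosine-similarity graph, the random walk with restart, the `restart` and `threshold` values, and the function names `label_scores` and `predict_labels`.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def normalize(v):
    """Scale a non-negative vector to sum to 1 (uniform if it is all zeros)."""
    s = sum(v)
    return [x / s for x in v] if s else [1.0 / len(v)] * len(v)

def label_scores(train_x, train_y, query, restart=0.3, iters=50):
    """Score each label for `query` with a random walk with restart.

    Assumed design: nodes are training instances, edge weights are
    cosine similarities, and the walk restarts at nodes in proportion
    to their similarity to the query.
    """
    n = len(train_x)
    # Row-stochastic transition matrix built from pairwise similarity.
    trans = [
        normalize([cosine(train_x[i], train_x[j]) if i != j else 0.0
                   for j in range(n)])
        for i in range(n)
    ]
    start = normalize([cosine(query, x) for x in train_x])
    p = start[:]
    for _ in range(iters):
        p = [
            restart * start[j]
            + (1 - restart) * sum(p[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    # A label's score is the visiting probability of the nodes carrying it.
    scores = {}
    for prob, labels in zip(p, train_y):
        for label in labels:
            scores[label] = scores.get(label, 0.0) + prob
    return scores

def predict_labels(scores, threshold=0.5):
    """Treat each label as its own yes/no subproblem via a score cutoff."""
    return {label for label, s in scores.items() if s >= threshold}

if __name__ == "__main__":
    # Toy data, invented for illustration only.
    train_x = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    train_y = [{"rock"}, {"rock", "live"}, {"jazz"}, {"jazz"}]
    scores = label_scores(train_x, train_y, [1.0, 0.05])
    print(sorted(scores.items(), key=lambda kv: -kv[1]))
    print(predict_labels(scores))
```

Sorting the scores gives a label ranking, and the per-label threshold gives the yes/no decision for each label. A fixed cutoff is the simplest choice for a sketch; the abstract's conditional probability model is not specified in enough detail to reproduce here.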