Opinion Target Extraction In Chinese Microblog Posts Based On Candidate Clustering

Posted on: 2018-06-15
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhang
Full Text: PDF
GTID: 2348330536488236
Subject: Engineering
Abstract/Summary:
Opinion target extraction aims to find the object about which an opinion is expressed. It is the most basic task in sentiment analysis. At present, most research focuses on opinion target extraction in product reviews. Traditional approaches identify the opinion target through the dependency relation between an opinion word and the target. Microblog text, however, is colloquial and flexible in expression and often contains no opinion words, so traditional approaches that rely on opinion words are hard to apply to microblog posts. In this thesis, after analyzing the linguistic characteristics of Chinese microblog posts, we extract opinion targets by mining the relations between sentences.

First, this thesis proposes an unsupervised opinion target extraction method based on candidate clustering. Because microblog posts are organized around topics, a word clustering method is used to cluster opinion target candidates and to find the aspects discussed within a hot topic. Building on the clustering result and on the idea that similar sentences usually express opinions on similar objects in the same category, we further classify sentences to find similar ones. Combining the similarity of candidate words with the similarity of sentences, a similarity-based iterative algorithm is used to extract opinion targets from opinionated sentences. The experimental results show that, compared with the best results of other methods (ULP), the candidate-clustering method improves the F-score by about 7 points under strict evaluation and about 4 points under soft evaluation.

Second, to filter out microblog posts that carry no opinion information, this thesis proposes a supervised opinionated sentence extraction method. After analyzing the differences between opinionated and non-opinionated sentences in Chinese microblog posts, and drawing on features that worked well in existing research as well as on the linguistic features of microblog posts, it chooses opinion words, unigrams, part of speech, adverbs, verbs, mood words and punctuation as features. The chi-square (CHI) statistic is used for feature selection to determine the optimal feature dimension and feature combination, and an SVM is then used to extract the opinionated sentences. The experimental results show that part of speech, opinion words, adverbs, mood words and punctuation help to identify opinionated sentences in Chinese microblog posts.

Finally, this thesis combines the two components above into an opinion target extraction system for Chinese microblog posts. The proposed opinionated sentence extraction method first filters out non-comment text, and opinion targets are then extracted from the identified opinionated sentences. The experimental results show that, compared with the sixteen teams that participated in the opinion target extraction task in CMSAE, the system improves the F-score by 4% under strict evaluation and 3% under soft evaluation.
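
The abstract does not spell out how the chi-square feature selection step is computed. As a rough illustration only, the sketch below assumes the standard 2x2 contingency form commonly used in text categorization, where each feature is scored by how strongly its presence in a sentence is associated with the opinionated / non-opinionated label. The function names `chi_square_scores` and `top_k_features` and the toy English tokens are hypothetical and not taken from the thesis; the SVM classification step is left out.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Score each feature by the chi-square statistic of a 2x2 table.

    docs   : list of token lists (one per sentence); an assumed input format
    labels : list of 0/1 flags, 1 = opinionated, 0 = non-opinionated
    Returns a dict {feature: chi-square score}.
    """
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos

    # Document frequency of each feature within each class.
    pos_df, neg_df = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        (pos_df if label == 1 else neg_df).update(set(tokens))

    scores = {}
    for feat in set(pos_df) | set(neg_df):
        a = pos_df[feat]   # opinionated sentences containing the feature
        b = neg_df[feat]   # non-opinionated sentences containing it
        c = n_pos - a      # opinionated sentences without it
        d = n_neg - b      # non-opinionated sentences without it
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        scores[feat] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def top_k_features(docs, labels, k):
    """Keep the k highest-scoring features (ties broken alphabetically)."""
    scores = chi_square_scores(docs, labels)
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]


# Toy data, made up for illustration only.
docs = [
    ["great", "phone", "!"],
    ["love", "it", "!"],
    ["phone", "released", "today"],
    ["it", "ships", "today"],
]
labels = [1, 1, 0, 0]
selected = top_k_features(docs, labels, 2)
```

Varying `k` and retraining the classifier on the selected features is one way to search for the "optimal feature dimension" that the abstract mentions. Note that the statistic ranks features associated with either class, so a feature typical of non-opinionated sentences can score as high as one typical of opinionated sentences.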
Keywords/Search Tags:opinion target, candidate clustering, sentence classification, similarity calculation, opinionated sentences