
Research on Unsupervised Methods for Aspect-Based Opinion Mining Based on LDA

Posted on: 2020-08-22
Degree: Master
Type: Thesis
Country: China
Candidate: J T Feng
Full Text: PDF
GTID: 2428330599958572
Subject: Computer technology
Abstract/Summary:
The rapid development of the mobile Internet and the popularity of smartphones make it easy for people to post comments anytime and anywhere. Social platforms such as Twitter and Weibo, and online shopping platforms such as Taobao and Amazon, let people comment on commodities across many domains. Effective analysis of these reviews can help manufacturers make sales and product development decisions, and can help consumers select products that meet their expectations. However, judging only the sentiment polarity of a review does not give the user enough information; the opinion target and the category of the review must be extracted first. Aspect-based opinion mining aims to extract the aspect-based terms and the aspect categories, and therefore has significant research value.

Reviews cover a wide variety of products, and labeling the required data is cumbersome: building a standardized annotated corpus of reviews for all products would take a lot of resources. Supervised methods that rely on annotated datasets are therefore difficult to apply to review domains that lack an annotated corpus. How to improve the performance of unsupervised models and make them adaptable across domains and languages is a topic worth studying.

Based on the LDA (Latent Dirichlet Allocation) topic model, this thesis proposes two unsupervised models for aspect-level opinion mining: SLDA (SentiWordNet WordNet-Latent Dirichlet Allocation) and HME-LDA (Hierarchical Clustering MaxEnt-Latent Dirichlet Allocation). A scheme that uses seed words as the subject words, together with an inverted index, is designed to make the results more readable. New variables are introduced into the LDA topic model to refine the topic classification, so that aspect-based terms and opinion words can be distinguished. To improve classification, the similarity between words and seed words is calculated in two ways and used to offset the fixed-value parameters of standard LDA. Comparison experiments with training sets of different sizes and different seed words were run on the SemEval2016 ABSA dataset and the Yelp dataset. The experiments show that the SLDA and HME-LDA models perform better on unlabeled training sets.
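The abstract mentions two building blocks that are easy to picture in code: an inverted index over reviews, and word-to-seed-word similarity used in place of a fixed-value LDA parameter. The following is a minimal sketch of those two ideas only, not the thesis's SLDA or HME-LDA models. All names (`build_inverted_index`, `toy_similarity`, `seeded_word_priors`, `base_beta`, `boost`) are hypothetical, and the character-trigram similarity is a stand-in assumption; the abstract does not say which two similarity measures the thesis actually uses.

```python
from collections import defaultdict


def build_inverted_index(reviews):
    """Map each token to the sorted list of review ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(reviews):
        for token in text.lower().split():
            index[token].add(doc_id)
    return {token: sorted(ids) for token, ids in index.items()}


def char_trigrams(word):
    """Character trigrams of a word, padded so short words still yield some."""
    padded = f"##{word}#"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def toy_similarity(a, b):
    """Jaccard overlap of character trigrams (assumption: a placeholder
    for whatever lexical similarity a real system would use)."""
    ta, tb = char_trigrams(a), char_trigrams(b)
    return len(ta & tb) / len(ta | tb)


def seeded_word_priors(vocab, seeds_by_aspect, base_beta=0.01, boost=1.0,
                       similarity=toy_similarity):
    """Build one topic-word prior per aspect.

    Instead of a single fixed beta for every word, each word gets
    base_beta plus a bonus proportional to its best similarity to any
    seed word of that aspect.
    """
    priors = {}
    for aspect, seeds in seeds_by_aspect.items():
        priors[aspect] = {
            word: base_beta + boost * max(similarity(word, s) for s in seeds)
            for word in vocab
        }
    return priors


# Hypothetical toy data, for illustration only.
reviews = [
    "the food was great",
    "service was slow",
    "great service and tasty food",
]
index = build_inverted_index(reviews)
priors = seeded_word_priors(
    vocab=sorted(index),
    seeds_by_aspect={"FOOD": ["food"], "SERVICE": ["service"]},
)
print(index["food"])              # review ids that mention "food"
print(priors["FOOD"]["food"])     # seed word gets the largest prior
print(priors["FOOD"]["service"])  # unrelated word stays at base_beta
```

In a sketch like this, the per-aspect priors would be handed to an LDA sampler as an asymmetric topic-word prior, so that each topic is pulled toward words resembling its seed words; the inverted index lets a reader jump from a topic word back to the reviews that contain it.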
Keywords/Search Tags:Opinion Mining, Aspect-Based-Term Extraction, Latent Dirichlet Allocation