Research Of Internet Review Opinion Extraction Methods

Posted on: 2020-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: H Shen
GTID: 2428330575955056
Subject: Biomedical engineering
Abstract/Summary:
In the Internet era, people carry out more and more activities online, such as shopping, ordering take-out, and booking hotels. Alibaba, the largest e-commerce website, displays countless reviews from buyers. Meituan, the biggest online life-service company, likewise holds a large number of take-out and food reviews, and Ctrip, the largest OTA (Online Travel Agency) company in China, has innumerable reviews of hotels and travel experiences. Data has become the most valuable asset of every Internet enterprise. According to a 2016 report from CNNIC (China Internet Network Information Center), 71.1% of users regard product reviews as the leading factor in their purchase decisions. Such abundant and varied review data is also excellent raw material for analyzing the value of goods and the quality of providers. However, the sophisticated sentence patterns and frequent disputes in raw review data, together with its sheer volume, make direct analysis impossible. This calls for methods that transform the data into three-element tuples, i.e., (opinion aspect, opinion word, sentiment orientation). Structured data of this kind is convenient to handle and analyze directly.

This thesis studies two problems: the extraction of opinion aspects and opinion words, and the classification of sentiment orientation. The main contributions are listed below:

1) Constructing two opinion extraction modules, a hard match and a soft match. The hard match searches the review text against a seed dictionary and succeeds if any item in the dictionary is found in the sentence. The soft match generates an augmented dictionary with the word2vec algorithm. When an item of this augmented dictionary is found in the review, it is normalized to its corresponding seed in the hard-match dictionary. The essence of the soft match is the similarity of word vectors.

2) Building the third opinion extraction module with a conditional random field. If nothing is extracted by the first and second modules, the third module labels review sentences with the BIO scheme, using a conditional random field model built on a stacked bidirectional long short-term memory (LSTM) network. This module extracts opinion aspects and opinion words that appear in the review sentence but in neither of the two dictionaries.

3) Forming the fourth extraction module with pointwise ranking. If the three preceding modules still extract no opinion aspect or opinion word, this module uses a ranking system to find the most relevant latent opinion aspects and opinion words for the review sentence. The ranking system also uses a stacked bidirectional LSTM network to learn the representation of the review sentence.

4) Classifying sentiment orientation. The sentiment orientation expressed in an Internet review sentence gives important feedback to the providers of services and products. After opinion extraction, this thesis learns a word vector for each word in the reviews, builds features with a deep learning network, and finally outputs a distribution over sentiment types after a fully connected layer.

To validate these Internet review opinion extraction methods, this thesis selects three test data sets of different types: hotels, beauty supplies, and travel. The test results show that the methods are feasible for Internet review opinion extraction.
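The hard-match and soft-match modules described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the seed words, the three-dimensional toy vectors (standing in for trained word2vec embeddings), the 0.95 similarity threshold, and the function names `build_augmented_dict` and `extract_aspects` are all assumptions made for illustration.

```python
import math
import re

# Hypothetical seed dictionary for the hard match.
SEEDS = {"breakfast", "bed", "staff"}

# Made-up toy vectors standing in for trained word2vec embeddings.
TOY_VECTORS = {
    "breakfast": (0.9, 0.1, 0.0),
    "brunch": (0.85, 0.15, 0.05),
    "bed": (0.1, 0.9, 0.0),
    "mattress": (0.15, 0.85, 0.1),
    "staff": (0.0, 0.1, 0.9),
    "receptionist": (0.05, 0.2, 0.85),
    "weather": (0.6, 0.6, 0.6),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_augmented_dict(seeds, vectors, threshold=0.95):
    """Map each non-seed word to its most similar seed, if similar enough."""
    known_seeds = [s for s in seeds if s in vectors]
    augmented = {}
    for word, vec in vectors.items():
        if word in seeds or not known_seeds:
            continue
        best = max(known_seeds, key=lambda s: cosine(vec, vectors[s]))
        if cosine(vec, vectors[best]) >= threshold:
            augmented[word] = best
    return augmented


def extract_aspects(sentence, seeds, augmented):
    """Return (token, normalized seed, module) triples.

    The hard match is tried first; the soft match runs only if it finds nothing.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    hard = [(t, t, "hard") for t in tokens if t in seeds]
    if hard:
        return hard
    return [(t, augmented[t], "soft") for t in tokens if t in augmented]


augmented = build_augmented_dict(SEEDS, TOY_VECTORS)
print(extract_aspects("The bed was soft.", SEEDS, augmented))
print(extract_aspects("Great brunch!", SEEDS, augmented))
print(extract_aspects("Nice weather today", SEEDS, augmented))
```

In this sketch a sentence that matches neither dictionary yields an empty list, which is the case the abstract hands on to the conditional random field and pointwise ranking modules.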
Keywords/Search Tags:Natural Language Processing, Opinion Mining, Sentiment Analysis, Conditional random field, Pointwise ranking