Research And Implementation On Online Reviews Of Product Based On The Dependency

Posted on: 2014-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y Yan
Full Text: PDF
GTID: 2248330395481085
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of the Internet, more and more people research the products they want to buy and make their choices online. Merchants, for their part, want to list the advantages and disadvantages of each product for the convenience of customers, and manufacturers can draw customer feedback and suggestions from product reviews to improve their products and services. Because reviews are numerous and varied in form, reading them manually is time-consuming and cannot keep up with new content. Automatic mining of online product reviews is therefore necessary, and it has become an active research topic in information processing.

Generally, mining online product reviews involves four tasks: extracting the product features mentioned in the reviews, extracting the opinions that describe those features, determining the sentiment polarity of each opinion, and ranking the opinions by importance.

In product reviews, a product feature is often described with several words that narrow down which attribute or component is meant. Features extracted by traditional mining methods contain only the attribute or component itself and ignore these qualifying relationships. In this paper, we use grammatical dependency to extract the product attribute together with the words that qualify it, and compose them into an integrated product feature. We also recast traditional review extraction as a sequence labeling problem and propose an approach to integrated product feature extraction that combines Conditional Random Fields (CRFs) with grammatical dependency. After extracting the features, we use the dependency relations to extract the opinion words corresponding to each feature, and finally determine the polarity of each opinion using HowNet.
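As a rough illustration of composing an integrated product feature from dependency relations, the following minimal Python sketch walks a hand-written dependency parse. The tokens, the arc list, and the relation labels `mod`, `subj`, `cop`, and `adv` are all hypothetical placeholders chosen for this example; the abstract does not specify the parser or the label set used, and a real system would obtain the arcs from a dependency parser.

```python
from collections import defaultdict

# Hypothetical, hand-written parse of "phone screen resolution is very high".
# Each arc is (head_index, dependent_index, relation); labels are illustrative.
TOKENS = ["phone", "screen", "resolution", "is", "very", "high"]
ARCS = [
    (2, 1, "mod"),   # "screen" qualifies "resolution"
    (1, 0, "mod"),   # "phone" qualifies "screen"
    (5, 2, "subj"),  # "resolution" is the subject of "high"
    (5, 3, "cop"),
    (5, 4, "adv"),
]


def integrated_feature(tokens, arcs, head, modifier_rels=("mod",)):
    """Collect the head word plus all words that transitively qualify it."""
    children = defaultdict(list)
    for h, d, rel in arcs:
        if rel in modifier_rels:
            children[h].append(d)

    indices, stack = set(), [head]
    while stack:
        node = stack.pop()
        if node in indices:
            continue
        indices.add(node)
        stack.extend(children[node])

    # Emit the words in their original sentence order.
    return " ".join(tokens[i] for i in sorted(indices))


def opinion_for(tokens, arcs, head, subject_rel="subj"):
    """Return the word that governs the feature head through a subject arc."""
    for h, d, rel in arcs:
        if d == head and rel == subject_rel:
            return tokens[h]
    return None


feature = integrated_feature(TOKENS, ARCS, head=2)
opinion = opinion_for(TOKENS, ARCS, head=2)
print(feature, "->", opinion)
```

Here the attribute "resolution" is expanded with its qualifying words into "phone screen resolution", and the opinion word "high" is found through the subject arc. Which relations count as qualifying is an assumption of this sketch.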
Experiments on online product reviews yield good precision and recall. The work done in this paper is as follows.

Firstly, we survey existing algorithms for mining product reviews and recast the traditional review mining problem as a sequence labeling problem. We label the words of user reviews with F for a product feature, O for a product opinion, and B for any other word, so that each review sentence can be rewritten as a sequence of the three labels F, O, and B. In recent years, conditional random fields (CRFs) have been widely used in sequence labeling and achieve good results.

Secondly, traditional feature recognition uses either rule-based or statistical algorithms. The former can achieve higher accuracy, but it depends on word order. In this paper, we instead combine grammatical dependency with conditional random fields to label sentences. After crawling reviews from the Internet, we conduct experiments to mine the useful information. The results show that the approach achieves higher precision and recall and is domain-independent.

Thirdly, a product feature is often paired with a product opinion, so we use grammatical dependency to extract the sentiment words corresponding to each product feature word. We then determine the polarity of these sentiment words using a Chinese polarity dictionary such as HowNet, and further describe the method for classifying product opinions.

Finally, we design and implement a polarity recognition system for product reviews on the .NET 4.0 platform. In the data collection phase, we gather reviews from Jingdong Mall and 51buy, then segment the text, tag parts of speech, and obtain the dependency relations using existing text processing tools. The CRFs method then produces the labeling results automatically. By analyzing these results in detail, we obtain the desired product features and opinions. The system can show users the extent of a product's advantages and disadvantages.
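The F/O/B labeling scheme described in the abstract can be illustrated with a short Python sketch that decodes an already-labeled sentence into feature and opinion phrases and looks up a polarity. Several parts are assumptions made for illustration: the labels are written by hand rather than predicted by a CRF, `TOY_POLARITY` is a two-entry stand-in for a polarity dictionary such as HowNet, and each feature is paired with the first opinion phrase that follows it, a simplification of the dependency-based pairing the abstract describes.

```python
from itertools import groupby

# Hypothetical stand-in for a polarity dictionary: +1 positive, -1 negative.
TOY_POLARITY = {"clear": 1, "short": -1}


def spans(tokens, labels, target):
    """Return (start_index, phrase) for each maximal run of `target` labels."""
    found = []
    position = 0
    for label, group in groupby(zip(labels, tokens), key=lambda pair: pair[0]):
        words = [token for _, token in group]
        if label == target:
            found.append((position, " ".join(words)))
        position += len(words)
    return found


def pair_and_score(tokens, labels, lexicon=TOY_POLARITY):
    """Pair each feature with the first opinion after it and score polarity."""
    features = spans(tokens, labels, "F")
    opinions = spans(tokens, labels, "O")
    results = []
    for f_pos, feature in features:
        following = [o for o in opinions if o[0] > f_pos]
        if not following:
            continue  # no opinion follows this feature in the sentence
        _, opinion = following[0]
        # Words missing from the lexicon are treated as neutral (0).
        results.append((feature, opinion, lexicon.get(opinion, 0)))
    return results


# Hand-labeled example: F = feature, O = opinion, B = other.
tokens = ["the", "screen", "is", "clear", "but", "battery", "life", "is", "short"]
labels = ["B", "F", "B", "O", "B", "F", "F", "B", "O"]

for feature, opinion, polarity in pair_and_score(tokens, labels):
    print(feature, opinion, polarity)
```

Adjacent F labels are merged into a single phrase, so "battery life" is recovered as one feature rather than two separate words.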
Keywords/Search Tags: Online reviews, Grammatical dependency, CRFs, Polarity recognition, Data mining