Research and Application of Opinion-Mining-Based Sentiment Analysis for Online Product Reviews

Posted on: 2013-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Fan
GTID: 2218330371455859
Subject: Computer application technology
Abstract/Summary:
The rapid development of the Internet has made online shopping increasingly popular and has greatly changed patterns of consumption. People's feelings about goods and about the shopping process now spread not only by word of mouth but also through online reviews, so these reviews matter to both consumers and producers. This paper analyzes online reviews and extracts people's attitudes and emotions toward goods, in order to help consumers choose products and help producers improve product quality.

Generally, opinion-mining-based sentiment analysis treats a text or sentence as a collection of words, phrases or patterns. By calculating the sentiment of those words, phrases or patterns, the sentiment value of the text or sentence can be obtained. Sentiment analysis has four steps: data collection, text preprocessing, sentiment identification and presentation of results.

The paper studied existing sentiment analysis methods in depth and crawled online reviews from Jingdong Mall. Through data analysis and statistics, it summarized the characteristics of the data and then presented an algorithm for the extraction and merging of POS patterns (POSEM). With this algorithm, the effective POS patterns were extracted from the training data set. Based on the characteristics of the POS patterns, the paper designed pattern-matching rules and finally extracted the title words and opinion words from the test set, from which the sentiment of the online reviews was obtained. The experiment showed that the proposed method achieved comparatively high precision and recall.

The work in this paper is as follows:

1. Building on a theoretical study of existing text sentiment analysis, this paper conducted in-depth analysis and statistics and summed up the characteristics relevant to sentiment analysis. In review sentences, adjectives contribute the most, with a ratio of 86.87% of the total; nouns and adverbs follow, with ratios of 71.64% and 70.79% respectively; other parts of speech, such as verbs and prepositions, also play an important role.

2. Based on the analysis of the product review data, this paper designed the algorithm for the extraction and merging of POS patterns (POSEM). The algorithm marks each POS pattern with "POS\T\O" and sets a length threshold, a frequency threshold, and upper and lower probability thresholds according to the length, count and probability of the POS patterns. POS patterns that meet the lower probability threshold are used to negate subjectivity. The extraction step keeps the POS patterns that meet all of the thresholds. Patterns that meet only the length threshold and the probability threshold are merged so that they can meet all of the thresholds. This design can improve the recall of sentiment analysis to some extent.

3. Based on the analysis of the POS patterns extracted by the POSEM algorithm, the paper designed pattern-matching rules with which the center words and opinion words are extracted from the test set. The words extracted with high precision are then used to identify the remaining ones. The experimental results showed that the proposed method reached high precision and recall.

4. A generic framework was designed and implemented in which components can be replaced flexibly to meet the needs of different experiments. In the preprocessing module, the system defines a uniform format for POS tagging; when a different word segmentation tool is swapped in, the system only needs to convert that tool's POS format to the uniform one. In the analysis module, the training, testing and application components can be replaced easily. Based on this framework and on open-source tools, the paper designed and implemented a prototype experiment platform for text analysis. The system integrates a data collection module, a text preprocessing module, a text sentiment analysis module and a result presentation module.
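
To illustrate the threshold-based selection of POS patterns described in point 2, here is a minimal Python sketch. It is not the thesis's POSEM implementation: the abstract does not give the threshold values, the exact definition of a pattern's probability, or the merge procedure. The function name `select_patterns`, the parameters `max_len`, `min_freq`, `upper_prob` and `lower_prob`, the POS tags and the toy corpus are therefore all assumptions, and the merge step is left out.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Set, Tuple

Pattern = Tuple[str, ...]


def ngrams(tags: Sequence[str], max_len: int) -> Set[Pattern]:
    """Distinct contiguous POS n-grams of length 1..max_len in one sentence."""
    return {
        tuple(tags[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(tags) - n + 1)
    }


def select_patterns(
    corpus: Iterable[Tuple[Sequence[str], bool]],
    max_len: int = 3,         # assumed length threshold
    min_freq: int = 2,        # assumed frequency threshold
    upper_prob: float = 0.8,  # assumed upper probability threshold
    lower_prob: float = 0.2,  # assumed lower probability threshold
) -> Tuple[List[Pattern], List[Pattern]]:
    """Split frequent POS patterns into subjective and non-subjective indicators.

    Here a pattern's probability is the share of sentences containing it
    that are labelled subjective (an assumption, not the thesis's definition).
    """
    total: Counter = Counter()
    subjective: Counter = Counter()
    for tags, is_subjective in corpus:
        for pattern in ngrams(tags, max_len):
            total[pattern] += 1
            if is_subjective:
                subjective[pattern] += 1

    accepted: List[Pattern] = []
    rejecting: List[Pattern] = []
    for pattern, count in total.items():
        if count < min_freq:
            continue  # too rare; the thesis would try to merge such patterns
        prob = subjective[pattern] / count
        if prob >= upper_prob:
            accepted.append(pattern)
        elif prob <= lower_prob:
            rejecting.append(pattern)
    return sorted(accepted), sorted(rejecting)


# Toy corpus: (POS tag sequence, sentence labelled subjective?).
# Tags are illustrative: n = noun, d = adverb, a = adjective, v = verb.
corpus = [
    (["n", "d", "a"], True),
    (["n", "d", "a"], True),
    (["n", "a"], True),
    (["v", "n"], False),
    (["v", "n"], False),
]

accepted, rejecting = select_patterns(corpus)
print("subjective patterns:", accepted)
print("non-subjective patterns:", rejecting)
```

In this toy run the pattern `("n", "a")` is dropped because it occurs only once, which is the kind of case the merge step in the thesis is meant to recover; `("n",)` is kept out of both lists because its probability falls between the two thresholds.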
Keywords/Search Tags: text sentiment analysis, POS patterns, online product reviews, opinion mining