
Research on Product Feature Extraction and Sentiment Analysis Based on Chinese Online Reviews

Posted on: 2017-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: L F Zhou
Full Text: PDF
GTID: 2348330491462607
Subject: Software engineering
Abstract/Summary:
With the development of Internet applications, online shopping is gradually becoming an important component of electronic commerce. Online reviews are a significant data asset of e-commerce sites: they are collections of text that express personal attitudes and opinions about products, and they hold great potential value for both online shoppers and businesses. Reading a massive volume of online reviews by manual effort alone is not feasible. Review mining technology offers an effective solution to this problem and has become a hot topic in academic research. Review mining mainly consists of feature extraction and sentiment analysis. The work of this thesis can be described as follows:

1) Build a repository of Chinese online reviews in the domain of electronic products. A customized crawler automatically collects and parses the HTML content of electronic product reviews from Jingdong and Taobao. We then apply our own review filtering criteria to filter and clean the raw review data, perform word segmentation, remove stop words, and store the word-frequency statistics in a database.

2) Propose an efficient secondary-pruning feature extraction algorithm for Chinese online reviews. To address the low precision and recall of traditional sequential pattern mining, we combine the traditional GSP algorithm with a statistical word-pair co-occurrence method to extract and prune features. The resulting feature set lays the foundation for the subsequent sentiment analysis.

3) Study dependency patterns. With the help of a semantic parser, we parse the reviews and obtain the frequency of the different patterns. We then construct seven common and useful dependency patterns that combine part-of-speech (POS) tags with semantic distance and punctuation to extract features and their associated opinions.

Finally, we build a feature-based classification model with 11 features, adopt SVM, logistic regression, and Bayesian algorithms as classifiers, and run multiple comparison experiments against a baseline model. After shuffling and ranking the features, we identify the five most effective features and report the corresponding results, which demonstrate the effectiveness and ease of use of the method.
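
The two pruning passes described in item 2 can be illustrated with a small Python sketch. This is not the thesis's algorithm: it omits GSP sequential pattern mining entirely and replaces it with a plain frequency threshold, followed by a co-occurrence check against opinion words. The stop-word list, the thresholds `min_freq` and `min_cooc`, the opinion-word set, and all function names are assumptions made for illustration, and English tokens stand in for already-segmented Chinese words.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; a real one would be far larger.
STOP_WORDS = {"the", "is", "a", "very", "and"}


def clean(tokens):
    """Drop stop words from one segmented review."""
    return [t for t in tokens if t not in STOP_WORDS]


def word_frequencies(reviews):
    """Count how often each non-stop word occurs across all reviews."""
    counts = Counter()
    for tokens in reviews:
        counts.update(clean(tokens))
    return counts


def cooccurrence_counts(reviews):
    """Count, per unordered word pair, the reviews in which both words appear."""
    pairs = Counter()
    for tokens in reviews:
        unique = sorted(set(clean(tokens)))
        pairs.update(combinations(unique, 2))  # pairs come out in sorted order
    return pairs


def prune_candidates(reviews, opinion_words, min_freq=2, min_cooc=2):
    """Two-pass pruning: frequency threshold, then co-occurrence with opinion words."""
    freq = word_frequencies(reviews)
    pairs = cooccurrence_counts(reviews)
    # First pass: keep frequent words that are not opinion words themselves.
    candidates = [w for w, c in freq.items()
                  if c >= min_freq and w not in opinion_words]
    # Second pass: keep candidates that co-occur often enough with opinion words.
    kept = []
    for w in candidates:
        support = sum(pairs[tuple(sorted((w, o)))] for o in opinion_words)
        if support >= min_cooc:
            kept.append(w)
    return sorted(kept)


# Toy data: each review is a list of already-segmented tokens.
reviews = [
    ["the", "screen", "is", "very", "clear"],
    ["screen", "clear", "and", "battery", "good"],
    ["battery", "is", "good"],
    ["the", "phone", "arrived", "today"],
    ["phone", "box", "arrived"],
]
opinion_words = {"clear", "good"}  # hypothetical opinion lexicon

features = prune_candidates(reviews, opinion_words)
print(features)  # ['battery', 'screen']
```

In this toy data, "phone" and "arrived" pass the frequency threshold but never appear alongside an opinion word, so the second pass removes them. That is the kind of noise a co-occurrence check is meant to filter out of a frequency-based candidate set.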
Keywords/Search Tags:review mining, feature extraction, sentiment analysis, dependency pattern, feature classification model