
Sentiment Analysis Model Design And Evaluation On Chinese Product Comment Data

Posted on: 2018-10-27  Degree: Master  Type: Thesis
Country: China  Candidate: Y Li  Full Text: PDF
GTID: 2348330518493288  Subject: Electronics and Communications Engineering
Abstract/Summary:
The Internet has become part of people's daily lives, and people increasingly rely on Internet applications for entertainment, shopping, study, and work. As the Internet industry has developed and matured, all kinds of Internet products have added a comment function, producing a large volume of product review data. Processing these comments with an emotion analysis model can efficiently help businesses understand users' emotional attitudes toward a product and the problems it may have, so that they can keep improving the product's functions or performance, raise service quality, and attract more users. Based on the characteristics of Chinese review data, this paper designs an emotion analysis model. The model classifies the emotional polarity of comment data and mines the types of problems reflected in users' feedback.

This paper first introduces the research background and significance, the state of research in China and abroad, and the work carried out in this study. It then describes the characteristics of product review data and the corresponding data preprocessing, followed by the theoretical background needed to build the emotion analysis model. Finally, it presents the emotion analysis model, which is divided into two parts: emotion classification and highlight problem mining.

(1) The emotion classification model is built from the CHI (chi-square) feature extraction algorithm, the C-TF-IDF feature weight calculation algorithm proposed in this paper, and a support vector machine. The model training method and the related emotion classification strategies are also described. C-TF-IDF is based on the emotion dictionary constructed in this paper and compensates for the inability of TF-IDF to distinguish emotion words from non-emotion words.
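To make the two feature steps concrete, the sketch below computes a CHI (chi-square) score for a term and a lexicon-aware TF-IDF weight. The abstract does not give the C-TF-IDF formula, so the `boost` factor applied to emotion-dictionary words is only an illustrative assumption, as are the function names, the toy comments, and the tiny lexicon. The support vector machine training step is omitted.

```python
import math
from collections import Counter

def chi_square(docs, labels, term, target):
    """CHI statistic of `term` for class `target` (2x2 contingency table)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        if has_term and label == target:
            a += 1      # term present, target class
        elif has_term:
            b += 1      # term present, other class
        elif label == target:
            c += 1      # term absent, target class
        else:
            d += 1      # term absent, other class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def lexicon_tfidf(docs, lexicon, boost=2.0):
    """TF-IDF where emotion-dictionary words get an extra weight.

    ASSUMPTION: multiplying by `boost` is a stand-in for the thesis's
    C-TF-IDF, whose exact formula is not given in the abstract.
    """
    n_docs = len(docs)
    df = Counter(t for tokens in docs for t in set(tokens))
    weighted = []
    for tokens in docs:
        tf = Counter(tokens)
        weighted.append({
            t: (count / len(tokens))
               * (math.log(n_docs / df[t]) + 1.0)
               * (boost if t in lexicon else 1.0)
            for t, count in tf.items()
        })
    return weighted

# Hypothetical, already word-segmented comments with polarity labels.
docs = [["质量", "很", "好"], ["物流", "太", "慢"], ["质量", "差"], ["很", "好", "用"]]
labels = ["pos", "neg", "neg", "pos"]
lexicon = {"好", "差", "慢"}   # toy emotion dictionary

weights = lexicon_tfidf(docs, lexicon)
```

In this toy data, "好" appears only in positive comments and so receives a high CHI score for the positive class, while "质量" appears equally in both classes and scores zero. In a full pipeline the top-scoring CHI terms would be kept as features and their weights passed to the classifier.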
The effects of the one-step three-way classification strategy and the two-step binary classification strategy on the final classification results are analyzed and compared. The one-step three-way strategy is found to be more suitable for emotion classification of product review data. Using that strategy, the final classification results of C-TF-IDF and TF-IDF are then compared under different CHI feature proportions. The experimental data show that C-TF-IDF outperforms TF-IDF on the emotion classification task for Chinese product review data: the lowest F_Score increased by 1.584% and the highest by 2.267%.

(2) Building on emotion classification, this paper proposes a method for mining prominent problem points based on rule matching. The method first locates the problem type by rule matching and then mines the prominent problem points with a text clustering algorithm. Rule matching uses regular expressions to search for the keywords that correspond to the problem types a product may have. It requires no additional trained model, is simple and feasible, and can be extended. The clustering-based prominent problem mining algorithm takes the relationships among reviews into account: it treats each cluster as the unit of analysis and then identifies the prominent problems. To some extent, this highlight problem mining algorithm works stably on data sets with extreme problem distributions and conceptually broad problem types.
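The rule matching step can be illustrated with a minimal sketch: one regular expression per problem type, searched against each comment. The problem type names, keyword lists, and sample comments here are hypothetical and not taken from the thesis. The text clustering stage is not shown, because the abstract does not say which clustering algorithm is used.

```python
import re
from collections import Counter

# Hypothetical rules: problem type -> regular expression over keywords.
RULES = {
    "logistics": re.compile(r"物流|快递|发货"),
    "quality": re.compile(r"质量|做工|破损"),
    "service": re.compile(r"客服|售后|态度"),
}

def match_problem_types(comment):
    """Return every problem type whose keyword pattern occurs in the comment."""
    return [name for name, pattern in RULES.items() if pattern.search(comment)]

# Hypothetical negative comments, e.g. those selected by the classifier.
comments = ["物流太慢，快递员态度也差", "做工粗糙，质量不行", "发货慢"]

# Tally how often each problem type is matched across the comments.
type_counts = Counter(t for c in comments for t in match_problem_types(c))
```

Extending the rule set only requires adding another entry to `RULES`, which reflects the point that this approach needs no additional trained model.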
Keywords/Search Tags:product review data, emotion analysis model, weight calculation, rule matching, text clustering, highlight problem mining