The Research Of Mobile Clients' Network Comments Based On The Data Mining

Posted on: 2017-12-20
Degree: Master
Type: Thesis
Country: China
Candidate: S H Feng
GTID: 2348330488475594
Subject: Applied Statistics
Abstract/Summary:
With the development of the mobile Internet, more and more customers publish and share evaluations of their purchases on online platforms. As a result, large amounts of data about client information and network comments are stored on these platforms. Entrepreneurs who want to improve their operating efficiency and competitiveness must extract the useful business information from this data. In this thesis, the author adopted data mining methods to study the network comments of mobile clients. The main work and conclusions are as follows.

Firstly, using web crawler technology with the Gooseeker software, the author studied the URL capturing rules and data acquisition rules for mobile users' comments on the official website of Huawei mobile phones. About 2000 comments were collected from the selected website and stored in XML format. The Swift and Excel software were then used to preprocess the original data set and remove redundant comments, yielding a sample data set of 1473 online comments, nearly 60000 words.

Secondly, based on visualization technology and the LDA topic model, and using the R and ROST CM 6 software, the author analyzed the characteristics of the comment text. The visualization analysis was conducted mainly from two aspects, word cloud and semantic network, and yielded information about word frequency and about the advantages and disadvantages of the products as described by customers. The advantages include nice appearance and timely delivery; the disadvantages include battery heating, poor durability, and poor battery life. Based on the LDA topic model analysis, the author identified the eight topics customers care about most, namely "workmanship, handle, earphones, design, experience, storage, battery, and packaging".
Analysis of the LDA topic model across different months showed that the subjects of customers' comments changed over time.

Thirdly, using an emotional-dictionary-based method implemented in the Python programming language, the author calculated the emotional value of each comment and then judged the emotional tendency of the customers' comments. According to the descriptive statistics, about 21.1% of the customers' comments had emotion values higher than the mean emotion value of positive comments, which is consistent with the 80/20 rule, that is, 20% of loyal customers tend to provide 80% of an enterprise's profits. Further, based on the review attributes from the LDA topic model analysis and the hot spots in the word frequency statistics, the author analyzed customers' emotional tendencies toward each of their concerns. Accessories drew the highest percentage of negative feedback (25.41%), and the proportions of negative feedback on services and on the system were also high, reaching 23.44% and 19.70% respectively.

Fourthly, to divide the customers into high-value and low-value groups, the author combined the collected data and adopted customer level, rating, positive emotion value, negative emotion value, positive emotion variance, and negative emotion variance as indicators for customer segmentation. The Two-Step clustering algorithm was then used to group the customers into four categories: key customers, major customers, common customers, and small customers. High-value customers accounted for 18.3%, roughly reflecting the Pareto principle (the 80/20 rule).
Then, on the basis of the customer segmentation, an ordered multi-class logistic regression prediction model was established. Its prediction accuracy for customer type was 97.62%, showing that the model predicts well and can be used to predict the customer type of new samples.

Finally, based on the results of the data mining of mobile clients' network comments, the author summarized the study and put forward suggestions for customer marketing and further research, in order to provide a reference for enterprises and merchants.
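
The abstract does not give the details of the dictionary-based emotion scoring, so the following Python sketch only illustrates the general idea under stated assumptions. The toy lexicons, the negation and intensifier handling, and the names `score_comment` and `tendency` are all hypothetical and are not taken from the thesis. English tokens are used for readability, whereas the actual comments would need Chinese word segmentation first.

```python
# Hypothetical toy lexicons; a real study would load a full emotional dictionary.
POSITIVE = {"nice": 1.0, "fast": 1.0, "smooth": 1.0}
NEGATIVE = {"hot": 1.0, "slow": 1.0, "poor": 1.0}
NEGATIONS = {"not", "never"}
INTENSIFIERS = {"very": 2.0, "slightly": 0.5}


def score_comment(tokens):
    """Return (positive_value, negative_value) for one tokenized comment."""
    pos = neg = 0.0
    weight = 1.0      # multiplier accumulated from preceding intensifiers
    negated = False   # True if a negation precedes the next emotion word
    for tok in tokens:
        if tok in NEGATIONS:
            negated = not negated
        elif tok in INTENSIFIERS:
            weight *= INTENSIFIERS[tok]
        elif tok in POSITIVE or tok in NEGATIVE:
            base = POSITIVE[tok] if tok in POSITIVE else NEGATIVE[tok]
            # A negation flips the polarity of the emotion word.
            is_positive = (tok in POSITIVE) != negated
            if is_positive:
                pos += weight * base
            else:
                neg += weight * base
            # Simplification: modifiers apply only to the next emotion word.
            weight, negated = 1.0, False
    return pos, neg


def tendency(tokens):
    """Label a comment by comparing its positive and negative values."""
    pos, neg = score_comment(tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ["the screen is very nice", "battery gets hot and is not fast"]:
        tokens = text.split()
        print(text, score_comment(tokens), tendency(tokens))
```

Keeping the positive and negative values separate, rather than returning a single net score, matches the abstract's later use of positive and negative emotion values and their variances as segmentation indicators.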
Keywords/Search Tags: data mining, network comments, visualization, text sentiment analysis, customer segmentation