An Exploration of the Online Review Features of B2C Accounts

Posted on: 2015-01-26 | Degree: Master | Type: Thesis
Country: China | Candidate: Y X Duan | Full Text: PDF
GTID: 2268330428970137 | Subject: Statistics
Abstract/Summary:
On October 4, 2013, online product review information was extracted with spiders implemented in Java. From the 32,333 accounts and 229,530 reviews collected, aggregated by a stored procedure, the online review features of 8,382 accounts were generated as the study sample for this paper. These account online review features are then studied and mined from four aspects: statistical description of the account review characteristics; one-way analysis of variance across account levels; correlation analysis among the review characteristics themselves and between them and account level; and cluster analysis of the account review features. The results are as follows.

First, following conventional descriptive statistics, summary statistics, density histograms, and kernel density estimates of the account review characteristics are obtained in order to describe them. The findings are that accounts that have published comments have not been very active in online reviewing recently; the third quartile of the total number of comments per account is below 25, so about three quarters of these accounts have published fewer than 25 comments; these accounts tend to give high scores; interaction between accounts is close to frozen; and they use as few words as possible and tend to write in a cautious manner.

Second, one-way analysis of variance and related methods are used to analyze account reviews across account levels.
The results show that the total number of comments per account and the average interval between purchase and comment both increase as account level rises; the number of days between the latest comment and the information extraction increases as account level falls; accounts at diamond level and above have a higher mean score per account than the others; and registered accounts have a lower average number of words per account in the experience text than the others.

Third, Correlation-based Feature Subset Selection is used to evaluate subsets of the account online review features, preferring subsets that are highly correlated with the class while having low inter-correlation. Three variables meet this condition: the number of days between the latest comment and the information extraction, the total number of comments per account, and the reply ratio per account. Factor analysis is then used to obtain the common factors of the account review characteristics. The conclusion is that a linear combination of the number of days between the latest comment and the information extraction, the total number of comments per account, the reply ratio, and the average interval between purchase and comment forms the first common factor, which represents the degree of participation in reviewing, and that a linear combination of the mean score per account and the average number of words per account in the experience text forms the second common factor, which represents the degree of satisfaction.

Fourth, three clustering algorithms, CascadeSimpleKMeans, XMeans, and EM, are applied. The results of the three algorithms are compared using an evaluation criterion based on log likelihood. EM is chosen, yielding 9 types of people. Furthermore, a contingency table of cluster type against account level is constructed to analyze the relationship between the two.

Based on the four analyses above, four findings and recommendations are proposed.
One, clarify the pairwise relationships among account review involvement, account satisfaction, and account level. Two, rekindle the review enthusiasm of the target population that has lost it. Three, pay more attention to accounts with bad shopping experiences. Last but not least, improve the existing review incentives. (Because JD.com changes frequently, please pay close attention to Section 1.5.)...
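
As an illustration of the one-way analysis of variance step described in the abstract, the following is a minimal Python sketch of how an F statistic could be computed for one review feature across account levels. It is not the thesis's actual code or data: the function name `one_way_anova_f`, the level names, and all the numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples.

    Each element of `groups` holds the observations for one factor level
    (here, hypothetically, one account level).
    """
    k = len(groups)                      # number of levels
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical data: total number of comments per account, by account level.
# These values are invented for illustration and do not come from the thesis.
comments_by_level = {
    "registered": [1, 2, 3],
    "silver": [2, 3, 4],
    "diamond": [5, 6, 7],
}

f_stat, df_between, df_within = one_way_anova_f(list(comments_by_level.values()))
print(f"F = {f_stat:.3f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F relative to the F distribution with the reported degrees of freedom would suggest that the feature's mean differs between account levels; the p-value lookup is omitted here to keep the sketch dependency-free.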
Keywords/Search Tags:product review feature for each account, analysis of variance, correlation-based feature selection, factor analysis, cluster analysis