Study On Sentiment Analysis And Preference Mining For Large Scale E-commerce Reviews

Posted on: 2019-04-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T Li
GTID: 1319330569987563
Subject: Management Science and Engineering
Abstract/Summary:
With the rapid development of Internet and mobile network technologies, e-commerce, especially mobile commerce, has penetrated all aspects of our lives. Analyzing user behavior on e-commerce platforms has become a key part of consumer behavior analysis. Review data, which are openly accessible on large e-commerce websites, are an important source for such analysis. However, these data are usually extremely large and susceptible to the socio-economic environment, and they exhibit strong dynamics and complexity. Analyzing users' preferences from massive textual reviews, and extracting both the product clusters in which users are interested and the aspects of products that satisfy or dissatisfy them, has become crucial to improving the quality of goods and services and to grasping social trends. It is also a basic problem that must be resolved in precision marketing.

This dissertation collected about 8 million reviews of electronics and books from Amazon.com and conducted “Sentiment Analysis” and “Preference Mining” on these data. It first analyzed the characteristics of the review data and the challenges they pose for data mining, and then proposed a series of data pre-processing methods, i.e., feature engineering methods, to improve the performance of the classifiers commonly used in sentiment analysis. To handle the dynamic and real-time characteristics of the review data and to analyze their sub-patterns, it also proposed an incremental and adaptive classifier for opinion mining. To mine the dynamics of users' interests, it modeled “user's interest towards item groups” as time series, and then used temporal fitting and prediction models such as ARIMA to portray those dynamics. It also explored a user interest network, in which users are the nodes and the similarities between users' interest time series are the edges, and conducted community detection in this network to further analyze its topological characteristics. Based on the results of the interest time series mining and the community detection, it then built a recommender system that recommends related item groups to users. The main parts of the dissertation are as follows:

(1) Feature engineering: linear and nonlinear space transformation methods. To address the mining challenges, such as the complex distribution of sub-patterns in review data, correlations among features, and the large data volume, this section proposed several linear and nonlinear space transformation methods, including singular value decomposition, distance metric learning, the Nyström method, and two integrated space transformation methods. Experimental results showed that the proposed methods significantly improved the accuracy of traditional classifiers, such as k-nearest neighbors, support vector machines, logistic regression, and linear discriminant analysis, and decreased the training time on large-scale reviews. In summary, these feature engineering methods can improve both the efficiency and the effectiveness of sentiment analysis.

(2) An incremental and adaptive classifier for opinion mining. With traditional classifiers such as k-NN and SVM, it is difficult to obtain the distribution characteristics of the data, and difficult, if not impossible, to analyze the sub-patterns of user viewpoints in review data. This section proposed a new classifier based on competitive learning: AdaHS. The classifier is suited to review data whose sub-pattern distribution is complex. To enhance the classifier's ability to adapt to complex cluster boundaries, this section also proposed a kernel transformation version of the classifier: Nys-AdaHs. Experimental results showed that the classifier is highly accurate and has supervised clustering capabilities. It has practical value for application scenarios such as opinion mining in reviews, analysis of the sub-patterns of users' viewpoints, and tracking and improving the quality of goods and services.

(3) Time series analysis of users' interest towards item groups. This section mainly mined the user preferences contained in the review data. The reviews were treated as intermediate data linking users and product items. By clustering product items into groups and counting the number of each user's reviews on each group in each month, this section built time series of users' interest towards item groups. It then used time series fitting models to portray the dynamics of users' interests. Finally, it predicted users' interests, evaluated the prediction accuracy, and discussed the prediction results.

(4) User interest network and recommender system. This section mainly analyzed the group characteristics of the users' dynamic interest series. It proposed a similarity measure for users' interest series based on dynamic time warping, built a network of users' interests, and applied the “Fast Unfolding” method for community detection. Based on the interest predictions from the previous section and the interest communities found in this section, it presented two basic recommendation strategies and built a recommender system. The experimental results showed that the users' interest network has significant community structure, and that the proposed recommender system significantly outperformed Spark's built-in ALS-based "collaborative filtering" recommender method.

The user behavior mining methods proposed in this dissertation also have reference value for data mining in other fields. In practice, e-commerce companies can apply them directly to consumer behavior analysis and precision marketing.
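The series construction in part (3) and the dynamic-time-warping comparison in part (4) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the review tuples `(user, item_group, "YYYY-MM")`, the function names, and the toy data are all hypothetical, and the textbook DTW recurrence with absolute difference as the local cost stands in for the similarity measure that the abstract names but does not specify.

```python
from collections import defaultdict


def monthly_interest_series(reviews, user, group, months):
    """Count one user's reviews on one item group for each month in `months`."""
    counts = defaultdict(int)
    for r_user, r_group, r_month in reviews:
        if r_user == user and r_group == group:
            counts[r_month] += 1
    # Months without reviews count as zero interest.
    return [counts[m] for m in months]


def dtw_distance(a, b):
    """Textbook dynamic time warping distance with |x - y| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats b[j-1]
                                 cost[i][j - 1],      # b[j-1] repeats a[i-1]
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Hypothetical toy data: two users with the same pattern, shifted by one month.
reviews = [
    ("u1", "cameras", "2014-01"),
    ("u1", "cameras", "2014-01"),
    ("u1", "cameras", "2014-03"),
    ("u2", "cameras", "2014-02"),
    ("u2", "cameras", "2014-02"),
    ("u2", "cameras", "2014-04"),
]
months = ["2014-01", "2014-02", "2014-03", "2014-04"]

s1 = monthly_interest_series(reviews, "u1", "cameras", months)
s2 = monthly_interest_series(reviews, "u2", "cameras", months)
pointwise = sum(abs(x - y) for x, y in zip(s1, s2))

print(s1, s2)                           # [2, 0, 1, 0] [0, 2, 0, 1]
print(dtw_distance(s1, s2), pointwise)  # 3.0 6
```

On this toy data the warped distance (3.0) is half the month-by-month distance (6), because DTW aligns the two shifted peaks. This tolerance to timing offsets is one plausible reason to prefer a DTW-based measure for interest series; a distance like this would still need to be converted into a similarity score before it could serve as an edge weight in the user interest network.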
Keywords/Search Tags: data mining on customer reviews, sentiment analysis, opinion mining, preference mining, time series mining