
The Application Of Data Mining In Smart Phone Sales Data

Posted on: 2020-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: J D Hu
Full Text: PDF
GTID: 2428330623456628
Subject: Applied statistics
Abstract/Summary:
As the most widely used electronic device, the mobile phone is one of the most popular consumer goods and plays an important role in online shopping. In modern society, almost everyone has a mobile phone, and the phone is gradually replacing traditional wallets and bank cards. Every year, a wide variety of mobile phones are sold through online channels and physical stores, yet sales differ greatly from one product to another. What factors affect mobile phone sales? This question matters to mobile phone sellers and is the subject of this thesis. The thesis first briefly introduces the principles of web crawling, then uses a web crawler to collect detailed data on mobile phones from a large e-commerce website, including parameter configuration information, sales volume, and number of comments. The data are then cleaned in several ways, and variables are extracted for subsequent modeling and analysis. First, the information value and the Spearman correlation coefficient are used to analyze the factors affecting the sales level of mobile phones. Then, to predict the sales level of a particular mobile phone product, and taking the data into consideration, several machine learning methods are used for modeling: the decision tree algorithm, the Bagging algorithm, and the random forest algorithm. Cross-validation and grid search are used to select the optimal hyperparameters of the random forest model. A comparison of the results shows that the random forest algorithm performs better than the other algorithms. With appropriately chosen hyperparameters, the random forest performs clearly better than it does with its default parameters. When the number of variables is reduced, the random forest algorithm still maintains high accuracy and a high AUC value.
Keywords/Search Tags: Web crawling technology, Decision tree, Bagging, Random forest
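
The factor-screening step named in the abstract (information value and the Spearman correlation coefficient) can be illustrated with a minimal sketch in plain Python. This is not the thesis's code: the function names, the smoothing constant `eps`, the binary "high sales" label, and the toy data are all assumptions made for illustration only.

```python
import math
from collections import defaultdict


def information_value(feature, target, eps=0.5):
    """Information value of a categorical feature against a binary target.

    target holds 0/1 labels (here: 1 = high sales level, an assumed encoding).
    eps is a smoothing count added to every cell so that empty cells do not
    produce log(0); it is an assumption of this sketch.
    """
    counts = defaultdict(lambda: [0, 0])  # category -> [count of 0s, count of 1s]
    for x, y in zip(feature, target):
        counts[x][y] += 1
    n_categories = len(counts)
    total0 = sum(c[0] for c in counts.values()) + eps * n_categories
    total1 = sum(c[1] for c in counts.values()) + eps * n_categories
    iv = 0.0
    for n0, n1 in counts.values():
        p0 = (n0 + eps) / total0  # share of the 0-class falling in this category
        p1 = (n1 + eps) / total1  # share of the 1-class falling in this category
        iv += (p1 - p0) * math.log(p1 / p0)
    return iv


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Made-up toy data, not taken from the thesis.
brand = ["A", "A", "A", "B", "B", "B", "C", "C"]
high_sales = [1, 1, 0, 0, 0, 1, 0, 0]
n_comments = [900, 1200, 150, 80, 60, 700, 30, 20]
sales = [5000, 8000, 400, 300, 100, 2500, 90, 50]

print("IV(brand) =", round(information_value(brand, high_sales), 3))
print("Spearman(comments, sales) =", round(spearman(n_comments, sales), 3))
```

A larger information value suggests that a categorical variable separates high-sales from low-sales products more strongly, while the Spearman coefficient captures monotonic association between two numeric variables without assuming linearity. The modeling step (decision tree, Bagging, random forest with grid search and cross-validation) would typically be built on a machine learning library and is not shown here.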