The Decision Tree Algorithm Of Commodity Recommendation

Posted on: 2018-06-24
Degree: Master
Type: Thesis
Country: China
Candidate: L Bai
Full Text: PDF
GTID: 2348330533960836
Subject: Master of Applied Statistics
Abstract/Summary:
Since the government wrote "Internet +" into the 13th Five-Year Plan, the internet industry has accumulated a large amount of data. Such data is prone to noise and spurious correlation, which can lead decision makers to wrong conclusions. E-commerce companies earn their profit from transactions on online platforms, so a commodity recommendation system plays a key role in analyzing consumer behavior and predicting which products consumers are interested in. At present, commodity recommendation systems mainly rely on collaborative filtering, but in a big data setting that algorithm does not make full use of commodity and user feature information. To improve the predictive accuracy of the commodity recommendation system, this paper builds several gradient boosting decision tree (GBDT) models and then fuses them. The resulting ensemble can extract useful information from massive consumer behavior data and recommend goods to users more precisely. The main work of the paper is as follows:
1. Data collection and preprocessing. The data in this paper is consumption data collected from Taobao. It contains missing values, abnormal values and features measured on inconsistent scales. Before modeling, we fill missing values with the mode or the mean of the feature and standardize the data so that all features are on a consistent scale.
2. Feature selection. Feature selection is a very important step in statistical modeling, and the selected features should be independent of one another. We therefore use principal component analysis to extract mutually uncorrelated factors.
3. Forecast model selection. We train five GBDTs, then use the outputs of these models as the 'variables' of a logistic regression fitted on the validation data of those GBDTs.
The experimental results show that recommendation performance improves markedly after fusion.
Keywords/Search Tags: Collaborative Filtering Algorithm, Principal Component Analysis, Gradient Boost Decision Tree, Logistic Regression
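
The three steps described in the abstract can be sketched roughly as follows. This is only an illustrative sketch using scikit-learn, not the thesis's actual implementation: the Taobao data is replaced by a synthetic dataset, and the number of principal components, the way the five GBDTs differ (here, by hyperparameters and random seed) and the train/validation/test split sizes are all assumptions, since the abstract does not specify them.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the consumer behavior data (assumption: the real
# Taobao data is not available here). About 5% of entries are set to NaN.
X, y = make_classification(n_samples=1500, n_features=20, n_informative=8,
                           random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Train set for the GBDTs, validation set for the fusion model, test set
# for the final evaluation.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0, stratify=y_rest)

# Steps 1 and 2: mean imputation, standardization, then PCA.
# n_components=10 is an arbitrary illustrative choice.
prep = make_pipeline(SimpleImputer(strategy="mean"),
                     StandardScaler(),
                     PCA(n_components=10))
Z_train = prep.fit_transform(X_train)
Z_val = prep.transform(X_val)
Z_test = prep.transform(X_test)

# Step 3a: five GBDTs. Varying (n_estimators, max_depth) and the seed is an
# assumption made so that the five models are not identical.
configs = [(50, 2), (50, 3), (100, 2), (100, 3), (150, 2)]
gbdts = [
    GradientBoostingClassifier(n_estimators=n, max_depth=d, subsample=0.8,
                               random_state=i).fit(Z_train, y_train)
    for i, (n, d) in enumerate(configs)
]

def base_scores(Z):
    """Stack each GBDT's predicted positive-class probability as one column."""
    return np.column_stack([m.predict_proba(Z)[:, 1] for m in gbdts])

# Step 3b: logistic regression over the GBDT outputs on the validation data.
fusion = LogisticRegression().fit(base_scores(Z_val), y_val)

# Compare the fused model with each single GBDT on held-out test data.
fused_auc = roc_auc_score(y_test, fusion.predict_proba(base_scores(Z_test))[:, 1])
single_aucs = [roc_auc_score(y_test, s) for s in base_scores(Z_test).T]
print("single GBDT AUCs:", np.round(single_aucs, 3))
print("fused AUC:", round(fused_auc, 3))
```

Fitting the logistic regression on validation data rather than on the GBDTs' own training data keeps the fusion weights from being learned on scores the base models have already overfit. Whether the fused score actually beats the best single GBDT depends on the data; the synthetic example makes no claim either way.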