The Purchase Prediction Of Kidswant Users And The Analysis Of Important Characteristics Under Different Product Categories

Posted on:2022-08-06

Degree:Master

Type:Thesis

Country:China

Candidate:X W Ji

GTID:2517306722481864

Subject:Applied Statistics

Abstract/Summary:

With the continuous development of Internet and machine learning technology, more and more companies use data mining to look for business opportunities and operational direction in their historical data. Because users are a company's main consumers, mining user data is a principal research direction for major companies. For an online e-commerce platform, it is all the more important to derive from user data ways to retain user groups and manage user relationships. As an online sales platform for maternal and infant products, the Kidswant app must put its data to work and draw value and revenue from it in order to gain a competitive advantage.

In this paper, the data set is first cleaned and its missing values are filled. Categorical features are then encoded, and continuous features are processed by principal component analysis, logarithmic transformation and standardization. On this basis, logistic regression, XGBoost and CatBoost models are fitted, with feature screening, to the data under four methods for handling unbalanced data: oversampling, SMOTE oversampling, undersampling and cost-sensitive learning. The accuracy, AUC, recall and F1 of each model are obtained. The XGBoost model with undersampling and feature screening performs better than the other combinations: its recall on the test set reaches 86.18%, meaning that 86.18% of the positive samples are predicted correctly, which is in line with the goal of improving the model's ability to predict positive samples.

The paper then applies the same four methods for handling unbalanced data to XGBoost and CatBoost on the data sets of different product categories and fits the models. For each product category, the important features of the best-performing XGBoost and CatBoost models are compared and analyzed. Finally, the important features identified by the two algorithms are combined and analyzed, yielding some suggestions for the data operations of the Kidswant app.
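As an illustration of two ideas named in the abstract, random undersampling and recall, the following is a minimal sketch in plain Python. It is not the thesis's implementation: the function names `undersample` and `recall`, the toy data, and the choice to balance the classes exactly 1:1 are all assumptions made for the example.

```python
import random


def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size.

    Assumes binary labels (1 = purchase, 0 = no purchase); the 1:1 target
    ratio is an illustrative choice, not taken from the thesis.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row plus an equally sized random majority subset.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def recall(y_true, y_pred):
    """Share of actual positives predicted as positive: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy data: 2 buyers (label 1) among 10 users.
rows = [[i] for i in range(10)]
labels = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
X_bal, y_bal = undersample(rows, labels)

# Recall on made-up predictions: 2 of the 3 actual positives are found.
example_recall = recall([1, 1, 1, 0], [1, 1, 0, 0])
```

Undersampling discards majority-class rows, so it trades information for balance. Recall is reported on an untouched test set, so a figure such as the 86.18% above reads as the fraction of actual purchasers the model identifies.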

Keywords/Search Tags:

kidswant app, user purchase forecast, unbalanced data, undersampling, XGBoost

Related items

1	Research On User Purchase Behavior Prediction Based On LightGBM
2	Research Based On FastText And Unbalanced Data
3	Research Of Imbalanced Data And Its Application
4	Research On Unbalanced Data Classification Based On Ensemble Learning
5	Research On The Statistical Machine Learning Method Of Mobile Phone APP False User Identification
6	A Research On Rumor-publishers Recognition Of Microblog Oriented To Network Public Opinion Control
7	A Study On The Statistical Methods Of User Images In The Background Of Big Data
8	Sales Forecast Of University Canteen Window Based On The Two-layers Model Of Sales Forecast
9	The Application Of KNN Classification In Unbalanced Data
10	Prediction Of Shanghai-shenzhen 300 Index Based On XGBoost-LSTM Neural Network