
Research On User Purchase Prediction Algorithm Based On Boundary Oversampling And Integrated Learning

Posted on: 2022-07-13  Degree: Master  Type: Thesis
Country: China  Candidate: W J Ding
GTID: 2518306734961569  Subject: Applied Statistics
Abstract/Summary:
As more users become accustomed to shopping online, e-commerce platforms have accumulated massive amounts of user behavior data, which makes user purchase prediction feasible. To filter out the products a user is most interested in from a large catalog more accurately and efficiently, predicting purchase intention from users' historical behavior has become an important part of the recommendation systems of e-commerce platforms. User behavior data is dominated by browsing and clicking, while purchases account for only a very small proportion, so the data is clearly imbalanced. Left unprocessed, this imbalance leads to seriously biased predictions. However, few existing studies examine imbalance handling for user behavior data in depth, and most still rely on a single model for prediction.

To address these problems, this paper improves on classic oversampling algorithms and builds an ensemble algorithm for user purchase prediction based on boundary oversampling. The algorithm addresses the extreme imbalance common in user behavior data by synthesizing new minority-class samples from the samples near the classification boundary, and it combines this with the Stacking learning method to construct a two-layer ensemble model. To verify that the performance gain of the improved algorithm is robust, this paper generates several simulated data sets with different ratios of positive to negative samples and compares boundary oversampling with traditional data balancing algorithms on them. The algorithm is then trained on e-commerce platform data to further verify how much the improved algorithm, based on boundary oversampling and ensemble learning, gains over conventional algorithms.

The experimental results show that boundary oversampling not only improves significantly on the traditional method but also performs better on imbalanced data than random oversampling, and the more extreme the imbalance, the more pronounced the improvement. Boundary oversampling therefore handles imbalanced user behavior data more effectively and makes fuller use of the behavior information of purchasing users. In addition, the fusion model built with the Stacking ensemble learning method performs significantly better than the commonly used single models on both the simulated data and the real user behavior data sets. It predicts more accurately and efficiently whether a user will make a purchase in the future, which helps improve the efficiency of precision marketing. In summary, Stack_bsm, the improved ensemble algorithm for user purchase prediction based on boundary oversampling proposed in this paper, outperforms the other algorithms in most cases and is an effective user purchase prediction algorithm.
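The abstract does not give the details of the boundary oversampling step, so the following is only a minimal sketch of the general idea, written in the style of Borderline-SMOTE and using the Python standard library alone. The function name `borderline_oversample`, its parameters, and the "at least half but not all neighbours are majority" rule for picking boundary samples are illustrative assumptions, not the thesis's Stack_bsm implementation, and the Stacking layer is not shown.

```python
import math
import random


def _dist(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def borderline_oversample(X, y, minority=1, k=5, n_new=None, seed=0):
    """Hypothetical Borderline-SMOTE-style oversampler (illustrative only).

    X is a list of feature lists, y the matching list of labels.
    Returns a new (X, y) pair with synthetic minority samples appended.
    """
    rng = random.Random(seed)
    labelled = list(zip(X, y))
    minority_pts = [x for x, label in labelled if label == minority]
    n_majority = len(X) - len(minority_pts)
    if n_new is None:
        # Assumption: by default, synthesize enough samples to balance classes.
        n_new = max(n_majority - len(minority_pts), 0)

    # A minority sample counts as a boundary ("danger") sample when at least
    # half, but not all, of its nearest neighbours belong to the majority.
    danger = []
    for p in minority_pts:
        others = [pair for pair in labelled if pair[0] is not p]
        nearest = sorted(others, key=lambda pair: _dist(p, pair[0]))[:k]
        n_maj = sum(1 for _, label in nearest if label != minority)
        if nearest and len(nearest) / 2 <= n_maj < len(nearest):
            danger.append(p)

    synthetic = []
    if danger and len(minority_pts) > 1:
        for _ in range(n_new):
            p = rng.choice(danger)
            # Interpolate towards one of p's nearest minority neighbours.
            peers = sorted((q for q in minority_pts if q is not p),
                           key=lambda q: _dist(p, q))[:k]
            q = rng.choice(peers)
            gap = rng.random()
            synthetic.append([a + gap * (b - a) for a, b in zip(p, q)])

    return list(X) + synthetic, list(y) + [minority] * len(synthetic)
```

Unlike random oversampling, which duplicates existing minority samples, this sketch creates new points only around minority samples that sit close to the majority class, which is where a classifier is most likely to misclassify purchases.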
Keywords/Search Tags:Imbalanced data, User purchase prediction, Integrated learning, Recommendation system, E-commerce