
Research On Predicting User Age And Gender Based On Mobile Application

Posted on: 2021-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Dai
GTID: 2428330605974581
Subject: Applied statistics
Abstract/Summary:
With the rapid development of science and technology in recent years, smartphone adoption and usage have risen steadily, and people increasingly rely on smartphones in their daily lives. At the same time, as the number of apps has exploded and daily life has become more diverse, people often spend a great deal of time in smartphone apps. A user's choice of apps is determined by that user's characteristics. Because of privacy restrictions, it has become more difficult to collect user information directly, so inferring user characteristics from behavior and habits has become a new research direction. Most studies apply classification methods to apps, although the main research focus is the behavioral characteristics of mobile users. In addition, when user characteristics are predicted by classification, the model used is often a single classifier, whereas more and more scholars now use ensemble learning and combine multiple models to solve classification problems.

This thesis builds a mobile user classification model from data on mobile users' app usage taken from a Kaggle competition. The main content includes data analysis and description, feature extraction, feature selection, dimensionality reduction, base model construction, construction of a Stacking fusion model, and analysis of the model results. Log loss is used to measure the prediction accuracy of each model, and the Q statistic is used to measure the diversity between base models. Taking both accuracy and diversity into account, the base models of the fusion model are selected from seven candidates: SVM, XGBoost, LightGBM, Random Forest, neural network, k-nearest neighbor, and logistic regression. The second layer uses logistic regression, trained with 5-fold cross-validation, to produce the final result. Empirical analysis comparing the fusion model with the single models and with ensemble learners (XGBoost, LightGBM, Random Forest) shows that the Stacking fusion model established in this thesis is more accurate.
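The abstract does not give formulas for its two selection criteria. As a minimal sketch, assuming the usual multi-class log loss (mean negative log-probability of the true class) and the usual pairwise Q statistic (Yule's Q computed from which samples each of two classifiers gets right), they could be computed as follows. The function names, the clipping constant `eps`, and the choice to return NaN when Q is undefined are illustrative assumptions, not details from the thesis.

```python
import math


def log_loss(y_true, probs, eps=1e-15):
    """Multi-class log loss: mean of -log(p assigned to the true class).

    y_true: list of integer class indices.
    probs:  list of per-sample probability rows, one entry per class.
    eps:    clipping constant (an assumption) to avoid log(0).
    """
    total = 0.0
    for label, row in zip(y_true, probs):
        p = min(max(row[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)


def q_statistic(y_true, pred_a, pred_b):
    """Pairwise Q statistic between two classifiers' hard predictions.

    Q = (N11*N00 - N01*N10) / (N11*N00 + N01*N10), where
    N11 = both correct, N00 = both wrong,
    N10 = only A correct, N01 = only B correct.
    Values near 1 mean the two models err on the same samples;
    values near 0 or below indicate more diverse errors.
    """
    n11 = n00 = n10 = n01 = 0
    for y, a, b in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = (a == y), (b == y)
        if a_ok and b_ok:
            n11 += 1
        elif not a_ok and not b_ok:
            n00 += 1
        elif a_ok:
            n10 += 1
        else:
            n01 += 1
    denom = n11 * n00 + n01 * n10
    if denom == 0:
        return math.nan  # Q is undefined here; NaN is our convention
    return (n11 * n00 - n01 * n10) / denom
```

Under these assumptions, a candidate base model is attractive when its log loss is low and its Q statistic against the other selected models is not close to 1, which matches the abstract's stated aim of combining accuracy with diversity.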
Keywords/Search Tags: Mobile Application, User Characteristics, Multi-class Models, Ensemble Learning