
Clustered Covariate Regression And Its Application

Posted on: 2021-02-01 | Degree: Master | Type: Thesis
Country: China | Candidate: C Li | Full Text: PDF
GTID: 2480306503491394 | Subject: Applied Statistics
Abstract/Summary:
China's Internet is the world's largest network, which raises many problems that must be solved with data. Data modeling and analysis, however, run into a recurring difficulty: how to reconcile multicollinearity in regression modeling with model interpretability, a question that has been studied in many regression settings. Accurately predicting an app's future monthly active users (MAU) and building an indicator dashboard of the factors that influence them can help realize the app's full potential, reduce the production of useless apps, save enterprise capital and labor costs, and improve work efficiency.

To address this problem, this thesis studies the clustered covariate regression (CCR) model. CCR yields a nonsingular Gram matrix even when covariates that lack independent variation are assigned to the same cluster or to a cluster with other covariates (which do have independent variation). CCR is feasible in linear and nonlinear models as long as the model admits a linear predictor. For a given model, CCR eliminates the choice of tuning parameters and penalty functions. In addition to sparsity, CCR requires that the high-dimensional parameter space be reducible to a smaller, identifiable parameter space. CCR also allows inference on parameters regardless of their size, whereas shrinkage methods cannot support inference on the covariates they drop. CCR estimation takes model, outcome, and covariate information into account. Like penalized regression methods, CCR gives interpretable results, and unlike full dimension reduction methods, it keeps inference feasible.

In our empirical example, the results from CCR are compared with those from multiple linear regression and random forest to evaluate the strengths and weaknesses of CCR. The goodness of fit is 0.880 for CCR, 0.881 for random forest, and only 0.843 for multiple linear regression. CCR therefore retains the interpretability of its parameters without losing prediction accuracy, so we can build a readable indicator dashboard for monitoring and accurately predict the number of monthly active users in the future.
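The claim that clustering covariates restores a nonsingular Gram matrix can be illustrated with a small sketch. It is not the thesis's estimator: the abstract does not say how clusters are chosen, so the assignment `labels` is a hypothetical input, the data are simulated, and the helper names `clustered_design` and `fit_clustered_ols` are invented for illustration. The sketch only assumes that covariates in the same cluster share one coefficient, which amounts to summing their columns before running least squares.

```python
import numpy as np

def clustered_design(X, labels):
    """Sum the columns of X that share a cluster label, i.e. Z = X @ M."""
    labels = np.asarray(labels)
    k = labels.max() + 1
    M = np.zeros((X.shape[1], k))
    M[np.arange(X.shape[1]), labels] = 1.0  # 0/1 cluster-membership matrix
    return X @ M, M

def fit_clustered_ols(X, y, labels):
    """Least squares on the clustered design, mapped back to the covariates."""
    Z, M = clustered_design(X, labels)
    delta = np.linalg.solve(Z.T @ Z, Z.T @ y)  # one coefficient per cluster
    return M @ delta                           # one coefficient per covariate

# Simulated data (assumption): the second column is an exact multiple of the
# first, so it has no independent variation and X'X is singular.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x1, 2.0 * x1, x2])
y = X @ np.array([1.0, 1.0, -0.5]) + 0.1 * rng.normal(size=n)

labels = [0, 0, 1]  # hypothetical assignment: collinear columns share a cluster
Z, _ = clustered_design(X, labels)

# rank(X'X) equals rank(X), so a rank below the column count means a singular Gram matrix.
print("columns / rank of X:", X.shape[1], np.linalg.matrix_rank(X))
print("columns / rank of Z:", Z.shape[1], np.linalg.matrix_rank(Z))

beta = fit_clustered_ols(X, y, labels)
print("per-covariate coefficients:", beta)
```

In this toy setup the original design is rank deficient while the clustered design has full column rank, so the normal equations can be solved without a penalty term or tuning parameter. Choosing the clusters from the data is the substantive part of the method and is left out of the sketch.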
Keywords/Search Tags: clustered covariate regression, app MAU prediction, multicollinearity, random forest