Feature Selection And Regression Prediction In Complex Dataset

Posted on: 2018-05-13
Degree: Master
Type: Thesis
Country: China
Candidate: G Q Cui
GTID: 2428330590977608
Subject: Control Science and Engineering
Abstract/Summary:
With the development of the Internet, data have become increasingly complex. Complex data means not only that the volume of data increases, but also that the number of features and variable attributes increases. Extracting effective information from complex data poses a great challenge for research on feature selection and regression prediction. Traditional feature selection methods are mature for supervised classification problems, but there is still considerable room for improvement on unsupervised and regression problems. Meanwhile, in regression prediction, ensemble methods and partitioned "scene" modeling have been widely used. This thesis focuses on feature selection and regression algorithms.

(1) This thesis presents an unsupervised feature selection method (UFSSR). UFSSR uses sparse representation to reconstruct the data matrix and proposes a new feature evaluation function based on that reconstruction. Experiments on classical datasets show that UFSSR can select more important features.

(2) Feature selection and regression prediction can be regarded as two important steps in solving a regression problem. This thesis proposes an improved feature selection method based on random forest (RGRF). RGRF uses ridge regression to improve the feature ranking produced by random forest. For regression prediction, this thesis designs an ensemble regression algorithm based on the Stacking framework, which combines the results of random forest, GAM, and GBDT. RGRF and the ensemble regression algorithm are then verified on a specific communication target prediction problem. Experiments show that the proposed RGRF method and ensemble prediction method effectively improve the precision of regression prediction. The prediction results for the communication target are 0.85 for R-Square and 55.60% for S.D(0.2).

(3) To further improve the precision of communication target regression prediction, this thesis takes into account the configuration of the communication equipment node itself and builds a hierarchical scene partition model. Within each sub-scene, the communication target is predicted separately. Experiments show that, after scene partitioning, the overall prediction accuracy improves significantly. The final prediction results for the communication target are 0.92 for R-Square and 67.86% for S.D(0.2).
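The scene partition idea in (3) can be illustrated with a minimal sketch. It is not the thesis's implementation: the single-feature least-squares fit stands in for the ensemble regressor, the scene labels "A" and "B" stand in for equipment-node configurations, the toy data are made up, and all function names are hypothetical. It only shows how fitting one model per sub-scene can raise the overall R-Square compared with a single global model.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def r_square(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_by_scene(scenes, xs, ys):
    """Fit a separate model for each sub-scene (hypothetical scene labels)."""
    groups = defaultdict(lambda: ([], []))
    for s, x, y in zip(scenes, xs, ys):
        groups[s][0].append(x)
        groups[s][1].append(y)
    return {s: fit_line(gx, gy) for s, (gx, gy) in groups.items()}


def predict_by_scene(models, scenes, xs):
    """Predict each sample with the model of its own sub-scene."""
    return [models[s][0] + models[s][1] * x for s, x in zip(scenes, xs)]


# Toy data, made up for illustration: two scenes with opposite trends.
scenes = ["A"] * 4 + ["B"] * 4
xs = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
ys = [0.0, 2.0, 4.0, 6.0, 9.0, 8.0, 7.0, 6.0]

# Baseline: one global model that ignores the scene label.
intercept, slope = fit_line(xs, ys)
global_pred = [intercept + slope * x for x in xs]

# Scene partition: one model per sub-scene.
scene_models = fit_by_scene(scenes, xs, ys)
scene_pred = predict_by_scene(scene_models, scenes, xs)

r2_global = r_square(ys, global_pred)
r2_scene = r_square(ys, scene_pred)
print(f"global R-Square: {r2_global:.3f}, per-scene R-Square: {r2_scene:.3f}")
```

On this toy data the global fit averages two opposing trends and explains little variance, while the per-scene models fit each trend separately. Real data would need a held-out test set to confirm that partitioning helps rather than overfits.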
Keywords: feature selection, ensemble regression, scene partition model