
Research And Application Of Data Mining In Precision Poverty Alleviation

Posted on: 2020-06-27
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Ma
GTID: 2438330575955712
Subject: Computer technology
Abstract/Summary:
The essence of precision poverty alleviation is that the government effectively identifies poor families and their members, determines the causes and extent of their poverty, and delivers effective assistance to fundamentally break the barriers of poverty, with the goal of lifting the entire existing poor population out of poverty by 2020. With the rapid development of China's economy, national income levels have become seriously unbalanced. The extensive, region-oriented approach to poverty alleviation has long ceased to suit China's conditions, and precision poverty alleviation emerged in response. To date, the main difficulties in precision poverty alleviation are "precise identification", "precise support" and "precise monitoring". Precise identification is the foundation and the most important part of precision poverty alleviation, so it must be sufficiently accurate: if poor households are misidentified, there is no point in providing precise support to them. In recent years, traditional poverty alleviation technologies and models have run into difficulties because the extensive model is inefficient and makes it hard to identify which households are genuinely poor. The traditional model therefore needs to change. At the same time, big data technology has developed rapidly in recent years and has been designated by the state as a key development direction. By organically integrating big data technology with precision poverty alleviation, we can study mechanisms for improving the performance of precision poverty alleviation and thereby further promote the comprehensive application of big data technology in the field of poverty alleviation and development. Data mining techniques are used to identify the poor population accurately on the big data computing framework Spark. Compared with the traditional identification method, using the full mass of samples instead of a sampled subset clearly gives higher accuracy and is more persuasive. It also helps uncover the real causes of poverty and guide future assistance.

This paper uses Heilongjiang Province's precision poverty alleviation data on poverty-stricken villages, together with data on non-poor households, to generate 34 million sample records. The following work was done. Drawing on big data and data mining techniques, the machine learning ML Pipeline module of the big data computing framework Spark was used to model and predict poor households. First, the poor household data were preprocessed through feature extraction and feature transformation. The random forest algorithm, the Logistic algorithm and the newly proposed waterfall model were then used to construct poor household identification models. Finally, the models were compared and evaluated. The average AUC of each of the three models was measured over 10 runs on test set data. Next, data on 100 real poor households were classified by the three models to test each model's ability to identify real poor households. The three poor household identification models were then evaluated on their average AUC, their accuracy on the real poor household data and their construction time, and conclusions were drawn.
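The evaluation step, averaging AUC over several test sets, can be illustrated with a minimal pure-Python sketch. It is not the thesis's implementation: the abstract does not say how AUC was computed or how the 10 test sets were drawn. The function names `auc` and `mean_auc` are hypothetical, and the rank-based (Mann-Whitney) formulation is an assumption chosen because it needs no external library.

```python
from statistics import mean


def auc(labels, scores):
    """AUC via the Mann-Whitney rank statistic; tied scores share an average rank.

    labels: iterable of 0/1 (1 = poor household, an assumed encoding)
    scores: iterable of model scores, higher meaning "more likely poor"
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of equal scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def mean_auc(test_sets):
    """Average AUC over several (labels, scores) test sets, e.g. 10 of them."""
    return mean(auc(labels, scores) for labels, scores in test_sets)


if __name__ == "__main__":
    # Toy data for illustration only; these are not results from the thesis.
    toy_sets = [
        ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
        ([0, 1], [0.2, 0.9]),
    ]
    print(mean_auc(toy_sets))
```

Comparing models this way would mean calling `mean_auc` once per model on the same collection of test sets, so that differences in the averages reflect the models and not the data split.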
Keywords/Search Tags:Data mining, Big Data, Spark, Targeted poverty alleviation