
Post-processing Method Of Data Mining And Its Application

Posted on: 2021-07-14  Degree: Master  Type: Thesis
Country: China  Candidate: S Wang  Full Text: PDF
GTID: 2518306041961319  Subject: Software engineering theory and methods
Abstract/Summary:
With the rapid development of society and of the ways information circulates, big data now influences every aspect of daily life, and even national economic and social development. The era of big data has arrived. The fundamental purpose of collecting data is to extract useful knowledge from it and apply that knowledge to specific fields, which is what gave rise to the discipline of data mining. Researchers in computer science, however, concentrate on improving algorithmic efficiency, while researchers in other professional fields still rely on basic data mining methods because they cannot follow the more complex ones. Computer scientists pay little attention to how mining results are applied in practice, and practitioners elsewhere cannot use those results. As a result, models built on big data tend to be flat and simple, and a gap opens between data mining results and their practical application. This problem can be addressed by data post-processing. This thesis surveys the research results in the field of data post-processing and organizes them into a framework with two branches: structure post-processing and content post-processing. The framework is compared with traditional data processing methods through several application examples. The research content of this thesis consists of the following parts:

(1) Structure post-processing covers methods that transform the structure of data mining results and present them as a new structure or a new model. On this basis, the thesis proposes a theoretical method for building a multi-granularity data mining pattern framework. The SPCA model is optimized, and a multi-granularity data mining model is built on top of it. The thesis examines how to form a knowledge base that supports targeted poverty alleviation and addresses the poverty problem step by step. The DSPP method is also used to make the mining results easier to understand. Finally, the results of the new model are compared with those of the original SPCA model.

(2) Content post-processing covers methods that optimize the content of data mining results to obtain more representative and more concise results. On this basis, the thesis proposes a filter-type feature selection method, named consistency detection, that can be used for this kind of post-processing. The new method combines a consistency measurement with a dependency measurement: the consistency measurement is used to obtain the importance of each feature, and the dependency measurement is used to calculate the correlation coefficient. The smaller the variance within each data class, the higher the consistency between the data and the classification. By analyzing the ratio of intra-class variance to total variance, the consistency of the distribution across categories can be measured. Finally, the consistency detection method proposed in this thesis is compared with the Pearson correlation coefficient method in terms of classification accuracy on the MNIST dataset using the SGD algorithm.
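The variance-ratio idea behind the consistency measurement can be illustrated with a short sketch. The abstract does not give the thesis's exact formula, nor how the consistency and dependency measurements are combined, so the code below is only an assumption: it scores each feature as one minus the ratio of intra-class sum of squares to total sum of squares, so that a feature whose values vary little within each class scores close to 1. The names consistency_score and rank_features are hypothetical and do not come from the thesis.

```python
from statistics import fmean


def consistency_score(values, labels):
    """Score one feature as 1 - (intra-class SS / total SS).

    Assumed formula, not the thesis's: a score near 1 means the feature
    varies little inside each class relative to its overall spread.
    """
    overall_mean = fmean(values)
    total_ss = sum((v - overall_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # a constant feature says nothing about the classes

    # Group the feature values by class label.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)

    # Sum of squared deviations from each class's own mean.
    within_ss = 0.0
    for members in groups.values():
        class_mean = fmean(members)
        within_ss += sum((v - class_mean) ** 2 for v in members)

    return 1.0 - within_ss / total_ss


def rank_features(rows, labels):
    """Return (feature indices sorted best first, per-feature scores)."""
    n_features = len(rows[0])
    scores = [
        consistency_score([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


if __name__ == "__main__":
    # Toy data: feature 0 separates the two classes, feature 1 does not.
    rows = [[1.0, 5.0], [1.1, 1.0], [3.0, 5.1], [3.1, 0.9]]
    labels = [0, 0, 1, 1]
    order, scores = rank_features(rows, labels)
    print(order, scores)
```

In a filter-style pipeline, the top-ranked features would be kept before training a classifier, which is the role the abstract assigns to consistency detection in the comparison with the Pearson correlation coefficient.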
Keywords/Search Tags: Big data, Data mining, Data post-processing, Feature selection