Post-processing Method Of Data Mining And Its Application

Posted on:2021-07-14

Degree:Master

Type:Thesis

Country:China

Candidate:S Wang

GTID:2518306041961319

Subject:Software engineering theory and methods

Abstract/Summary:

With the rapid development of society and of the ways information circulates, big data has come to have an important impact on every aspect of daily life, and even on national economic and social development. The era of big data has arrived. The fundamental purpose of collecting data is to extract useful knowledge from it and apply that knowledge to specific fields, which has led to the gradual formation of data mining as a discipline. However, researchers in computer science concentrate on improving algorithm efficiency, while researchers in other professional fields still use basic data-mining methods because the more complex ones are hard for them to understand. Computer scientists pay little attention to how data-mining results are applied in practice, and practitioners elsewhere cannot make use of them. As a result, the model structure of big data usually becomes flat and simple, and a gap appears between the results of data mining and their practical application. This problem can be addressed by data post-processing. This thesis summarizes the research results in the field of data post-processing and organizes them into a data post-processing framework with two aspects, structure post-processing and content post-processing. The framework is compared with traditional data processing methods through several data application examples. The research content of this thesis consists of the following three parts:

(1) Building on methods that transform the structure of data-mining results into a new structure or model for presentation, this thesis considers a theoretical method for building a multi-granularity data-mining pattern framework. The SPCA model is optimized, and a multi-granularity data-mining model is built on it. The thesis examines how to form a knowledge base that supports targeted poverty alleviation and solves the poverty problem step by step. The DSPP method is also used to make the mining results easier to understand. Finally, the results of the new model are compared with those of the original SPCA model.

(2) Building on methods that optimize the content of data-mining results to obtain more representative and concise results, this thesis considers a filter feature selection method, named consistency detection, that can be used in this kind of post-processing. The new method combines a consistency measurement and a dependency measurement. The consistency measurement is used to obtain the importance of features, and the dependency measurement is used to calculate the correlation coefficient. The smaller the variance within a data class, the higher the consistency between the data and the classification. By analyzing the ratio of intra-class variance to total variance, the consistency of the distribution across categories can be measured. Finally, the proposed consistency detection method is compared with the Pearson correlation coefficient method in terms of classification accuracy on the MNIST dataset under the SGD algorithm.
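
The variance-ratio idea described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not give the exact formula, so the score used here, 1 minus the ratio of within-class to total sum of squares (the classical correlation ratio), is an assumption, and the function names `consistency_scores` and `select_top_k` are hypothetical.

```python
from collections import defaultdict


def consistency_scores(rows, labels):
    """Score each feature by 1 - (within-class variance / total variance).

    ASSUMPTION: this scoring rule is an illustrative reading of the abstract,
    not the formula from the thesis itself.

    rows   : list of equal-length feature vectors (lists of floats)
    labels : list of class labels, one per row
    Returns one score per feature; a higher score means the feature's values
    are more tightly grouped within each class.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        overall_mean = sum(column) / len(column)
        total_ss = sum((x - overall_mean) ** 2 for x in column)

        # Group this feature's values by class label.
        by_class = defaultdict(list)
        for x, y in zip(column, labels):
            by_class[y].append(x)

        # Within-class sum of squares around each class mean.
        within_ss = 0.0
        for values in by_class.values():
            class_mean = sum(values) / len(values)
            within_ss += sum((x - class_mean) ** 2 for x in values)

        # A constant feature carries no information; score it as 0.
        scores.append(0.0 if total_ss == 0 else 1.0 - within_ss / total_ss)
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data: feature 0 separates the two classes, feature 1 does not.
rows = [[1.0, 5.0], [1.0, 1.0], [3.0, 5.0], [3.0, 1.0]]
labels = [0, 0, 1, 1]
scores = consistency_scores(rows, labels)
best = select_top_k(scores, 1)
```

Because both sums of squares are taken over the same samples, their ratio equals the ratio of the corresponding variances. As a filter method, the score is computed once per feature before any classifier is trained, so the selected features can then be passed to any learner, such as an SGD-based classifier.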

Keywords/Search Tags:

Big data, Data-mining, Data post-processing, Feature selection

Related items

1	Data Mining And Feature Selection Of High Dimensional Biomedical Data Based On TCGA And Pubmed Databases
2	Design And Realization For Online Diagnoses System Based On Medical Data Mining
3	Research On Feature Weighting And Feature Selection-based Data Mining Algorithms
4	Research On Stratified Feature Selection Algorithms For High Dimension Data
5	Study And Implementation On Feature Selection Algorithms In Large Data Sets
6	Application Of Data Warehouse And Data Mining Technology In The Heilongjiang Post Business Decision Analysis System
7	Research On Models And Algorithms For Feature Selection On Dynamic Incomplete Data
8	Application Of Data Mining Technology At Post CRM
9	Spectral feature selection for mining ultrahigh dimensional data
10	Data Mining In The Postal Network Transport Analysis System