
Research On The Improvement Of Data Mining Algorithm

Posted on: 2020-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Hu
Full Text: PDF
GTID: 2428330602957370
Subject: Computer application technology
Abstract/Summary:
Large volumes of data contain a wealth of knowledge and high value. However the connotation and extension of big data are described, for example by huge volume, variety of types, rapid flow, and low value density, its essential characteristics are that the data are massive, high-dimensional, heterogeneous, dynamic, spatio-temporal, diverse, multi-source, multi-scale, and fuzzy. Data mining is an important technical means of transforming data into knowledge and value. Traditional data mining techniques, however, face many challenges in mining the rich hidden knowledge and value in big data. An important way to address the problem of big data mining is to research and design more efficient algorithms that fit these essential characteristics. At present, clustering algorithms and association rule algorithms are important research topics in big data mining. A clustering algorithm groups similar data objects within a large body of data so that similar data are gathered into clusters, which facilitates subsequent mining and computation. Clustering algorithms have been widely used to discover the global distribution patterns of data objects, in areas such as data analysis, market research, and model evaluation. Association rule algorithms mainly describe the internal correlations within large amounts of data. They have been widely used for big data mining and analysis in earth science, meteorology, medicine, and economics, making the data analysis in these fields more valuable and significant. To further improve the efficiency of existing data mining algorithms and the quality of their results, this thesis designs two data mining algorithms for big data, starting from the defects and problems of existing clustering and association rule algorithms. First, to address the shortcomings of the current CABWAD clustering algorithm, namely poor clustering quality, difficult data processing, and an unreasonable algorithm structure, an improved CADD algorithm is proposed, and its validity and clustering efficiency are verified through simulation experiments and comparison tests. Second, to address some of the problems of the current Apriori algorithm, including the problem of parameter setting, an improved Apriori algorithm design is proposed. Distance-based association rule experiments are carried out on two data sets, geochemical data and clinical data, and comparison tests between the original and the improved algorithm are used to evaluate the efficiency gain of the improved algorithm.
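For readers unfamiliar with the baseline that the second contribution starts from, the following is a minimal sketch of the classic textbook Apriori procedure for finding frequent itemsets. It is not the improved algorithm proposed in the thesis, whose details are not given in this abstract. The function names `apriori` and `confidence`, the use of an absolute count for `min_support`, and the toy transactions are all assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    `min_support` is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.

    Both the antecedent and the combined itemset must be frequent.
    """
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    return frequent[both] / frequent[a]


# Hypothetical toy data, not taken from the thesis.
toy_transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
toy_frequent = apriori(toy_transactions, min_support=3)
```

On this toy data with a minimum support of 3, the three single items and the three pairs are frequent, while the triple {a, b, c} appears in only two transactions and is discarded. The rule a -> b then has confidence 3/4. The repeated scan of all transactions at every level is the cost that efficiency-oriented variants of Apriori typically try to reduce.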
Keywords/Search Tags: Big data, Data mining, CAD algorithm, Apriori algorithm, Algorithm improvement