Research And Implementation Of Algorithms Recommendation Method Based On Parallel Data Mining Platform

Posted on: 2014-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: L G Zhao
Full Text: PDF
GTID: 2248330398972426
Subject: Computer Science and Technology
Abstract/Summary:
As data mining technology has matured, the number of data mining algorithms has grown steadily. Different algorithms suit different task scenarios and user data, and the choice of algorithm directly affects the efficiency and quality of data mining work. Because data mining is complex and specialized, selecting and using algorithms properly requires comprehensive expertise, which is very difficult for ordinary users. Automating algorithm selection has therefore become a critical open problem in data mining research.

This thesis proposes an intelligent recommendation model for data mining algorithms, driven by the user's mining task and data characteristics. The model is built on a specific parallel data mining platform. Starting from the characteristics of the user data, it establishes a self-adaptive algorithm performance knowledge library by combining existing knowledge of algorithm performance with a large number of experiments. Because an algorithm tends to perform similarly on similar data sets, the model recommends appropriate algorithms automatically by comparing the user's data set with the training data sets in the knowledge library, which makes algorithm selection much more convenient.

The Big Cloud-Parallel Data Mining platform supplies the user interface, data mining algorithms, and other components as resources. The model extracts the characteristics of the user data by calculating characteristic parameters. It compares data sets by calculating the distance between their feature vectors. It constructs and maintains the algorithm performance knowledge library by defining the library's composition, structure, and operations. It evaluates and compares the performance of algorithms by calculating performance indices such as accuracy and cohesion degree. The recommendation model was tested on this platform in a large number of experiments and provided the algorithm selection service well.
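
The recommendation step described in the abstract (characterize the user data, find the most similar training data sets, and reuse their recorded algorithm performance) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the four characteristic parameters, the use of Euclidean distance, the library layout, the algorithm names, and the placeholder scores.

```python
import math


def extract_features(rows):
    """Summarize a numeric data set as a feature vector.

    The characteristic parameters chosen here (log of row count, column
    count, mean of column means, mean of column variances) are hypothetical.
    """
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    variances = [sum((r[j] - means[j]) ** 2 for r in rows) / n for j in range(d)]
    return [math.log10(n), float(d), sum(means) / d, sum(variances) / d]


def distance(u, v):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def recommend(user_features, library, k=1):
    """Rank algorithms by their mean score on the k most similar data sets."""
    nearest = sorted(
        library, key=lambda entry: distance(user_features, entry["features"])
    )[:k]
    totals, counts = {}, {}
    for entry in nearest:
        for algo, score in entry["scores"].items():
            totals[algo] = totals.get(algo, 0.0) + score
            counts[algo] = counts.get(algo, 0) + 1
    means = {algo: totals[algo] / counts[algo] for algo in totals}
    return sorted(means, key=lambda algo: means[algo], reverse=True)


# Hypothetical knowledge library: placeholder entries, not measured results.
library = [
    {"features": [2.0, 4.0, 0.5, 1.0],
     "scores": {"kmeans": 0.9, "naive_bayes": 0.7}},
    {"features": [6.0, 100.0, 0.1, 0.2],
     "scores": {"kmeans": 0.6, "naive_bayes": 0.85}},
]

print(recommend([2.1, 5.0, 0.4, 0.9], library))
```

In this sketch, keeping the library "self-adaptive" would amount to appending a new entry each time an algorithm is actually run and scored on a new data set. In practice the feature dimensions would also need to be normalized so that no single parameter dominates the distance.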
Keywords/Search Tags: data mining, algorithm recommendation, algorithm performance knowledge, performance evaluation