Design and Prototype Implementation of a Data Mining Experiment Platform, DMLab

Posted on: 2007-08-24
Degree: Master
Type: Thesis
Country: China
Candidate: M Chen
Full Text: PDF
GTID: 2178360182495756
Subject: Computer application technology
Abstract/Summary:
Since data mining technology emerged, many kinds of data mining tools have been produced. Some of them provide interfaces for algorithm experimentation and testing, but they focus mainly on mining tasks; algorithm development, debugging, and testing are not their main target. As a result, implementing and testing algorithms is still an inefficient job for researchers in the data mining field. An efficient integrated development and testing environment called Data Mining Laboratory (DMLab) was therefore designed and developed so that researchers can implement, debug, and test their own algorithms conveniently, be relieved of routine design and coding work, and devote themselves to research on the algorithms. DMLab is designed as a dedicated integrated development environment in which data mining researchers code and test algorithms; it unites data preparation with the implementation, debugging, and evaluation of new algorithms. The data server provided by DMLab supports accessing, analyzing, exploring, and preprocessing datasets through simple operations, and datasets can be reused and accessed across multiple networks, which improves the efficiency of preparing and using data. Compared with other data mining tools, DMLab provides a more efficient programming interface by virtue of the power of Python, and it is extensible and easy to use: users can design and test their own data mining algorithms in a short time.
DMLab integrates a visualization module and an intelligent evaluation module for test results, which make the evaluation procedure more objective and simpler. This thesis introduces DMLab from four aspects: the system structure, the component modules, the integration modes, and the functional features. It covers the layered design pattern, modularization, the design rules for components, and the flexibility and extensibility of the system. The design and prototype implementation of DMLab are then described in detail, and the key implementation techniques are introduced, for example the implementation of the basic data structures (including some important classes, global constants, and their internal relations), the communication modes and protocol for datasets, and the implementation of the graphical user interface. Finally, all functions were tested, which showed that the expected main functions have been implemented. Users can use DMLab to analyze, explore, and preprocess datasets, to edit and debug algorithm scripts, and to configure and run experiments in a child thread, among other tasks. Users can also add their own data loaders, preprocessing algorithms, data mining algorithms, and testing algorithms through the basic interface provided by DMLab. DMLab has good extensibility and adaptability, so it may be applied in many environments and fields, and it has good application prospects. On the other hand, it also has some drawbacks and needs to be improved in the future. DMLab is only an attempt at designing an efficient platform for testing data mining algorithms, but it can provide experience for developing similar tools; in short, it should be of further help to research on data mining algorithms.
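The abstract does not show DMLab's actual interface, so the following is only a minimal sketch of how a user-defined algorithm might be plugged into such a platform and run in a child thread. Every name here (`Algorithm`, `register`, `REGISTRY`, `MajorityClass`, `run_experiment`) is hypothetical and is not taken from DMLab.

```python
import threading
from abc import ABC, abstractmethod

# Hypothetical registry mapping algorithm names to classes (not DMLab's API).
REGISTRY = {}


class Algorithm(ABC):
    """Hypothetical base interface that user algorithms would implement."""

    name = "unnamed"

    @abstractmethod
    def run(self, dataset):
        """Run the algorithm on a dataset and return its result."""


def register(cls):
    """Class decorator that makes an algorithm available by name."""
    REGISTRY[cls.name] = cls
    return cls


@register
class MajorityClass(Algorithm):
    """Toy user algorithm: predict the most frequent label."""

    name = "majority"

    def run(self, dataset):
        # Assumes each row is a tuple whose last element is the label.
        labels = [row[-1] for row in dataset]
        return max(set(labels), key=labels.count)


def run_experiment(name, dataset):
    """Run a registered algorithm in a child thread and return its result."""
    result = {}

    def worker():
        result["output"] = REGISTRY[name]().run(dataset)

    thread = threading.Thread(target=worker)
    thread.start()
    # Joined immediately to keep the sketch simple; a GUI would instead
    # stay responsive and collect the result when the thread finishes.
    thread.join()
    return result["output"]
```

For example, `run_experiment("majority", [(1, "a"), (2, "b"), (3, "a")])` looks up the registered class by name and runs it on the small dataset. A registry of this kind is one common way to let users add loaders, preprocessing steps, or algorithms without modifying the platform itself.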
Keywords/Search Tags: Data Mining, Algorithm Experiment, DMLab, Python