Design And Implementation Of Plugin Technology-Based Data Mining Platform

Posted on: 2009-07-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y Song
Full Text: PDF
GTID: 2178360242981352
Subject: Computer software and theory
Abstract/Summary:
Highly developed database technology and the widespread use of database management systems have led to the accumulation of ever more data. Today, database systems handle data collection, querying, and statistics efficiently, but they cannot uncover the relationships and rules behind the data, nor predict future trends from the data that already exists. This failure to mine the hidden knowledge in data has produced the "data explosion but knowledge poverty" phenomenon.

A data mining system is a kind of intelligent decision support system. It integrates practical domain knowledge with historical data, applies management science, computer science, and related theories and methods, targets semi-structured and unstructured decision problems, and provides managers with knowledge and models. It is an intelligent human-computer interactive information system that helps managers make correct decisions. Data mining systems have passed through four states and three development periods. After about twenty years of development they have become increasingly mature and standardized, and they are widely used in many fields. They provide enterprises with a decision support platform and give data mining researchers a reliable tool, and this has played an irreplaceable role in advancing data mining technology.

Eclipse is an open-source project run by IBM and seven other companies. Its distinctive feature is its plug-in technology: the core is very small, and all other functions are plug-ins built on that core. Eclipse also opens its plug-in mechanism to developers and provides a plug-in development environment that makes Eclipse plug-ins easier to write. In effect, Eclipse is a collection of plug-ins.
Eclipse has grown beyond the concept of a development environment into a general platform that allows as much software as possible to be integrated as plug-ins, and it points toward the integrated desktop environment of the future. In the same way, we can write our own application system as Eclipse plug-ins.

The advantages of plug-in technology are evident. In this thesis we use it to design and build a data mining system. In the design, plug-in technology provides loose coupling between the algorithm logic modules and the main plug-in, and between the concrete algorithm modules and the logic modules. We divide the whole system into several layers and define the extension points of each layer, so that each plug-in can be developed easily against the relevant requirements and programmers can spend more of their time on algorithm development.

This thesis covers two topics: the design and implementation of a data mining platform using Eclipse plug-in technology, and the integration of a database, itself packaged as a plug-in, into that system.

The data mining platform is designed as a three-layer framework. The first layer is the main program plug-in, which implements the system GUI, controls the mining algorithm process, and manages the configuration files. The second layer is the algorithm logic plug-in, which extends the extension points of the main control program, completes the configuration of the algorithm logic parameters, and defines its own extension points for the third layer to extend. The third layer is the concrete algorithm plug-in, which is written by the programmer within the constraints of the plug-in framework and implements the concrete algorithm.
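
The three-layer arrangement can be sketched in plain Java as follows. This is a simplified illustration under stated assumptions, not the thesis's code and not the Eclipse extension-registry API: the names PluginLayersSketch, MainPlugin, AlgorithmLogicPlugin, ConcreteAlgorithm, and RowCounter are hypothetical, and in-memory maps stand in for extension points that a real Eclipse plug-in would declare in plugin.xml.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PluginLayersSketch {

    /** Layer 3 contract: a concrete algorithm written by an algorithm programmer. */
    interface ConcreteAlgorithm {
        String name();
        String run(List<String> data, Map<String, String> parameters);
    }

    /** Layer 2: holds algorithm parameters and offers its own extension point to layer 3. */
    static class AlgorithmLogicPlugin {
        private final String category;
        private final Map<String, String> parameters = new LinkedHashMap<>();
        private final Map<String, ConcreteAlgorithm> extensions = new LinkedHashMap<>();

        AlgorithmLogicPlugin(String category) { this.category = category; }

        String category() { return category; }

        void setParameter(String key, String value) { parameters.put(key, value); }

        /** Stand-in for a layer-3 plug-in extending this plug-in's extension point. */
        void contribute(ConcreteAlgorithm algorithm) { extensions.put(algorithm.name(), algorithm); }

        String run(String algorithmName, List<String> data) {
            ConcreteAlgorithm algorithm = extensions.get(algorithmName);
            if (algorithm == null) {
                throw new IllegalArgumentException("No algorithm plug-in named " + algorithmName);
            }
            return algorithm.run(data, Collections.unmodifiableMap(parameters));
        }
    }

    /** Layer 1: the main plug-in. It only knows about the layer-2 extension point. */
    static class MainPlugin {
        private final Map<String, AlgorithmLogicPlugin> logicPlugins = new LinkedHashMap<>();

        void contribute(AlgorithmLogicPlugin plugin) { logicPlugins.put(plugin.category(), plugin); }

        String execute(String category, String algorithmName, List<String> data) {
            AlgorithmLogicPlugin logic = logicPlugins.get(category);
            if (logic == null) {
                throw new IllegalArgumentException("No logic plug-in for " + category);
            }
            return category + "/" + algorithmName + ": " + logic.run(algorithmName, data);
        }
    }

    /** Toy layer-3 plug-in used only for this demo. */
    static class RowCounter implements ConcreteAlgorithm {
        public String name() { return "row-counter"; }

        public String run(List<String> data, Map<String, String> parameters) {
            return "rows=" + data.size() + ", minSupport=" + parameters.get("minSupport");
        }
    }

    public static String demo() {
        MainPlugin main = new MainPlugin();
        AlgorithmLogicPlugin association = new AlgorithmLogicPlugin("association");
        association.setParameter("minSupport", "2");
        association.contribute(new RowCounter());
        main.contribute(association);
        return main.execute("association", "row-counter", Arrays.asList("a", "b", "c"));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the layering is visible in RowCounter: it depends only on the ConcreteAlgorithm contract and never references MainPlugin, which is the kind of decoupling the design aims for.
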
With the plug-in structure, every plug-in is loosely coupled to the others, so algorithm programmers need not concern themselves with the system and can focus on the algorithms.

We use plug-in technology for the modular design of the system and a workflow approach to implement it. The workflow core of the system has five main functional parts: 1. module role set. 2. project management set. 3. workflow engine platform. 4. file system. 5. data display set.

The workflow engine is run by the central control program, which is also the main plug-in of the plug-in system. Users manage projects through the project management set of the central control program and drag role objects into the work area to position models and choose parameters. The workflow engine platform then takes this information and writes it to the configuration files.

In addition, the data mining system can access local data and create a mining model, which it then analyzes and applies to the data set. The model must be stored in a unified standard format so that it can work with different data mining engines. The system in this thesis therefore stores mining models in the PMML standard and leaves the deployment of results to the end client.

We divide the whole system into four modules by function.
1. System GUI Module. The system provides a user-friendly interface through which users can access and control the algorithm process and design it by simple dragging. The module also implements the algorithm parameter configuration interface and the visualization of algorithm results.
2. Storage Management Module. The system stores source data, temporary data, and result data.
3. Data Mining Algorithm Implementation Module.
The system implements several classic algorithms: Apriori, ID3, C4.5, and Naive Bayes.
4. System Configuration Function Module. System configuration relies mainly on configuration files that store information from each algorithm node.

Embedded databases are an important component of embedded systems, and they have become an essential and effective tool for developing and managing a growing range of personalized applications. They are widely used in consumer electronics, portable computing devices, enterprise real-time management applications, network storage and management, and all kinds of special-purpose devices.

The second topic of this thesis is embedding the Derby database, in plug-in form, into the data mining system. Using plug-in technology, the open-source database Derby is repackaged as a plug-in and plugged into the completed data mining system. In the design we build a plug-in group and reorganize the database interface module. The embedded database plug-in group consists of two plug-ins, derby.core and derby.ui. derby.core provides the basic functions of the Derby database, and derby.ui handles the interaction between the Derby database and the data mining system interface. During construction, Derby is first made into a plug-in containing derby.jar and the plugin.xml required by the Workbench. It is then placed inside another plug-in at the junction between the system interface and the database.

In conclusion, this thesis studies the design and implementation of a data mining system based on the Eclipse plug-in system and embeds the Derby database, in plug-in form, into that system. All modules in the system are loosely coupled, and, more importantly, algorithm programmers do not have to worry about integration with the system and can concentrate on algorithm design.
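
To make concrete what an algorithm programmer would write inside a third-layer plug-in, here is a minimal sketch of Apriori, one of the classic algorithms named above. It is an illustration only and not the thesis's implementation: the class AprioriSketch, its method names, and the sample transactions are all hypothetical, and support is counted by a plain scan of an in-memory list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class AprioriSketch {

    /** Returns every itemset contained in at least minSupport transactions, with its count. */
    public static Map<Set<String>, Integer> frequentItemsets(
            List<Set<String>> transactions, int minSupport) {
        Map<Set<String>, Integer> result = new LinkedHashMap<>();

        // Level-1 candidates: every distinct single item.
        Set<Set<String>> candidates = new LinkedHashSet<>();
        for (Set<String> transaction : transactions) {
            for (String item : transaction) {
                candidates.add(new TreeSet<>(Arrays.asList(item)));
            }
        }

        while (!candidates.isEmpty()) {
            // Count the support of each candidate and keep the frequent ones.
            Map<Set<String>, Integer> frequent = new LinkedHashMap<>();
            for (Set<String> candidate : candidates) {
                int count = 0;
                for (Set<String> transaction : transactions) {
                    if (transaction.containsAll(candidate)) {
                        count++;
                    }
                }
                if (count >= minSupport) {
                    frequent.put(candidate, count);
                }
            }
            result.putAll(frequent);

            // Join step: merge pairs of frequent k-itemsets into (k+1)-itemsets,
            // then prune any candidate that has an infrequent k-subset.
            List<Set<String>> level = new ArrayList<>(frequent.keySet());
            Set<Set<String>> next = new LinkedHashSet<>();
            for (int i = 0; i < level.size(); i++) {
                for (int j = i + 1; j < level.size(); j++) {
                    Set<String> union = new TreeSet<>(level.get(i));
                    union.addAll(level.get(j));
                    if (union.size() == level.get(i).size() + 1
                            && allSubsetsFrequent(union, frequent.keySet())) {
                        next.add(union);
                    }
                }
            }
            candidates = next;
        }
        return result;
    }

    /** Apriori property: every (k-1)-subset of a frequent k-itemset must be frequent. */
    private static boolean allSubsetsFrequent(Set<String> candidate, Set<Set<String>> frequent) {
        for (String item : candidate) {
            Set<String> subset = new TreeSet<>(candidate);
            subset.remove(item);
            if (!frequent.contains(subset)) {
                return false;
            }
        }
        return true;
    }

    public static String demo() {
        List<Set<String>> transactions = new ArrayList<>();
        transactions.add(new TreeSet<>(Arrays.asList("bread", "milk")));
        transactions.add(new TreeSet<>(Arrays.asList("bread", "butter")));
        transactions.add(new TreeSet<>(Arrays.asList("bread", "milk", "butter")));
        transactions.add(new TreeSet<>(Arrays.asList("milk")));
        return frequentItemsets(transactions, 2).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a minimum support of 2, the sample data yields three frequent single items and two frequent pairs. The pair of butter and milk occurs in only one transaction, so the three-item candidate is pruned without being counted.
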
Keywords/Search Tags:Technology-Based