
Research And Implementation Of Visual Data Mining

Posted on: 2009-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y Yu
Full Text: PDF
GTID: 2178360272476645
Subject: Computer software engineering
Abstract/Summary:
Data mining is the process of extracting potentially valuable knowledge from large volumes of historical data. Visualization is the process of transforming data, information, and knowledge into visual form. Visual data mining has developed alongside data mining and information visualization. It depicts the structure of data and draws on the human ability to perceive patterns, exceptions, and trends, using visualization to enhance data mining, and it provides an interface between the human and the computer information-processing systems.

With effective use of visualization, large amounts of data can be processed quickly and efficiently to find hidden features, patterns, and trends, which can guide new and more effective decision-making. Visualization can make data mining techniques and algorithms easier for decision-makers to understand and use, make mining results easier to interpret, and allow those results to be checked further. It can also be used to guide data mining algorithms, so that users take part in the decision analysis process. Visual data mining is highly valuable for exploring large databases, particularly when little is known about the data and the goals of the exploration are vague.

This paper introduces the current state and development of visual data mining and explains its content and significance. It then introduces methods for data visualization and result visualization, together with the basic theory and basic interactive operations of star coordinates. It also briefly summarizes existing clustering algorithms and analyzes K-means, DBSCAN, BIRCH, and CURE in more detail.

The work focuses mainly on the following aspects:
1. Design and implementation of the visualization module of a prototype data mining system, including basic drawing, graphic transformation, data acquisition, and data file components.
2. Design and implementation of a star coordinates visualization tool, whose functions include axis scaling, axis rotation, graph scaling, coloring of selected points, visualization of classified and unclassified data, and inspection of data values and results. Through interactive operations, the tool helps users choose an appropriate clustering algorithm and supports interactive visualization with DBSCAN and other data mining algorithms. The system also helps users understand data mining results.

The visualization part uses Java, Java 2D and 3D rendering technology, and SWT to draw windows, graphics, and images. We adopt internationally recognized data mining standards and process models, including CRISP-DM and PMML, to ensure compatibility with other mining tools and other vendors' products. The visual data mining module and its models use the PMML standard data format. The data file module handles file storage for the system as a whole and keeps data in the format the system processes. The data acquisition module obtains the data to be displayed from local sources, including text files, databases, and XML documents. The basic drawing module draws axes, strip (bar) charts, curves, and class-colored plots. The graphic transformation module provides rotation, zooming, partial zooming, coloring, and saving of graphics. The data standardization and conversion module converts data in various forms into the form the visualization graphics require.
This module provides the interfaces and operations shared by the visualization components, which simplifies the implementation of visualization algorithms, reduces redundancy, and largely improves the scalability and maintainability of the system.

This paper describes the design and implementation of a star coordinates tool whose functions include axis scaling, axis rotation, graph scaling, coloring of selected points, visualization of classified or unclassified data, and inspection of data values and results. Star coordinates can guide users in visually choosing a suitable clustering algorithm. Using UCI data sets, we analyze K-means, BIRCH, CURE, and DBSCAN in detail, along with the data sets each algorithm is suited to. This paper proposes using visualization and interactive operations to address two problems with DBSCAN: its sensitivity to the global parameter Eps, and its difficulty in finding clusters whose densities differ greatly. Finally, we show that star coordinates can effectively help users understand data mining results and discover relationships among the dimensions of multidimensional data.

The visual data mining module is a key component of a data mining system. It is the interface between the user and the data mining system and determines the system's appearance, operability, interactivity, and user-friendliness, so it is key to the success of a data mining system. Visualizing multidimensional data is the central problem of visualization in data mining. This paper presents star coordinates as an effective tool for data visualization. It also proposes the use of three-dimensional star coordinates in visual data mining, which expands the application areas of star coordinates...
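
The abstract names axis scaling and axis rotation as the basic interactive operations of star coordinates but does not give the mapping itself. As a rough illustration only, the sketch below follows the commonly described star coordinates construction: each dimension gets an axis vector from a shared origin, each attribute is min-max normalized, and a record is drawn at the sum of its normalized values times the axis vectors. It is written in Python for brevity rather than in the Java/SWT stack the thesis reports using, and every function name in it (`make_axes`, `scale_axis`, `rotate_axis`, `project`) is hypothetical, not taken from the thesis.

```python
import math


def make_axes(n_dims):
    """Return n_dims unit axis vectors spaced evenly around the origin."""
    step = 2 * math.pi / n_dims
    return [(math.cos(j * step), math.sin(j * step)) for j in range(n_dims)]


def scale_axis(axes, j, factor):
    """Return a copy of axes with axis j lengthened or shortened by factor."""
    x, y = axes[j]
    out = list(axes)
    out[j] = (x * factor, y * factor)
    return out


def rotate_axis(axes, j, angle):
    """Return a copy of axes with axis j rotated counterclockwise by angle (radians)."""
    x, y = axes[j]
    c, s = math.cos(angle), math.sin(angle)
    out = list(axes)
    out[j] = (c * x - s * y, s * x + c * y)
    return out


def project(rows, axes):
    """Map each n-dimensional row to a 2D point.

    Each attribute is min-max normalized over the data set, then the point is
    the sum of (normalized value * axis vector) over all dimensions.
    """
    n = len(axes)
    lows = [min(r[j] for r in rows) for j in range(n)]
    highs = [max(r[j] for r in rows) for j in range(n)]
    points = []
    for r in rows:
        px = py = 0.0
        for j in range(n):
            span = highs[j] - lows[j]
            # A constant attribute carries no information; place it at 0.
            t = (r[j] - lows[j]) / span if span else 0.0
            px += axes[j][0] * t
            py += axes[j][1] * t
        points.append((px, py))
    return points


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    data = [[5.1, 3.5, 1.4, 0.2], [7.0, 3.2, 4.7, 1.4], [6.3, 3.3, 6.0, 2.5]]
    axes = make_axes(4)
    axes = scale_axis(axes, 2, 1.5)           # emphasize the third attribute
    axes = rotate_axis(axes, 3, math.pi / 6)  # move the fourth axis aside
    for point in project(data, axes):
        print(point)
```

In this formulation, scaling an axis changes how much its attribute contributes to a point's position, and rotating an axis changes the direction of that contribution, which is why the two operations can pull overlapping groups of points apart on screen.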
Keywords/Search Tags: Data Mining, Information Visualization, Star Coordinates, Clustering