Research And Implementation Of Visual Data Mining

Posted on: 2009-10-18  Degree: Master  Type: Thesis
Country: China  Candidate: H J Li  Full Text: PDF
GTID: 2178360242981599  Subject: Software engineering
Abstract/Summary:
Data mining is the process of extracting potentially valuable knowledge from large volumes of historical data. Visualization is the process of transforming data, information, and knowledge into visual form. Visual data mining has developed alongside data mining and information visualization technology. It depicts the structure of data and draws on human perception of patterns, exceptions, and trends, using visualization to enhance data mining, and it provides an interface between human and computer information processing. Effective use of visualization quickly and efficiently builds a channel between the data mining system and the user. This lets users apply their own knowledge to constrain and manage the mining process, improve its results, and take part in decision analysis. Visual data mining is highly valuable for visual data analysis and for exploring large databases, particularly when little is known about the data and the exploration goals are fuzzy.

This thesis first gives a clear definition of visual data mining and explains its significance. It then summarizes three ways in which visualization takes part in data mining: data visualization, process visualization, and result visualization. Finally, it introduces the current state of research, the development directions of visual data mining, and the content of the thesis.

The thesis centers on data visualization and result visualization techniques. Multidimensional data visualization is the key point of data visualization in data mining; it is both the most important and the most complicated part. The thesis presents four popular multidimensional data visualization techniques: geometrically transformed displays, iconic displays, stacked displays, and dense pixel displays. For each it introduces the basic principles, forms of representation, and applicable situations. It also introduces the kinds of data to be visualized and interaction and distortion techniques. It then briefly introduces the concepts, basic principles, and types of association rules, classification, and clustering, together with the result visualization techniques for each. These techniques are drawn from papers published in China and abroad and from successful commercial data mining systems such as MineSet; this part introduces their basic principles, forms of representation, and characteristics.

The thesis designs an intelligent data mining platform and introduces the functions and roles of the system framework, data file input, the management and control of data mining algorithms, and visualization. It also introduces the Eclipse platform and its plug-in technology. Data mining results and models are represented in the PMML standard data format. The platform's plug-ins are organized into three layers. The system framework is designed as a plug-in that runs on Eclipse and forms the first layer. Data file input, algorithms, and visualization are also designed as plug-ins; they run on the first-layer plug-in and form the second layer. Each concrete function is designed as a plug-in that runs on a second-layer plug-in and forms the third layer.

The thesis implements the visualization module of the intelligent data mining platform, covering both data visualization and result visualization; the module is general-purpose and extensible. From a program design point of view, a visualization program is divided into the data file, data input and transformation, drawing and changing of basic graphics, and visualization, and each part requires different techniques. The module uses methods of the java.awt.Graphics class to draw graphics, and Button and Panel events to change them.

For data visualization, the module implements line graphs, bar charts, scatter plots, parallel coordinates, star glyphs, the circle segments technique, and multiple line graphs. Line graphs are used to examine continuous one-dimensional data; bar charts are mostly used to examine statistics of discrete one-dimensional data; scatter plots show the distribution of continuous two-dimensional data and help find clusters, outliers, and trends; parallel coordinates and star glyphs are suited to small amounts of multidimensional data; circle segments and multiple line graphs can handle large amounts of multidimensional data. Some of these graphics can be changed through simple interactive operations to bring out particular characteristics.

For result visualization, a matrix graph expresses the one-to-one results of association rules, and an improved parallel coordinates display expresses the remaining association rule results. A combination of bar charts and a tree structure expresses the results of naive Bayes, a combination of stacked frames and a tree structure expresses the results of the decision tree, and a combination of pie charts and scatter plots expresses the results of clustering. Some of these graphics can be moved through simple interactive operations, and users can also set constraints to filter the data and transform the graphics.
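As an illustration of one technique listed in the abstract, the following is a minimal sketch of how a parallel coordinates display could be drawn with java.awt.Graphics. It is not the thesis's code: the class name ParallelCoordinatesSketch, the method names, and the choice of evenly spaced axes with min-max scaling are all assumptions made for this example. Each dimension gets one vertical axis, and each record becomes a polyline that crosses every axis at the height of its scaled value.

```java
import java.awt.Graphics;

// Hypothetical sketch; names and layout choices are assumptions, not the thesis's design.
final class ParallelCoordinatesSketch {

    /** x position of each vertical axis, spread evenly across the panel width. */
    static int[] axisX(int dims, int width) {
        int[] xs = new int[dims];
        for (int d = 0; d < dims; d++) {
            xs[d] = (dims == 1) ? width / 2 : d * (width - 1) / (dims - 1);
        }
        return xs;
    }

    /** y position of each value of one record: min maps to the bottom, max to the top. */
    static int[] recordY(double[] record, double[] min, double[] max, int height) {
        int[] ys = new int[record.length];
        for (int d = 0; d < record.length; d++) {
            double range = max[d] - min[d];
            // A constant dimension has no range; place it mid-axis to avoid dividing by zero.
            double t = (range == 0.0) ? 0.5 : (record[d] - min[d]) / range;
            ys[d] = (int) Math.round((1.0 - t) * (height - 1));
        }
        return ys;
    }

    /** Draws the axes, then one polyline per record. */
    static void drawRecords(Graphics g, double[][] data,
                            double[] min, double[] max, int width, int height) {
        int dims = min.length;
        int[] xs = axisX(dims, width);
        for (int x : xs) {
            g.drawLine(x, 0, x, height - 1);
        }
        for (double[] record : data) {
            g.drawPolyline(xs, recordY(record, min, max, height), dims);
        }
    }
}
```

In a Swing or AWT component, drawRecords would typically be called from the paint method with the component's Graphics object. Keeping the coordinate mapping separate from the drawing call means the mapping can be checked without a display.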
Keywords/Search Tags:Implementation