Research and Implementation of Visualization Techniques in Data Mining

Posted on: 2008-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: W J Luo
GTID: 2208360212999775
Subject: Computer software and theory
Abstract/Summary:
In recent decades, with the rapid development of computer hardware and software, and especially the great advances in Internet technology, the volume of accumulated data has been growing very quickly. The total volume is now so large that it is hard to find the knowledge hidden in it. Many researchers are studying this problem, and data mining is one way to address it. Visualization plays an important role in data mining: it lets us combine the strengths of human vision and domain knowledge with those of data mining. This combination makes the data mining process intuitive and interactive, and thus yields more valuable and more understandable information. In this thesis, we focus on visualization techniques in data mining and implement them in a Web-based distributed data mining system called MinerOnWeb.

We introduce the design and implementation of MinerOnWeb, a system designed to provide data mining services online. It is built on the Struts framework in accordance with J2EE standards. It integrates a group of algorithms for classification, clustering, and association mining, and it can handle data in several formats. We mainly focus on two visualization methods and their implementation in MinerOnWeb:

1) 2-dimensional histogram: Unlike the traditional histogram, the x-axis of this histogram represents one dimension (attribute), while the y-axis represents the number of data records, which shows the distribution of that dimension. Color in the histogram denotes a second dimension (attribute), with different colors distinguishing different values of that dimension. We can therefore see the distribution of both dimensions and the relationship between them in a single view.

2) Scatter plots based on Star Coordinates: This method projects multi-dimensional data onto a 2-dimensional plot. Every dimension is mapped to an axis in the 2-dimensional plane, and all axes intersect at a single origin to form a star coordinate system. After max-min normalization, the data are projected onto the plane by a mapping called α-mapping. The plot can be displayed dynamically by changing the α values, which lets users observe the data from different directions. Because this method is well suited to visualizing clustered data, we study an interactive visual method for manual clustering and summarize two rules for finding clusters. We then study an interactive visual clustering method based on a clustering algorithm, which takes advantage of both visualization techniques and automated clustering algorithms.
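
The two visualization methods above can be illustrated with a minimal Python sketch. It is not the MinerOnWeb implementation, which is a Java/Struts system whose code is not shown here. The function names are hypothetical, the normalization range [0, 1] is an assumption, and the α-mapping formula is the form commonly used for star coordinates (axis i at angle 2πi/k, scaled by an adjustable weight α_i); the abstract does not state the exact definition used in the thesis.

```python
import math
from collections import Counter


def histogram_counts(records, x_attr, color_attr):
    """Count records for each (x value, color value) pair.

    The bar height for an x value is the sum over its color values;
    each colored segment of that bar is one entry of the Counter.
    """
    return Counter((rec[x_attr], rec[color_attr]) for rec in records)


def min_max_normalize(rows):
    """Scale each column to [0, 1] (assumed range); constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def alpha_mapping(rows, alphas):
    """Project k-dimensional rows to 2-D star-coordinate points.

    Assumed form: axis i points at angle 2*pi*i/k, and a point is the
    average of its normalized values along those axes, each scaled by
    the adjustable weight alphas[i].
    """
    k = len(alphas)
    points = []
    for row in min_max_normalize(rows):
        x = sum(a * v * math.cos(2 * math.pi * i / k)
                for i, (a, v) in enumerate(zip(alphas, row))) / k
        y = sum(a * v * math.sin(2 * math.pi * i / k)
                for i, (a, v) in enumerate(zip(alphas, row))) / k
        points.append((x, y))
    return points
```

Calling `alpha_mapping(rows, alphas)` repeatedly with different `alphas` and redrawing the returned points corresponds to the dynamic display described above: changing one α value stretches or shrinks the contribution of that axis, which moves the points and can pull apart clusters that overlap under other settings.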
Keywords/Search Tags: Data Mining, Visualization, Star Coordinates, α-mapping, Histogram, MinerOnWeb