
The Research Of Clustering Analysis In DataMining

Posted on: 2004-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: J H Guo
Full Text: PDF
GTID: 2168360122460090
Subject: Systems Engineering
Abstract/Summary:
Data mining is a relatively young research and application area built on database techniques. It synthesizes results from multiple disciplines, such as logic, statistics, machine learning, fuzzy theory, and visual computing, in order to extract usable information from databases. It has attracted increasing attention and broad interest in recent years. Clustering analysis is an important part of the overall data mining system. Clustering is the process of grouping data into classes or clusters so that objects within the same cluster are highly similar to one another but very dissimilar to objects in other clusters. Dissimilarities are assessed based on the attribute values describing the objects. Clustering analysis thus partitions the objects into classes according to their characteristics. Clustering is always carried out without prior knowledge, so the main research task is how to obtain a clustering result under this premise. As data mining has developed, a number of clustering algorithms have been proposed. In general, the major clustering methods can be classified into the following categories: partitioning methods, hierarchical methods, density-based methods, grid-based methods, and model-based methods; in addition, some clustering algorithms integrate the ideas of several of these methods. Although all of these methods have achieved great success in different fields, they all meet difficulties when processing very large databases. Generally speaking, the data in a database cannot all conform to the model obtained by classification or clustering analysis. Data that do not conform to the pattern formed by most of the data are called outliers. Many data mining methods exclude outliers before mining proper begins. On some occasions, however, data arising from low-probability events have more mining value than data from common events.
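The notion of an outlier described above can be illustrated with a small sketch. The abstract does not say which detection method the thesis uses, so the following assumes one common distance-based definition: a point is flagged as an outlier when fewer than `min_neighbors` other points lie within `radius` of it. The function name, the parameters, and the sample data are all hypothetical.

```python
import math


def distance_based_outliers(points, radius, min_neighbors):
    """Return indices of points that have fewer than `min_neighbors`
    other points within Euclidean distance `radius`.

    This is an assumed, illustrative definition of an outlier,
    not the method of the thesis.
    """
    outliers = []
    for i, p in enumerate(points):
        # Count the other points that lie close to p.
        neighbors = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        if neighbors < min_neighbors:
            outliers.append(i)
    return outliers


# Hypothetical data: two tight groups and one isolated point (index 6).
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
    (9.0, 0.5),
]
print(distance_based_outliers(points, radius=1.0, min_neighbors=1))  # → [6]
```

The pairwise loop costs time quadratic in the number of points, which is one concrete reason such analysis becomes difficult on very large databases.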
The analysis of outliers is normally called outlier data mining. Traditional clustering analysis is a kind of rigid (hard) classification, which strictly assigns every object to be identified to exactly one class. The classes are mutually exclusive, so the boundaries between them are sharp. In practice, however, most objects do not have a strict class membership; their membership is uncertain, and they are better suited to soft classification. Fuzzy clustering provides an uncertain description of the classification. It reflects reality more objectively and has thus become the mainstream of clustering analysis. However, common fuzzy clustering is not suitable for large databases and cannot satisfy applications with strict real-time requirements. In practice, methods based on an objective function are popular. Such a method has a simple design and solves a broad range of problems. It can be transformed into an optimization problem, solved with nonlinear programming theory, and easily implemented on a computer. Therefore, with the development of computers, fuzzy clustering has become a hot spot of clustering analysis.
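As a rough illustration of objective-function-based fuzzy clustering, the sketch below uses fuzzy c-means, a widely known algorithm of this kind. The abstract does not name the algorithm studied in the thesis, so this choice is an assumption, as are the function names, the fuzzifier `m = 2`, the fixed iteration count, and the sample data. The objective being minimized is the sum over points i and clusters j of u_ij^m * ||x_i - c_j||^2, with each point's memberships summing to 1.

```python
import math


def update_memberships(points, centers, m, eps=1e-9):
    """Compute u[i][j], the degree to which point i belongs to cluster j."""
    memberships = []
    for p in points:
        # Clamp distances so a point sitting on a center does not divide by zero.
        dists = [max(math.dist(p, c), eps) for c in centers]
        row = [
            1.0 / sum((d_j / d_k) ** (2.0 / (m - 1.0)) for d_k in dists)
            for d_j in dists
        ]
        memberships.append(row)
    return memberships


def update_centers(points, memberships, m):
    """Recompute each center as the mean of all points weighted by u^m."""
    dim = len(points[0])
    n_clusters = len(memberships[0])
    centers = []
    for j in range(n_clusters):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total
             for d in range(dim)]
        )
    return centers


def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    centers = [list(c) for c in initial_centers]
    for _ in range(iterations):
        memberships = update_memberships(points, centers, m)
        centers = update_centers(points, memberships, m)
    # Final memberships are taken with respect to the final centers.
    return centers, update_memberships(points, centers, m)


# Hypothetical data: two groups plus one point lying between them.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
    (3.0, 3.0),
]
centers, memberships = fuzzy_c_means(points, [(0.0, 0.0), (6.0, 6.0)])
for p, row in zip(points, memberships):
    print(p, [round(u, 3) for u in row])
```

Unlike a hard partition, the point between the two groups receives a substantial membership in both clusters instead of being forced into one. Every iteration also visits every point for every cluster, which hints at why such methods strain on large databases and under real-time constraints.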
Keywords/Search Tags: Data mining, Clustering analysis, Outlier, Fuzzy clustering