Research On Merton Effect-based Clustering Analysis Method

Posted on: 2008-09-10
Degree: Master
Type: Thesis
Country: China
Candidate: W Xiao
Full Text: PDF
GTID: 2178360242988976
Subject: Computer application technology
Abstract/Summary:
With the development and application of computer hardware and software, human society's ability to produce and collect data is growing rapidly. As a result, we are drowning in data yet starving for knowledge. People urgently need tools that convert data into knowledge automatically and quickly, and data mining emerged to meet this need.

Clustering analysis is an active research topic in data mining. It is a form of unsupervised learning and has been widely used. Many clustering methods have been proposed, such as partitioning, hierarchical, density-based, grid-based, and model-based methods. Among them, the most basic and important is the agglomerative hierarchical clustering method, which many studies have shown can generate high-quality clusters.

This thesis studies the agglomerative hierarchical clustering method and accomplishes the following work.

First, the main existing clustering algorithms and their weaknesses are introduced. After studying proximity measures for different types of data, a proximity measure for mixed-type data is proposed. It takes into account data standardization, variable weights, asymmetric attributes, missing attribute values, and so on.

To address the drawbacks of existing inter-cluster proximity measures, MEICD (Merton Effect-based Inter-cluster Distance) is proposed. Experiments show that an inter-cluster proximity measure that considers both inter-cluster distance and cluster size can improve clustering quality.

Because the number of clusters is difficult to set for a given data set, and particularly so for high-dimensional data, MHCA (MEICD-based Hierarchical Clustering Algorithm) is designed. It identifies the natural clusters visually and globally by using the association vector and descriptive function obtained during the clustering process, without resorting to external parameters. The algorithm can handle mixed-attribute data and can recognize clusters of arbitrary shape and size even in the presence of outliers.

Finally, MHCA is applied to the e-business intelligent decision support system of Changjiang electronic Group Corp., into which a clustering module is inserted. Based on the herd psychology of customer purchasing, click-stream data and internal corporate data are used to cluster the customers. The clustering results can be used to guide both decision-makers and customers.
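As an illustration of what a proximity measure for mixed-type data can look like, the following is a minimal sketch in the style of the well-known Gower coefficient. It is not the measure proposed in the thesis, whose formula is not given in this abstract. The function name `mixed_dissimilarity`, the attribute kinds, and the use of `None` for a missing value are all assumptions made for the example. The sketch shows the four concerns the abstract lists: range standardization of numeric attributes, per-variable weights, asymmetric binary attributes, and missing values.

```python
def mixed_dissimilarity(x, y, kinds, ranges, weights=None):
    """Gower-style weighted dissimilarity between two mixed-type records.

    Illustrative only; names and conventions are assumptions.
    kinds[k]  : "numeric", "nominal", or "asymmetric" (asymmetric binary)
    ranges[k] : max - min of attribute k (used for numeric attributes only)
    A missing value is represented by None and the attribute is skipped.
    """
    if weights is None:
        weights = [1.0] * len(x)
    num = 0.0  # weighted sum of per-attribute dissimilarities
    den = 0.0  # sum of weights of the attributes actually compared
    for xk, yk, kind, rng, w in zip(x, y, kinds, ranges, weights):
        if xk is None or yk is None:
            continue  # missing value: attribute contributes nothing
        if kind == "numeric":
            # standardize by the attribute's range so d lies in [0, 1]
            d = abs(xk - yk) / rng if rng else 0.0
        elif kind == "nominal":
            d = 0.0 if xk == yk else 1.0
        elif kind == "asymmetric":
            if not xk and not yk:
                continue  # joint absence is treated as uninformative
            d = 0.0 if xk == yk else 1.0
        else:
            raise ValueError(f"unknown attribute kind: {kind}")
        num += w * d
        den += w
    return num / den if den else 0.0


# Hypothetical records: age, colour, bought-item flag, rating (missing in a)
a = (30, "red", 1, None)
b = (40, "blue", 0, 5.0)
kinds = ["numeric", "nominal", "asymmetric", "numeric"]
ranges = [50, None, None, 10]
result = mixed_dissimilarity(a, b, kinds, ranges)
```

Here the rating attribute is skipped because it is missing in one record, so the result averages the three remaining per-attribute dissimilarities. Dividing by the sum of the weights actually used keeps the value in [0, 1] regardless of how many attributes are missing.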
Keywords/Search Tags: unsupervised learning, hierarchical clustering, proximity measure, Merton Effect, association vector, descriptive function